Interspeech 2011 special session: Speech and audio processing for human-robot interaction

Call for Participation


Laurence Devillers, LIMSI-CNRS
Agnès Delaborde, LIMSI-CNRS
Alexander Rudnicky, Carnegie Mellon University

The field of human-robot interaction is attracting increasing interest from researchers, making this an ideal time to highlight the work being done in human-robot spoken language interaction. Social interaction is characterized by a continuous and dynamic exchange of information-carrying signals. Producing and understanding these signals allows humans to communicate simultaneously on multiple levels. Such signals include speech and non-speech sounds, gesture, facial expression and pose. Among these channels, vocal expression is best suited for communicating a rich variety of information; it is also the most natural modality for communicating meaning, emotion and personality. Vocal expression is characterized by a verbal component (language) and by a non-verbal component (prosody, intonation, hesitation).

Our current ability to model vocal communication is quite limited. Spoken language systems, robots in our case, are able to communicate concrete meaning through language, but their ability to detect (or, for that matter, generate) non-linguistic information streams is quite primitive. The ability to understand this information, and to adapt generation to the goal of the communication and the characteristics of particular interlocutors, constitutes a significant aspect of natural interaction.

The purpose of this special session is to bring together researchers who are exploring vocal expression from different perspectives, including detection, modeling and generation. The focus of the session is on the verbal and non-verbal audio cues required for the design of natural interaction between a human and a robot.

Special session topics may include, but are not limited to:
  • Speech recognition systems for HRI
  • Dialog systems for HRI
  • Automatic emotion detection from verbal and non-verbal cues
  • Automatic recognition of user personality in dialog
  • Multimodal speech/audio expression generation in robots
  • Perception-action loops in robots
  • Back-channel generation and understanding
  • Interpretation of prosodic information
  • Timing in discourse
  • Integrated models of vocal communication
  • Natural Human-Robot Interaction (HRI)

The submission procedure is the same as for regular Interspeech papers: the same deadline at the end of March, and the same format and review procedure.