Homepage of the INTERSPEECH 2010 Paralinguistic Challenge - Age, Gender and Affect


October 15th, 2010



Bjoern Schuller (CNRS-LIMSI, France)
Stefan Steidl (FAU Erlangen-Nuremberg, Germany)
Anton Batliner (FAU Erlangen-Nuremberg, Germany)
Felix Burkhardt (Deutsche Telekom, Germany)
Laurence Devillers (CNRS-LIMSI, France)
Christian Mueller (DFKI, Germany)
Shrikanth Narayanan (University of Southern California, USA)

Sponsored by:

HUMAINE Association
Deutsche Telekom Laboratories

Officially started:

March 19th, 2010

*FINISHED* Winners of the INTERSPEECH 2010 Paralinguistic Challenge:
  • The Age Sub-Challenge Prize is awarded to:
    Brno University of Technology System for Interspeech 2010 Paralinguistic Challenge
  • The Gender Sub-Challenge Prize is awarded to:
    Age and Gender Classification using Fusion of Acoustic and Prosodic Features
  • The Affect Sub-Challenge Prize is awarded to:
    Level of Interest Sensing in Spoken Dialog Using Multi-level Fusion of Acoustic and Lexical Evidence

    The organisers congratulate the winners and thank all participants for their outstanding contributions. Overall, the best result was obtained by fusing these contributions.

    If you are interested in using the corpora of the Challenge outside this event, please contact Felix Burkhardt for the aGender corpus and Björn Schuller for the TUM AVIC corpus licenses and download.

*IMPORTANT* INTERSPEECH Presentation Guidelines and structure of the Special Session:
  • Poster size differs from general INTERSPEECH format:
    Ten poster boards will be set up in room 301, where the session will be held. Each board will be 180 cm wide and 210 cm high, which differs slightly from the boards of the regular poster sessions (240 cm wide x 180 cm high) because a different supplier is used for them. The poster boards will be placed at the sides and/or back of the room so that they do not interfere with the oral part of the session.
  • A short presentation AND a poster following INTERSPEECH poster guidelines have to be prepared by every accepted participant.
    After the organisers' introduction of the Challenge, the 10 accepted participants' papers shall be presented in 3 minute short presentations (30 min total) following a rough structure (~3 slides: Features, Classification, Results including the competition results). Slides must be uploaded prior to the session. These short presentations will be followed by 70 min of 10 parallel poster presentations for the same papers in the same room. The session will be concluded by 10 min summarisation of the Challenge by the organisers.
  • Accepted papers (in provisional order of presentation):
    The INTERSPEECH 2010 Paralinguistic Challenge
    Age and Gender Classification from Speech using Decision Level Fusion and Ensemble Based Techniques
    Level of Interest Sensing in Spoken Dialog Using Multi-level Fusion of Acoustic and Lexical Evidence
    Fuzzy Support Vector Machines for Age and Gender Classification
    Gender and Affect Recognition Based on GMM and GMM-UBM modeling with relevance MAP estimation
    Age Recognition Based on Speech Signals Using Weights Supervector
    Age and Gender Classification using Fusion of Acoustic and Prosodic Features
    Brno University of Technology System for Interspeech 2010 Paralinguistic Challenge
    Combining Five Acoustic Level Modeling Methods for Automatic Speaker Age and Gender Recognition
    Age and Gender Recognition Based on Multiple Systems - Early vs. Late Fusion
    Automatic Speaker Age and Gender Recognition in the Car for Tailoring Dialog and Mobile Services

*IMPORTANT* Challenge deadline extension: results on the Test partition may be uploaded a second time:
  • A second result upload
    can take place any time between the INTERSPEECH 2010 paper submission deadline and the deadline for the camera-ready paper, once your original submission for the Challenge has been accepted.
  • Please keep in mind
    that a paper contribution is mandatory for participation in the Challenge. Please also make sure that you submit your paper to the Challenge in time. To do so, select the review topics as "Special Session only" and the Special Session as "INTERSPEECH 2010 Paralinguistic Challenge". Your first result submission must be made before the Paper Submission deadline.

*IMPORTANT* Errata - Bug fixes and Patches for download:
  • Sample rate aGender (April 1st, 2010)
    Contrary to the readme in the download, aGender's sample rate is 8 kHz. A corrected readme is available here.

Baseline results are available:

Download the Paper on the Challenge.

Call for Papers as PDF.

*Get started:* License agreement aGender corpus for the dataset download (Age and Gender Sub-Challenges).
*Get started:* License agreement TUM AVIC corpus for the dataset download (Affect Sub-Challenge).
*Read about it:* Paper on the Challenge to be cited.
*Participate:* Result submission is now open.
*FAQ:* Frequently asked questions.

The Challenge

Most paralinguistic analysis tasks resemble each other not only in their processing and ever-present data sparseness, but also in lacking agreed-upon evaluation procedures and comparability, in contrast to more “traditional” disciplines in speech analysis. At the same time, this is a rapidly emerging field of research, owing to the constantly growing interest in applications to human behaviour analysis and in technologies for human-machine communication and multimedia retrieval.

In these respects, the INTERSPEECH 2010 Paralinguistic Challenge shall help bridge the gap between excellent research on paralinguistic information in spoken language and low compatibility of results, by addressing three selected tasks. The aGender and the TUM AVIC corpora will be provided by the organisers. The first consists of 46 hours of telephone speech from 954 speakers and will serve to evaluate features and algorithms for the detection of speaker age and gender. The second features 2 hours of human conversational speech recordings (21 subjects), annotated with different levels of interest. The corpus further features a transcription of spoken content with word boundaries by forced alignment, non-linguistic vocalisations, single annotator tracks, different well-defined chunkings, and the sequence of these. Both corpora come with a distinct definition of test, development, and training partitions, incorporating speaker independence and different miking, as needed in most real-life settings. Benchmark results of the most popular approaches will be provided.

Three Sub-Challenges are addressed:

  • In the Age Sub-Challenge, the age of speakers has to be determined in four different age groups as classification task. The Challenge measure will be the unweighted average recall per class to better compensate for imbalance between these four age groups. However, in the training and development sets also the actual speaker age in years is provided. This may be used as additional information for model construction or reporting of more precise results in submitted papers.
  • In the Gender Sub-Challenge, a three-class classification task has to be solved: apart from female/male, a third "gender" is added: children. While this is, strictly speaking, a combined age/gender task, the reason is obvious: determining children's gender by speech is beyond this Challenge. As for the Age Sub-Challenge, the competition measure is unweighted average recall of these three classes.
  • Finally, the Affect Sub-Challenge asks for determination of speakers’ interest in ordinal representation in this year’s Challenge as opposed to last INTERSPEECH’s Emotion Challenge, which dealt with emotion in a broader sense. The competition measure will thus be the cross-correlation between the annotators' mean "Level of Interest" annotation ground truth and the regression prediction. However, as a secondary measure the mean linear error will be calculated. Here, participants will also be given the individual labelers' tracks for model construction.
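
The competition measures named in the list above can be sketched in a few lines of Python. This is only an illustration, not the organisers' scoring code: it assumes that "cross-correlation" refers to the Pearson correlation coefficient and that "mean linear error" refers to the mean absolute difference, and all function names are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def unweighted_average_recall(y_true, y_pred):
    """Mean of the per-class recalls; each class counts equally, whatever its size."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == pred:
            hits[truth] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def cross_correlation(y_true, y_pred):
    """Assumption: Pearson correlation between ground truth and prediction."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / sqrt(var_t * var_p)


def mean_linear_error(y_true, y_pred):
    """Assumption: mean absolute difference between ground truth and prediction."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
```

Because the unweighted average recall averages over classes rather than over instances, a system that ignores a small class is penalised as heavily as one that ignores a large class.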

All Sub-Challenges allow contributors to find their own features with their own classification algorithm. However, a standard feature set will be given per corpus that may be used. Participants will have to stick to the definition of training, development, and test sets. They may report on results obtained on the development set, but have only two trials to upload their results on the test set, whose labels are unknown to them. Each participation will be accompanied by a paper presenting the results that undergoes peer-review. The organisers reserve the right to re-evaluate the findings, but will not participate themselves in the Challenge. Participants are encouraged to compete in multiple Sub-Challenges.

Overall, contributions using the provided or an equivalent database are sought in (but not limited to) the following areas:

  • Participation in any of the Sub-Challenges
  • Combined determination of paralinguistic information under mutual information exploitation
  • Novel features and algorithms for the detection of paralinguistic information
  • Novel corpora and evaluations for paralinguistic tasks

The results of the Challenge will be presented at a Special Session of Interspeech 2010 in Makuhari, Japan.
Prizes will be awarded to the Sub-Challenge winners and a best paper.
If you are interested and planning to participate in the Paralinguistic Challenge, or if you want to be kept informed about the Challenge, please send the organisers an e-mail to indicate your interest.

To get started: Please obtain the license agreement (aGender corpus) (TUM AVIC corpus) to get a password and further instructions for the dataset download. Please fill it out, sign it, and fax it as instructed. After downloading the data you can directly start your experiments with the train and development sets. Once you have found your best method, you should write your paper for the special session. At the same time you can compute your results per instance and Sub-Challenge task on the test set once and upload them. We will then let you know the resulting performance.

Paper on the Challenge: The introductory Paper on the Challenge provides extensive descriptions and baseline results. All participants will be asked to avoid repetitions of Challenge, data, or feature descriptions in their submissions, but include the following citation:

Schuller, B.; Steidl, S.; Batliner, A.; Burkhardt, F.; Devillers, L.; Mueller, C.; Narayanan, S.: “The Interspeech 2010 Paralinguistic Challenge”, Interspeech (2010), ISCA, Makuhari, Japan, 2010.

Result Submission: Below are the templates to mail us your results on the Test partitions in the WEKA ARFF file format (detailed information on the file format can also be found here). Note that exactly one prediction has to be provided for each instance and all instances have to be covered. Result confidences are optional. However, if provided, they should be in the interval [0.000;1.000] and sum up to 1 over classes per instance. The idea behind this is that we want to combine all final predictions of the participants in a ROVER or Ensemble Learning experiment to provide an impression of obtainable performance with “combined efforts”.

Result files need to be mailed to the organisers. The ZIP file format may be used. The result on the Test partitions will be mailed back to the sending address. Please use the same address as on the license agreement. Results on the Develop partitions will need to be calculated by yourself.

Templates (Confidences are optional! The files can be made with any text editor):

Age and Gender Sub-Challenge:

Either use the 7-class format clearly indicating to which (Age/Gender/Age+Gender) Sub-Challenges you intend to submit it:


@relation MySite_ISPC_(Age|Gender|Age+Gender)_Sub-Challenge
@attribute name string
@attribute assigned_class {1,2,3,4,5,6,7}
@attribute confidence_1 numeric
@attribute confidence_2 numeric
@attribute confidence_3 numeric
@attribute confidence_4 numeric
@attribute confidence_5 numeric
@attribute confidence_6 numeric
@attribute confidence_7 numeric


OR for the Age Sub-Challenge the 4-class format:


@relation MySite_ISPC_Age_Sub-Challenge
@attribute name string
@attribute assigned_class {C,Y,A,S}
@attribute confidence_C numeric
@attribute confidence_Y numeric
@attribute confidence_A numeric
@attribute confidence_S numeric


OR for the Gender Sub-Challenge the 3-class format:


@relation MySite_ISPC_Gender_Sub-Challenge
@attribute name string
@attribute assigned_class {x,f,m}
@attribute confidence_x numeric
@attribute confidence_f numeric
@attribute confidence_m numeric


Affect Sub-Challenge:

The assigned LOI has to be in [-1;1].


@relation MySite_ISPC_Affect_Sub-Challenge
@attribute name string
@attribute assigned_loi_norm numeric
@attribute confidence numeric
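
As an illustration of the templates above, the following Python sketch builds a result file in the 4-class Age format and checks the stated constraints (one prediction per instance, all instances covered, confidences in [0;1] that sum to 1). Several details are assumptions rather than part of the Challenge instructions: the function name, the instance names, the `@data` section (which follows the general ARFF format), and the use of `?` (the ARFF missing-value marker) when confidences are omitted.

```python
import math

AGE_CLASSES = ["C", "Y", "A", "S"]


def age_results_arff(site, predictions, expected_names):
    """Return the 4-class Age result file as a string.

    predictions: dict {instance name: (assigned class, confidences or None)},
                 where confidences is a dict over AGE_CLASSES.
    expected_names: names of all Test instances; each one must be covered.
    """
    # A dict holds at most one prediction per name; here we check coverage.
    missing = set(expected_names) - set(predictions)
    extra = set(predictions) - set(expected_names)
    if missing or extra:
        raise ValueError(f"missing: {sorted(missing)}, unexpected: {sorted(extra)}")

    lines = [
        f"@relation {site}_ISPC_Age_Sub-Challenge",
        "@attribute name string",
        "@attribute assigned_class {C,Y,A,S}",
    ]
    lines += [f"@attribute confidence_{c} numeric" for c in AGE_CLASSES]
    lines += ["", "@data"]

    for name in expected_names:
        label, conf = predictions[name]
        if label not in AGE_CLASSES:
            raise ValueError(f"{name}: unknown class {label!r}")
        if conf is None:
            # Assumption: omitted confidences are written as ARFF missing values.
            fields = ["?"] * len(AGE_CLASSES)
        else:
            values = [conf[c] for c in AGE_CLASSES]
            if any(v < 0.0 or v > 1.0 for v in values):
                raise ValueError(f"{name}: confidence outside [0, 1]")
            if not math.isclose(sum(values), 1.0, abs_tol=1e-3):
                raise ValueError(f"{name}: confidences do not sum to 1")
            fields = [f"{v:.3f}" for v in values]
        lines.append(",".join([f"'{name}'", label] + fields))
    return "\n".join(lines) + "\n"


# Hypothetical instance names, for illustration only.
example = age_results_arff(
    "MySite",
    {
        "utt_0001": ("Y", {"C": 0.1, "Y": 0.6, "A": 0.2, "S": 0.1}),
        "utt_0002": ("S", None),
    },
    ["utt_0001", "utt_0002"],
)
```

The same pattern should carry over to the 7-class, Gender, and Affect templates by changing the relation name, the class list, and the attribute lines.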


Paper Submission (all participants): Each contribution in the Challenge shall be accompanied by a paper submitted to the special session "INTERSPEECH 2010 Paralinguistic Challenge" with the following conditions:

  • The deadline for submission of the papers and results is the INTERSPEECH 2010 paper submission deadline: 30th of April 2010.
  • The papers will undergo the normal review process.
  • Papers shall not repeat the descriptions of database, labels, partitioning etc. of the aGender and TUM AVIC corpora but cite the introductory paper (cf. above).
  • Participants may contribute in all Sub-Challenges at a time.
  • A development set will allow for tests and results to be reported by the participants apart from their results on the official test set.
  • Papers may well report additional results on other databases.
  • An additional publication is planned that summarizes all results of the Challenge and results combination by ROVER or ensemble techniques. However, this publication is expected to appear after INTERSPEECH 2010.

Frequently asked questions: you might find an answer to your questions here:
  • Q: I saw that the results for the challenge needed to be submitted in arff file format, but I am not familiar with this kind of file, how do I need to write the results file, is there any program that I can use or should I just write it like any other txt file?
    A: There is no special program - just use any text editor. Remember, confidences are optional. Thus, the principle is very easy: we need the unique identifier of the instance ("name" of the speech turn) and your prediction for this instance (the class for Age or Gender or the numeric value in [-1;1] for Affect).
  • Q: Can I use the order of instances also for TUM AVIC's Test partition?
    A: You cannot. While this information is given for the Train and Development partitions and can be used throughout model training and tests on Develop, it was decided to hide order of the Test partition instances within the Challenge.
  • Q: Can I use additional databases within the Challenge?
    A: You can - as long as these are well known and accessible to everybody.
  • Q: How can I find the association between the 7 age classes of the aGender corpus and the four different age groups needed for the classification task (Child/Young/Adult/Senior)?
    A: In general, the aGender group class (1-7) corresponds with the age class (c [child], y [youth], a [adult], s [senior]) like this (gender f (female), m (male), x (child)):
    1 -> c (x),
    2 -> y (f),
    3 -> y (m),
    4 -> a (f),
    5 -> a (m),
    6 -> s (f),
    7 -> s (m).
    Note that the given age in years might differ by one year due to birthdays near the date of speech collection. There are six cases where youth stated an incorrect age, namely (2512, 2514, 2543, 2873, 3105, 3994). Nonetheless, we were assured by the (external) speaker recruiter that these speakers are indeed youth. So, DECISIVE for the Challenge is the age group as indicated above, NOT the age in years. Thus, the age group can be handled either as combined age/gender task (classes 1-7) or as age group task independent of gender (c,y,a,s) for reporting results in your paper obtained on the development set. For the final submission of results any of these can be used, but we will use only the age group information for the competition in the Age Sub-Challenge (by mapping {1,...,7} to {c,y,a,s} as denoted).
  • Q: The word/phone level information is not available for the blind test set of TUM AVIC. If we want to use word alignment or transcripts for lexical info, should we run ASR to obtain that?
    A: Yes, you should. The Challenge aims at the "real-life" case, i.e. as if running in a system. As the Challenge is primarily audio-based (and not text-based), spoken content of the test set needs to be recognised by ASR rather than using perfect transcription - in the end even ASR of affective speech may be a challenge. However, for evaluation of best suited textual analysis methods for interest determination the development set may be used providing perfect transcription.
  • Q: Can we use the TUM AVIC corpus for training in the age and gender classification experiment on the Challenge?
    A: Yes, you can. The additional use of TUM AVIC for training within the Age- and Gender-Sub-Challenges and vice versa is allowed. Please make sure to describe this well in the paper.
  • Q: We would like to perform experiments with TUM AVIC to be reported in a Ph.D. dissertation. Do we need an extra license for these results to be reported in this format?
    A: Yes, you do. However, reporting the results from your INTERSPEECH paper, with citation of your paper and the official Challenge and TUM AVIC corpus papers, is permitted. We will try to provide a new license for use outside the Challenge, yet no earlier than after INTERSPEECH.
  • Q: May I participate in several Sub-Challenges?
    A: Yes, of course. Every site may participate in every Sub-Challenge. The number of Sub-Challenges a site participates in is determined by the results included in the paper at the moment of submission. Please send us an additional mail to indicate which Sub-Challenges you participate in, so that we know.
  • Q: Do I have to submit a paper in order to participate?
    A: Yes, the submission of a paper is mandatory. Please make sure to select the special session during the submission procedure. The prizes will be awarded during the Special Session in Makuhari.
  • Q: What are the deadlines?
    A: The only deadline is the official INTERSPEECH paper submission deadline on Friday, 30 April 2010.
  • Q: May I include results on other corpora in my paper?
    A: Yes, of course. As long as they fit the general focus, these are very welcome.
  • Q: What are the other formalities with respect to the paper of mine?
    A: Please make sure to reference the official paper on the Challenge and avoid repetition of the general data and Challenge description.
  • Thank you and welcome to the Challenge!
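
The mapping from the seven aGender classes to age groups and genders, as listed in the FAQ above, can be written down as a small lookup table. The Python names below are hypothetical; the table itself is copied from the FAQ answer. Note that the 4-class Age template writes the age groups in upper case (C, Y, A, S), whereas the FAQ uses lower case.

```python
# aGender class (1-7) -> (age group, gender), as listed in the FAQ:
# c = child, y = youth, a = adult, s = senior; x = child, f = female, m = male.
AGENDER_CLASSES = {
    1: ("c", "x"),
    2: ("y", "f"),
    3: ("y", "m"),
    4: ("a", "f"),
    5: ("a", "m"),
    6: ("s", "f"),
    7: ("s", "m"),
}


def to_age_group(agender_class):
    """Age group used in the Age Sub-Challenge for a 7-class label."""
    return AGENDER_CLASSES[agender_class][0]


def to_gender(agender_class):
    """Gender class used in the Gender Sub-Challenge for a 7-class label."""
    return AGENDER_CLASSES[agender_class][1]


# Example: collapse a list of 7-class predictions to age groups.
age_groups = [to_age_group(k) for k in [1, 3, 4, 7]]
```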

More information will follow shortly and on a regular basis.

