
is15-compare

Homepage of the INTERSPEECH 2015 COMputational PARalinguistics challengE (ComParE 2015)

 

Computational Paralinguistics Challenge (ComParE), Interspeech 2015

Degree of Nativeness, Parkinson’s & Eating Condition



Last updated:

9 March 2015

Last addition:

Paper on the Challenge released AND additional meta-data provided to participants of the PC Sub-Challenge (please contact us if you registered for this Sub-Challenge but have not received the access information).

Organisers:

Björn Schuller (University of Passau, Germany AND Imperial College London, UK)
Stefan Steidl (FAU Erlangen-Nuremberg, Germany)
Anton Batliner (TUM, Germany)
Florian Hönig (FAU Erlangen-Nuremberg, Germany)
Juan Rafael Orozco-Arroyave (FAU Erlangen-Nuremberg, Germany AND Universidad de Antioquia, Colombia)

Sponsored by:

Association for the Advancement of Affective Computing (AAAC)
audEERING UG (limited)
iHEARu



Officially started:

5 January 2015: Eating Condition Sub-Challenge
22 January 2015: All Sub-Challenges


Deadlines:

20 March 2015 Paper submission to INTERSPEECH 2015
05 June 2015 Final result upload
10 June 2015 Camera-ready paper




Baseline results are available:

Download the Paper on the Challenge.




Call for Participation as PDF.

*Get started:*
License Agreement (Degree of Nativeness) for the dataset download (Brave New Approach and Degree of Nativeness Sub-Challenges)
License Agreement (Parkinson's Condition) for the dataset download (Parkinson Condition Sub-Challenge)
License Agreement (Eating Condition) for the dataset download (Eating Condition Sub-Challenge)


*Read about it:* Paper on the Challenge to be cited.
*FAQ:* Frequently asked questions.



The Challenge

The Interspeech 2015 Computational Paralinguistics ChallengE (ComParE) is an open Challenge dealing with states of speakers as manifested in the acoustic properties of their speech signal. There have so far been six consecutive Challenges at INTERSPEECH since 2009 (cf. the repository), but many highly relevant paralinguistic phenomena have not yet been covered. Thus, we introduce three new tasks with the Degree of Nativeness (DN) Sub-Challenge, the Parkinson's Condition (PC) Sub-Challenge, and the Eating Condition (EC) Sub-Challenge. In addition, we present for the first time the Brave New Approach (BNA) Sub-Challenge, which is based upon the DN Sub-Challenge data but encourages new ideas and interesting "alternative" approaches beyond sheer performance optimisation.

For all these tasks, the data are provided by the organisers. They cover a high diversity of speakers and several languages: (non-native) English, Spanish, and German.

The BNA Sub-Challenge introduces a new, "open" mode in which the participants know the test labels. The task is thus not necessarily to obtain the highest possible performance on unknown test data, but to come up with new ideas and interesting "alternative" approaches. These may span the spectrum from "good old phonetic/linguistic approaches" to innovative ideas and paradigm shifts in conventional paralinguistic methods. The underlying problem, assessing degree of nativeness within a speaker- and item-independent cross-corpus setting, is rather new and difficult. We will provide additional meta-data for this open challenge. A "best paper award" will be given within the BNA Sub-Challenge; the award decision will be based on the standard reviewing procedure of Interspeech plus additional reviewing by the organising committee. Participants of the BNA Sub-Challenge are invited to also take part in the DN Sub-Challenge described next; in this way, we want to provide space for new ways of thinking while allowing for comparability.

In the DN Sub-Challenge, the task is to assess the prosody of non-native English speech on a continuous scale. DN is a cross-corpus task, employing material from the AUWL and ISLE corpora for training, and parts of the C-AuDiT corpus for development of models.

In AUWL, learners of English as a second language practised pre-scripted dialogues. The material used here comprises 31 speakers (gender: 13 female, 18 male; age: 36.5 ± 15.3 years; native languages: 2 Arabic, 1 Brazilian Portuguese, 3 Chinese, 1 French, 16 German, 1 Hungarian, 4 Italian, 3 Japanese), 5.5 hours, 3732 speech files (423 distinct sentences/phrases). Each speech file was annotated by five phoneticians with respect to its prosody (sentence melody and rhythm) on a five-point scale ranging from (1) for normal to (5) for very unusual. The combined labels have an average of 1.8 and a standard deviation of 0.6.

From ISLE, we used material comprising 36 speakers (gender: 11 female, 25 male; native languages: 20 German, 16 Italian), 0.3 hours, 158 speech files (5 distinct sentences). Prosody scores were collected in a similar manner (2.1 ± 0.5). To account for the cross-corpus setting, we provide lists for an optional 4-fold cross-validation that is both speaker- and text-independent.

The development material taken from the C-AuDiT corpus comprises 58 speakers (gender: 31 female, 27 male; native languages: 26 German, 10 French, 10 Spanish, 10 Italian, 2 Hindi), 999 speech files (19 distinct sentences). Prosody scores were collected in a similar manner, except for using a 3-point scale (0.5 ± 0.3). Because the rating scales thus differ between corpora, we use Spearman's rank correlation coefficient as the evaluation measure.

Additional material that may but need not be used comprises: (a) the word sequence the learners were supposed to produce, which can be used as a transcription since recordings with word errors were excluded; and (b) a pronunciation dictionary with syllable boundaries and word accent positions.

The databases are described in more detail in the following paper, which is available for download:

F. Hönig, A. Batliner, and E. Nöth, "Automatic Assessment of Non-Native Prosody - Annotation, Modelling and Evaluation," in Proc. of the International Symposium on Automatic Detection of Errors in Pronunciation Training (IS ADEPT), Stockholm, 2012, pp. 21–30.

As a novelty, the test material will be entirely "cross-corpus", i.e., not only speaker- and item-independent but also recorded with different microphones and acoustics and labelled by different annotators. This way, we can realise a more realistic DN Sub-Challenge, within a setting that mirrors real life more closely, using data that have not been processed and screened for years.
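Since the DN labels come from different rating scales, a rank-based measure such as Spearman's rank correlation only compares the ordering of predictions and labels, not their absolute values. The following is a minimal sketch of that measure in plain Python, not the organisers' scoring script; the function names and the toy values are hypothetical, and tied values are assumed to receive the mean of their rank positions.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to the end of the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(labels, predictions):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(labels), average_ranks(predictions))


# Hypothetical toy data: labels on a five-point scale, predictions on
# an arbitrary continuous scale.
human_scores = [1, 2, 2, 4, 5]
predicted_scores = [0.1, 0.4, 0.3, 0.8, 0.9]
rho = spearman(human_scores, predicted_scores)
```

Because only ranks enter the computation, rescaling either list by any strictly increasing function leaves `rho` unchanged, which is what makes the measure usable across the five-point and three-point annotations.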

The PC Sub-Challenge employs audio files for training and development recorded by Universidad de Antioquia in Colombia. 50 patients with Parkinson’s Condition were recorded in a soundproof booth. The data comprise a total of 42 speech tasks per speaker, including 24 isolated words, 10 sentences, one reading text, one monologue, and the rapid repetition of syllables. The total duration of the recordings is 1.4 hours. The age of the male patients ranges from 33 to 77 years and the age of the female patients from 44 to 75 years. The neurological state of the patients was evaluated by an expert neurologist according to the Unified Parkinson’s Disease Rating Scale (motor subscale: UPDRS-III). The values of these neurological evaluations range from 5 to 92. Details are found in the following paper, which is available for download:

J. Orozco-Arroyave, J. Arias-Londono, J. Vargas-Bonilla, M. González-Rátiva, and E. Nöth, “New Spanish speech corpus database for the analysis of people suffering from Parkinson’s disease,” in Proceedings of the 9th Language Resources and Evaluation Conference (LREC), 2014, pp. 342–347.

An additional test partition with labels unknown to the participants comprises eleven further subjects.

The EC Sub-Challenge features the iHEARu-EAT corpus with 30 subjects (15 female, 15 male) who speak, in part, while eating. The language of the corpus is German and all subjects are native speakers or have close-to-native competence. Six food classes are included in the recordings made while eating: crisps, biscuits, gummy bears, bananas, apples, and nectarines. Both read and spontaneous speech were recorded. The read speech comprises six pre-defined sentences from "The North Wind and the Sun". All in all, there are 1.4 k utterances and 2:53 hours of speech. In the Challenge, the task is to perform seven-way classification per utterance (one of the six food classes or "No Food"). The data are partitioned into a CV-training set, on which leave-one-speaker-out cross-validation can be performed using a provided script, and a test set. Class labels, anonymous speaker IDs, speaker age, and gender are provided for the training utterances, while no information is given for the test utterances. The data are described in:

S. Hantke, F. Weninger, R. Kurle, A. Batliner, and B. Schuller, “I hear you eat and speak: automatic recognition of eating condition and food type,” (to appear - will be distributed to participants of the Challenge).
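The leave-one-speaker-out cross-validation mentioned for the EC CV-training set can be sketched as follows. This is only an illustration of the splitting idea and not the script provided by the organisers; the function name and the speaker IDs are hypothetical.

```python
from collections import defaultdict


def leave_one_speaker_out(speaker_ids):
    """Yield (held_out_speaker, train_indices, test_indices) per speaker.

    speaker_ids holds one anonymous speaker ID per utterance, in
    utterance order. Each fold tests on all utterances of one speaker
    and trains on the utterances of all remaining speakers.
    """
    by_speaker = defaultdict(list)
    for index, speaker in enumerate(speaker_ids):
        by_speaker[speaker].append(index)

    for held_out in sorted(by_speaker):
        test = by_speaker[held_out]
        train = sorted(
            index
            for speaker, indices in by_speaker.items()
            if speaker != held_out
            for index in indices
        )
        yield held_out, train, test


# Hypothetical toy example: four utterances from three speakers.
utterance_speakers = ["s1", "s2", "s1", "s3"]
folds = list(leave_one_speaker_out(utterance_speakers))
```

With this kind of split no speaker appears in both the training and the test part of a fold, so the cross-validation result is speaker-independent, like the official test set.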

The INTERSPEECH 2015 Computational Paralinguistics Challenge (ComParE) aims to help bridge the gap between excellent research on paralinguistic information in spoken language and the low compatibility of results.

Four Sub-Challenges are addressed:

  • The new Brave New Approach (BNA) Sub-Challenge seeks interesting and promising alternative routes outside the conventional mainstream approaches in an open environment with known test labels and rich meta information.
  • In the Degree of Nativeness (DN) Sub-Challenge, the degree of nativeness has to be determined based on acoustics.
  • In the Parkinson's Condition (PC) Sub-Challenge, the degree of Parkinson’s condition has to be recognised based on speech analysis.
  • In the Eating Condition (EC) Sub-Challenge, eating condition and the type of consumed food (seven classes) of a speaker have to be determined for the first time.

The measure of competition will be Spearman Correlation for DN and PC, and Unweighted Accuracy for EC. We will provide scripts for doing optional cross-validation during model optimisation and training. Orthographic transcriptions of the sets will be known. All Sub-Challenges allow contributors to use their own features with their own machine learning algorithm. However, a standard feature set will be provided that may be used. Participants will have to stick to the definition of training, development, and test sets as given. They may report results obtained on the development sets, but have only a limited number of trials to upload their results on the test sets, whose labels are unknown to them: five for EC and ten each for DN and PC. Each participation has to be accompanied by a paper presenting the results that undergoes the normal Interspeech peer-review and has to be accepted for the conference in order to participate in the Challenge. The organisers reserve the right to re-evaluate the findings, but will not participate themselves in the Challenge.
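As an illustration of the EC measure, the sketch below computes Unweighted Accuracy under the assumption that the term denotes the unweighted mean of the per-class recalls, so that every class counts equally regardless of how many utterances it has. The function name and the toy labels are hypothetical, and this is not the official scoring code.

```python
def unweighted_accuracy(true_labels, predicted_labels):
    """Mean of per-class recalls over the classes present in true_labels."""
    totals = {}
    correct = {}
    for truth, prediction in zip(true_labels, predicted_labels):
        totals[truth] = totals.get(truth, 0) + 1
        if truth == prediction:
            correct[truth] = correct.get(truth, 0) + 1
    # Recall per class: correctly recognised utterances / utterances of that class.
    recalls = [correct.get(label, 0) / count for label, count in totals.items()]
    return sum(recalls) / len(recalls)


# Hypothetical toy example: a system that always answers "No Food".
truth = ["No Food", "No Food", "No Food", "Banana"]
guess = ["No Food", "No Food", "No Food", "No Food"]
score = unweighted_accuracy(truth, guess)  # 0.5, although 3 of 4 are correct
```

The toy example shows why this measure is informative for unbalanced classes: always predicting the majority class yields a plain accuracy of 0.75 here but an Unweighted Accuracy of only 0.5.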

We encourage both contributions aiming at the highest performance w.r.t. the baselines provided by the organisers and contributions aiming at finding new and interesting insights w.r.t. these data. Overall, contributions using the provided or equivalent data are sought, including (but not limited to):

  • Participation in a Sub-Challenge
  • Contributions focussing on Computational Paralinguistics centred around the Challenge topics

The results of the Challenge will be presented at Interspeech 2015 in Dresden, Germany.
Prizes will be awarded to the Sub-Challenge winners. In addition, we plan a "best paper award" for the paper with the most innovative and/or interesting approach in the BNA Sub-Challenge.
If you are interested and planning to participate in the Computational Paralinguistics Challenge, or if you want to be kept informed about the Challenge, please send the organisers an e-mail to indicate your interest.




To get started: Please obtain the License Agreement (BNA and DN Sub-Challenges) or License Agreement (PC Sub-Challenge) or License Agreement (EC Sub-Challenge) to get a password and further instructions for the download of the datasets. Please fill it out for the dataset(s) you wish to obtain, then print, sign, scan, and email it accordingly. The agreement(s) must be signed by a permanent staff member.
After downloading the data you can directly start your experiments with the data sets. Once you have found your best method, you should write your paper for the Special Session. At the same time you can compute your results per instance and Sub-Challenge task on the test set and upload them: we will then let you know your performance result.

Paper on the Challenge: The introductory Paper on the Challenge provides extensive descriptions and baseline results. All participants are asked to avoid repeating Challenge, data, or feature descriptions in their submissions (of course, they should briefly describe the essentials of the databases dealt with) and to include the following citation:

Björn Schuller, Stefan Steidl, Anton Batliner, Simone Hantke, Florian Hönig, Juan Rafael Orozco-Arroyave, Elmar Nöth, Yue Zhang, Felix Weninger: "The INTERSPEECH 2015 Computational Paralinguistics Challenge: Nativeness, Parkinson's & Eating Condition", Proceedings INTERSPEECH 2015, ISCA, Dresden, Germany, 2015.

Result Submission: Opening soon...

Paper Submission (all participants): Please be reminded that a paper submission and at least one upload on the test set are mandatory for participation in the Challenge. However, paper contributions within the scope are also welcome if the authors do not intend to participate in the Challenge itself. In any case, please submit your paper by 20 March 2015 (and final results by 05 June 2015) to the regular submission system, using the standard style info and respecting length limits. However, as topic you should choose only this Special Session (Computational Paralinguistics Challenge 2015). Please note that

  • The deadline for submission of the papers and results is the INTERSPEECH 2015 paper submission deadline: 20 March 2015. Remaining result upload trials can be saved for new Challenge results until 05 June 2015.
  • The papers will undergo the normal review process.
  • Papers shall not repeat the descriptions of database, labels, partitioning etc. of the DN, PC, and EC Sub-Challenge corpora but cite the introductory paper and the corresponding papers for the databases (cf. also above):

    Björn Schuller, Stefan Steidl, Anton Batliner, Simone Hantke, Florian Hönig, Juan Rafael Orozco-Arroyave, Elmar Nöth, Yue Zhang, Felix Weninger: "The INTERSPEECH 2015 Computational Paralinguistics Challenge: Nativeness, Parkinson's & Eating Condition", Proceedings INTERSPEECH 2015, ISCA, Dresden, Germany, 2015.

  • Participants may contribute to all Sub-Challenges at the same time.
  • A training and development partitioning will allow for tests and results to be reported by the participants apart from their results on the official test set.
  • Papers may well report additional results on other databases.
  • An additional publication is planned that summarises all results of the Challenge and results combination by ROVERING or ensemble techniques. However, this publication is expected to appear after INTERSPEECH 2015.



Frequently asked questions: you might find an answer to your questions here:
  • Q: Are there scripts to extract the ComParE feature set with openSMILE available?
    A: Please see here.
  • Q: Do I have to include test set results for the paper submission?
    A: Yes. However, you can update these in the final camera-ready version. In any case, at least one result submission has to be made before the paper deadline.
  • Q: Can we submit one paper to Interspeech describing participation in several Sub-Challenges?
    A: This is possible - obviously the paper should read well and contain all necessary details, as it needs to pass peer-review as condition for successful Challenge participation.
  • Q: Can we submit more than one paper to Interspeech describing participation in different Sub-Challenges?
    A: This is possible - you can submit, e.g., separate papers for the participation in different Sub-Challenges or for different approaches in the same Sub-Challenge. Each one will undergo peer-review independently.
  • Q: Can we have richer meta-data for the databases, so that we can also look at other things?
    A: Items not available in the current download package will only be available after the Challenge. Then, however, some further information will be made available.
  • Q: My results are below the baseline - does it make sense to submit?
    A: Of course it does. We do not know whether the baseline will be surpassed, and different experiences with the tasks on the same dataset will be of interest. Please remember that all submissions to the Challenge go through the normal reviewing process. Although it is very likely that the reviewers do know - and take into account - the baselines, the criteria are the usual ones, i.e., scientific quality; surpassing any baseline - be this the one given for this challenge, or another one known from the literature - is just one of the criteria. A paper reporting results above the baseline, but poorly written, runs a high risk of *not* being accepted; in contrast, a paper which is well written and contributes to our knowledge, but reports results below the baseline, has a good chance of being accepted.
  • Q: When is the deadline for submitting the results?
    A: You will need to submit results by 05 June 2015, prior to the camera-ready paper submission to INTERSPEECH, as a result on the test set needs to be included in your final paper version if you want to compete for the Sub-Challenge awards. Of the five (EC) / ten (DN/PC) result submissions per Sub-Challenge and participant, one has to be used before the paper deadline on 20 March 2015; all others can be saved for submission as late as 05 June 2015.
  • Q: How will the data be distributed to the participants? Are you sending the test data at the same time with training and development partitions?
    A: Yes, we do. However, labels are only given for training and development partitions. Further, we are not sending the data - you will need to download the data. Please first download, print, and sign the license agreements - one per data set once the agreements are available (cf. above) - and scan and mail or fax these to the addresses given on the agreements. You will then receive an email with download instructions.
  • Q: Can I use additional databases within the Challenge?
    A: You can - as long as these are well known and accessible to everybody.
  • Q: May I participate in several Sub-Challenges?
    A: Yes, of course. Every site may participate in every Sub-Challenge and the five (EC) / ten (DN/PC) upload trials are per Sub-Challenge. The number of Sub-Challenges a site participates in is determined by the results included in the paper at the time of submission. Please send us an additional e-mail to indicate which Sub-Challenges you participate in, so that we know.
  • Q: Do I have to submit a paper in order to participate?
    A: Yes, the submission and acceptance of a paper are mandatory. Please make sure to select the special event during the submission procedure. The prizes will be awarded during INTERSPEECH 2015 in Dresden, Germany.
  • Q: May I include results on other corpora in my paper?
    A: Yes, of course. As long as they fit the general focus, such results are very welcome.
  • Q: What are the other formalities with respect to the paper of mine?
    A: Please make sure to reference the official Paper on the Challenge and avoid repetition of the general data and Challenge description.
Thank you and welcome to the Challenge!



More Information will follow on a regular basis.


