This is a collection of emotional databases. It is based on the HUMAINE deliverable D5c.
Table 1: Multimodal databases (Note: in Column 2 ‘audiovisual’ refers to speech and face unless otherwise indicated)
|
Identifier |
Modalities |
Emotional content |
Emotion elicitation methods |
Size |
Nature of material |
Language |
| HUMAINE Database Stage 1 (described in full and downloadable from www.emotion-research.net/download/pilot-db/) |
Audiovisual + gesture: .avi files readable in ANVIL (see www.dfki.de/~kipp/anvil/); data files containing emotion labels, gesture labels, speech labels and FAPs, all readable in ANVIL |
Wide range, labelled for emotional content at 2 levels. (i) Global labels applied across a whole emotion episode or 'clip'. There are 8 global descriptive categories: emotion words, emotion-related states, combination types, authenticity, core affect dimensions, context labels, key events (what the emotion is about etc.), appraisal categories. (ii) Time-aligned emotion labels assigned using the TRACE set of programs designed at Queen's University Belfast. Raters use a mouse to trace their perceived impression of the emotional state of the speaker in the clip continuously over time on a one-dimensional axis (e.g. intensity, activation/arousal, valence, power). Traces are available on seven dimensions. |
Naturalistic (including some clips from Belfast Naturalistic Database, Castaway Reality TV) and induced material (from Belfast Sensitive Artificial Listener, Belfast Spaghetti Data, Belfast Activity Data, EmoTaboo) |
50 clips ranging from 5 seconds to 3 minutes (16 clips currently available with ethical clearance and labelled for emotional content, emotional context and some signs of emotion: speech signs for all 16, and gesture and FAPs for one exemplar clip) |
Mainly interactive discourse |
English and some French and Hebrew clips |
|
Belfast Naturalistic Database (Douglas-Cowie et al 2000, 2003) |
Audio-visual |
Wide range |
Natural: 10-60 sec long ‘clips’ taken from television chat shows, current affairs programmes and interviews conducted by research team |
125 subjects; 31 male, 94 female |
Interactive unscripted discourse |
English |
|
Geneva Airport Lost Luggage Study (Scherer & Ceschi 1997; 2000) |
Audio-visual |
Anger, good humour, indifference, stress, sadness |
Natural: unobtrusive videotaping of passengers at Geneva airport lost luggage counter followed up by interviews with passengers |
109 subjects |
Interactive unscripted discourse |
|
|
Chung (Chung 2000) |
Audio-visual |
Joy, neutrality, sadness (distress) |
Natural: television interviews in which speakers talk on a range of topics including sad and joyful moments in their lives |
77 subjects; 61 Korean speakers, 6 Americans |
Interactive unscripted discourse |
English and Korean |
|
SMARTKOM www.phonetik.uni-muenchen.de/Bas/BasMultiModaleng.html#SmartKom |
Audio-visual (+ gestures) |
Joy, gratification, anger, irritation, helplessness, pondering, reflecting, surprise, neutral |
Human-machine interaction in WOZ scenario: solving tasks with system |
224 speakers; 4/5 minute sessions |
Interactive discourse |
German |
|
Amir et al. (Amir et al, 2000) |
Audio + physiological (EMG, GSR, heart rate, temperature, speech) |
Anger, disgust, fear, joy, neutrality, sadness |
Induced: subjects asked to recall personal experiences involving each of the emotional states |
140 subjects; 60 Hebrew speakers, 1 Russian speaker |
Non interactive, unscripted discourse |
Hebrew, Russian |
|
SALAS database (http://www.image.ntua.gr/ermis/ IST-2000-29319, D09) |
Audio-visual |
Wide range of emotions/emotion related states but not very intense |
Induced: subjects talk to artificial listener & emotional states are changed by interaction with different personalities of the listener |
Pilot study of 20 subjects |
Interactive discourse: subjects unscripted, machine scripted |
English (Greek version being developed) |
|
ORESTEIA database (McMahon et al. 2003) |
Audio + physiological (some visual data too) |
Stress, irritation, shock |
Induced: subjects encounter various problems while driving (deliberately positioned obstructions, dangers, annoyances ‘on the road’) |
29 subjects, 90-minute sessions per subject |
Non interactive speech: giving directions, giving answers to mental arithmetic etc |
English |
|
Belfast Boredom database (Cowie et al. 2003) |
Audio-visual |
Boredom |
Induced |
12 subjects: 30 minutes each |
Non interactive speech: naming objects on computer screen |
English |
|
XM2VTSDB multi-modal face database http://www.ee.surrey.ac.uk/Research/VSSP/xm2vtsdb/ |
Audio-visual |
None |
n/a |
295 subjects; video |
High quality colour images, 32 kHz 16-bit sound files, video sequences and a 3D model + profiles (one left-profile and one right-profile image per person, per session, a total of 2,360 images); scripted: 4 sentences |
English |
|
ISLE project corpora (http://isle.nis.sdu.dk/, IST project IST-1999-10647) |
Audio-visual + gesture |
None |
n/a |
unclear |
|
|
|
Polzin (Polzin, 2000) |
Audio-visual (though only audio channel used) |
Anger, sadness, neutrality (other emotions as well, but in insufficient numbers to be used) |
Acted: sentence length segments taken from acted movies |
Unspecified number of speakers. Segment numbers: 1586 angry, 1076 sad, 2991 neutral |
Scripted |
English |
|
Banse and Scherer (Banse and Scherer 1996) |
Audio-visual (visual info used to verify listener judgements of emotion) |
Anger (hot), anger (cold), anxiety, boredom, contempt, disgust, elation, fear (panic), happiness, interest, pride, sadness, shame |
Acted: actors were given scripted eliciting scenarios for each emotion, then asked to act out the scenario. |
12 (6 male, 6 female) |
Scripted: 2 semantically neutral sentences (nonsense sentences composed of phonemes from Indo-European languages) |
German |
Table 2: Speech databases
|
Identifier |
Emotional content |
Emotion elicitation methods |
Size |
Nature of material |
Language |
|
TALKAPILLAR (Beller, 2005) |
Neutral, happiness, question, positive and negative surprise, anger, fear, disgust, indignation |
Contextualised acting: actors asked to read semantically neutral sentences in range of emotions, but practised on emotionally loaded sentences beforehand to get in the right mood |
1 actor reading 26 semantically neutral sentences for each emotion (each repeated 3 times, at a different activation level: low, middle, high) |
Non interactive and scripted |
French |
|
Reading-Leeds database (Greasley et al., 1995; Roach et al., 1998; Stibbard, 2001) |
Range of full blown emotions |
Natural: Unscripted interviews on radio/television in which speakers are asked by interviewers to relive emotionally intense experiences |
Around 4½ hours of material |
Interactive unscripted discourse |
English |
|
France et al. (France et al., 2000) |
Depression, suicidal state, neutrality |
Natural: therapy sessions & phone conversations. Post therapy evaluation sessions were also used to elicit speech for the control subjects |
115 subjects: 48 females, 67 males. Female sample: 10 controls (therapists), 17 dysthymic, 21 major depressed. Male sample: 24 controls (therapists), 21 major depressed, 22 high risk suicidal |
Interactive unscripted discourse |
English |
|
Campbell CREST database, ongoing (Campbell 2002; see also Douglas-Cowie et al. 2003) |
Wide range of emotional states and emotion-related attitudes |
Natural: volunteers record their |
Target - 1000 hrs over 5 years |
Interactive unscripted discourse |
English, Japanese, Chinese |
|
Capital Bank Service and Stock |
Mainly negative - fear, anger, stress |
Natural: call center human-human interactions |
Unspecified (still being labelled) |
Interactive unscripted discourse |
English |
|
SYMPAFLY (as used by Batliner et al. 2004b) |
Joyful, neutral, emphatic, surprised, ironic, helpless, touchy, angry, panic |
Human machine dialogue system |
110 dialogues, 29,200 words (i.e. tokens, not vocabulary) |
Naïve users book flights using machine dialogue system |
German |
|
DARPA Communicator corpus (as used by Ang et al. 2002) See Walker et al. 2001 |
Frustration, annoyance |
Human machine dialogue system |
Extracts from recordings of simulated interactions with a call centre, average length about 2.75 words. 13187 utterances in total, of which 1750 are emotional: 35 unequivocally frustrated, 125 predominantly frustrated, 405 unequivocally frustrated or annoyed, 1185 predominantly frustrated or annoyed |
Users called systems built by various sites and made air travel arrangements over the phone |
English |
|
AIBO (Erlangen database) (Batliner et al. 2004a) |
Joyful, surprised, emphatic, helpless, touchy (irritated), angry, motherese, bored, reprimanding, neutral |
Human machine: interaction with robot |
German: 51 children, 51,393 words (i.e. tokens, not vocabulary). English (Birmingham): 30 children, 5,822 words (i.e. tokens, not vocabulary) |
Task directions to robot |
German |
|
Fernandez et al. (Fernandez et al. 2000, 2003) |
Stress |
Induced: subjects give verbal responses to maths problems in simulated driving context |
Data reported from 4 subjects |
Unscripted numerical answers to mathematical questions |
English |
|
Tolkmitt and Scherer (Tolkmitt and Scherer, 1986) |
Stress (both cognitive & emotional) |
Induced: 2 types of stress (cognitive and emotional) were induced through slides. Cognitive stress induced through slides containing logical problems; emotional stress induced through slides of human bodies showing skin disease/accident injuries |
60 (33 male, 27 female) |
Partially scripted: subjects made 3 vocal responses to each slide within a 40-second presentation period: a numerical answer followed by 2 short statements. The start of each was scripted and subjects filled in the blank at the end, e.g. ‘Die Antwort ist Alternative …’ |
German |
|
Iriondo et al. (Iriondo et al., 2000) |
Desire, disgust, fury, fear, joy, surprise, sadness |
Contextualised acting: subjects asked to read passages written with appropriate emotional content |
8 subjects reading paragraph length passages (20-40mmsec each) |
Non interactive and scripted |
Spanish |
|
Mozziconacci (Mozziconacci, 1998) Note: database recorded at IPO for SOBU project 92EA. |
Anger, boredom, fear, disgust, guilt, happiness, haughtiness, indignation, joy, rage, sadness, worry, neutrality |
Contextualised acting: actors asked to read semantically neutral sentences in range of emotions, but practised on emotionally loaded sentences beforehand to get in the right mood |
3 subjects reading 8 semantically neutral sentences (each repeated 3 times) |
Non interactive and scripted |
Dutch |
|
McGilloway (McGilloway, 1997; Cowie and Douglas-Cowie, 1996) |
Anger, fear, happiness, sadness, neutrality |
Contextualised acting: subjects asked to read passages written in appropriate emotional tone and content for each emotional state |
40 subjects reading 5 passages each |
Non interactive and scripted |
English |
|
Belfast Structured Database: an extension of the McGilloway database above (Douglas-Cowie et al. 2000) |
Anger, fear, happiness, sadness, neutrality |
Contextualised acting: subjects read 10 McGilloway-style passages and 10 other passages (scripted versions of naturally occurring emotion in the Belfast Naturalistic Database) |
50 subjects reading 20 passages |
Non interactive and scripted |
English |
|
Danish Emotional Speech Database (Engberg et al., 1997) |
Anger, happiness, sadness, surprise, neutrality |
Acted |
4 subjects read 2 words, 9 sentences & 2 passages in a range of emotions |
Scripted (material not emotionally coloured) |
Danish |
|
Groningen, ELRA corpus number S0020 (www.icp.inpg.fr/ELRA; link working as of 16/01/2005: www.elda.org/catalogue/en/speech/S0020.html) |
Database only partially oriented to emotion |
Acted |
238 subjects reading 2 short texts |
Scripted |
Dutch |
|
Berlin database (Kienast & Sendlmeier 2000; Paeschke & Sendlmeier 2000) |
Anger-hot, boredom, disgust, fear-panic, happiness, sadness-sorrow, neutrality |
Acted |
10 subjects (5 male, 5 female) reading 10 sentences each |
Scripted (material selected to be semantically neutral) |
German |
|
Pereira (Pereira, 2000) |
Anger (hot), anger (cold), happiness, sadness, neutrality |
Acted |
2 subjects reading 2 utterances each |
Scripted (1 emotionally neutral sentence, one 4-digit number), each repeated |
English |
|
van Bezooijen (van Bezooijen, 1984) |
Anger, contempt, disgust, fear, interest, joy, sadness, shame, surprise, neutrality |
Acted |
8 (4 male, 4 female) reading 4 phrases |
Scripted (semantically neutral phrases) |
Dutch |
|
Abelin (Abelin 2000) |
Anger, disgust, dominance, fear, joy, sadness, shyness, surprise |
Acted |
1 subject |
Scripted (semantically neutral phrase) |
Swedish |
|
Yacoub et al. (2003) (data from LDC, www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2002S28) |
15 emotions: neutral, hot anger, cold anger, happy, sadness, disgust, panic, anxiety, despair, elation, interest, shame, boredom, pride, contempt |
Acted |
2433 utterances from 8 actors |
Scripted |
English |
Table 3: Databases of Faces
|
Identifier |
Emotional content |
Emotion elicitation methods |
Size |
Nature of material |
|
The AR Face Database (http://rvl1.ecn.purdue.edu/~aleix/aleix_face_DB.html) |
Smile, anger, scream, neutral |
Posed |
154 subjects (82 male, 74 female); 26 pictures per person |
1: neutral, 2: smile, 3: anger, 4: scream, 5: left light on, 6: right light on, 7: all side lights on, 8: wearing sun glasses, 9: wearing sun glasses and left light on, 10: wearing sun glasses and right light on, 11: wearing scarf, 12: wearing scarf and left light on, 13: wearing scarf and right light on, 14 to 26: second session (same conditions as 1 to 13) |
|
CVL Face Database (http://www.lrv.fri.uni-lj.si/facedb.html) |
Smile |
Posed |
114 subjects (108 male, 6 female); 7 pictures per person |
Different angles, under uniform illumination, no flash and with projection screen in the background |
|
The Psychological Image Collection at Stirling (http://pics.psych.stir.ac.uk/) |
Smile, surprise, disgust |
Posed |
Aberdeen: 116 subjects; Nottingham scans: 100; Nott-faces-original: 100; Stirling faces: 36 |
Contains 7 face databases, of which the 4 largest are: Aberdeen, Nottingham scans, Nott-faces-original, Stirling faces |
|
The Japanese Female Facial Expression (JAFFE) Database (http://www.kasrl.org/jaffe.html) |
Sadness, happiness, surprise, anger, disgust, fear, neutral |
Posed |
10 subjects; 7 pictures per subject |
6 emotion expressions + 1 neutral posed by 10 Japanese female models |
|
CMU PIE Database (CMU Pose, Illumination, and Expression (PIE) database) (http://www.ri.cmu.edu/projects/project_418.html) |
Neutral, smile, blinking and talking |
Posed for neutral, smile and blinking; 2 seconds of video capture of talking per person |
68 subjects |
13 different poses, 43 different illumination conditions, and 4 different expressions. |
|
Indian Institute of Technology Kanpur Database (http://www.cse.iitk.ac.in/users/mayankv/face.htm) |
Sad, scream, anger, expanded cheeks and exclamation, eyes open-closed, wink |
Posed |
20 subjects |
Varying facial expressions, orientations and occlusions. Orientation ranges from 0° to 20° to both right and left; similar angle variations are used for head tilting, and upward and downward head rotations are also included. All images are taken with and without glasses against a constant background; for occlusions, some portion of the face is kept hidden, and lighting variations are considered. |
|
The Yale Face Database (http://cvc.yale.edu/projects/yalefaces/yalefaces.html) |
Sad, sleepy, surprised |
Posed |
15 subjects |
One picture per different facial expression or configuration: center-light, w/glasses, happy, left-light, w/no glasses, normal, right-light, sad, sleepy, surprised, and wink |
|
CMU Facial Expression Database (Cohn-Kanade) (http://vasc.ri.cmu.edu//idb/html/face/facial_expression/index.html) |
Six of the displays were based on descriptions of prototypic emotions (i.e., joy, surprise, anger, fear, disgust, and sadness). |
Posed |
200 subjects |
Subjects were instructed by an experimenter to perform a series of 23 facial displays that included single action units (e.g., AU 12, or lip corners pulled obliquely) and combinations of action units (e.g., AU 1+2, or inner and outer brows raised). Subjects began and ended each display from a neutral face |
|
Caltech Frontal Face DB (http://www.vision.caltech.edu/html-files/archive.html) |
Unclear |
|
27 subjects 450 images in total |
Different lighting, expressions, backgrounds |
|
HumanScan BioID Face DB (http://www.humanscan.de/support/downloads/facedb.php) |
None |
n/a |
23 subjects |
Contains 20 manual markup points, numbered 0 to 19: 0 = right eye pupil, 1 = left eye pupil, 2 = right mouth corner, 3 = left mouth corner, 4 = outer end of right eye brow, 5 = inner end of right eye brow, 6 = inner end of left eye brow, 7 = outer end of left eye brow, 8 = right temple, 9 = outer corner of right eye, 10 = inner corner of right eye, 11 = inner corner of left eye, 12 = outer corner of left eye, 13 = left temple, 14 = tip of nose, 15 = right nostril, 16 = left nostril, 17 = centre point on outer edge of upper lip, 18 = centre point on outer edge of lower lip, 19 = tip of chin |
|
Oulu University Physics-Based Face Database (www.ee.oulu.fi/research/imag/color/pbfd.html) |
None |
n/a |
125 subjects |
All frontal images: 16 different camera calibration and illuminations |
|
None |
n/a |
20 subjects, 19-36 pictures per person |
Range of poses from profile to frontal views |
|
|
Olivetti Research (www.mambo.ucsc.edu/psl/olivetti.html) |
None |
n/a |
40 subjects, 10 pictures per person |
All frontal and slight tilt of head |
|
The Yale Face Database B (http://cvc.yale.edu/projects/yalefacesB/yalefacesB.html) |
None |
n/a |
10 subjects |
9 poses x 64 illumination conditions |
|
AT&T (formerly called ORL database) |
Smiling / not smiling |
Posed |
40 subjects |
10 images for each subject which vary lighting, glasses/no glasses, and aspects of facial expression broadly relevant to emotion: open/closed eyes, smiling/not smiling |
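
The markup point numbering given in the HumanScan BioID row of Table 3 (indices 0 to 19) is easy to hold in a small lookup table when working with the annotations. The following is only a minimal sketch in Python: the names `BIOID_LANDMARKS`, `landmark_name` and `label_points` are hypothetical and not part of any BioID tooling, and the assumption that coordinates arrive as a list of (x, y) pairs in index order is ours, not something stated in the table.

```python
# Hypothetical lookup table: position in the tuple = markup point index
# as listed in the BioID row of Table 3 (0 to 19).
BIOID_LANDMARKS = (
    "right eye pupil",                            # 0
    "left eye pupil",                             # 1
    "right mouth corner",                         # 2
    "left mouth corner",                          # 3
    "outer end of right eye brow",                # 4
    "inner end of right eye brow",                # 5
    "inner end of left eye brow",                 # 6
    "outer end of left eye brow",                 # 7
    "right temple",                               # 8
    "outer corner of right eye",                  # 9
    "inner corner of right eye",                  # 10
    "inner corner of left eye",                   # 11
    "outer corner of left eye",                   # 12
    "left temple",                                # 13
    "tip of nose",                                # 14
    "right nostril",                              # 15
    "left nostril",                               # 16
    "centre point on outer edge of upper lip",    # 17
    "centre point on outer edge of lower lip",    # 18
    "tip of chin",                                # 19
)


def landmark_name(index: int) -> str:
    """Return the description of a markup point given its index."""
    if not 0 <= index < len(BIOID_LANDMARKS):
        raise IndexError(f"markup point index out of range: {index}")
    return BIOID_LANDMARKS[index]


def label_points(points):
    """Pair (x, y) coordinates, assumed to be in index order, with their names."""
    if len(points) != len(BIOID_LANDMARKS):
        raise ValueError(
            f"expected {len(BIOID_LANDMARKS)} points, got {len(points)}"
        )
    return dict(zip(BIOID_LANDMARKS, points))
```

For example, `landmark_name(14)` returns the nose-tip description, and `label_points` turns one image's list of coordinates into a dictionary keyed by landmark description.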



