
This is a collection of emotional databases. It is based on the HUMAINE deliverable D5c.

Table 1: Multimodal databases  (Note: in Column 2 'audiovisual' refers to speech and face unless otherwise indicated)



Columns: emotional content; emotion elicitation methods; nature of material


HUMAINE Database

Stage 1 (described in full and downloadable)

Audiovisual + gesture

.avi files plus data files containing emotion labels, gesture labels, speech labels and FAPs, all readable in ANVIL

Wide range labelled for emotional content at 2 levels

(i) Global labels applied across a whole emotion episode or 'clip'. Eight global descriptive categories are applied: emotion words, emotion-related states, combination types, authenticity, core affect dimensions, context labels, key events (what the emotion is about, etc.) and appraisal categories.
(ii) Time-aligned emotion labels assigned using the TRACE set of programs designed at Queen's University Belfast. Raters use a mouse to trace their perceived impression of the speaker's emotional state continuously over the course of the clip on a one-dimensional axis (e.g. intensity, activation/arousal, valence, power). Traces are available on seven dimensions.
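The continuous tracing approach lends itself to simple numerical post-processing. As a hypothetical sketch (not part of the TRACE software; the function name, sample rate and rater data below are invented for illustration), individual raters' traces on one dimension can be resampled to a common time base and averaged into a consensus trace:

```python
# Sketch: merging several raters' continuous emotion traces.
# Each rater produces an irregularly sampled (time, value) trace on a
# one-dimensional axis (e.g. valence in [-1, 1]); we resample every
# trace to a fixed rate and average across raters.
import numpy as np

def merge_traces(traces, clip_duration, fs=10.0):
    """Resample each rater's (times, values) trace to fs Hz and average.

    traces: list of (times, values) array pairs, one pair per rater.
    Returns the common time base and the mean trace.
    """
    t_common = np.arange(0.0, clip_duration, 1.0 / fs)
    resampled = [np.interp(t_common, t, v) for t, v in traces]
    return t_common, np.mean(resampled, axis=0)

# Two invented raters tracing valence over a 4-second clip
rater_a = (np.array([0.0, 2.0, 4.0]), np.array([0.0, 0.5, 1.0]))
rater_b = (np.array([0.0, 1.0, 4.0]), np.array([0.0, 0.5, 0.5]))
t, gold = merge_traces([rater_a, rater_b], clip_duration=4.0)
```

In practice, published work on such traces also reports inter-rater agreement before averaging, but the resampling step above is the common denominator.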

Naturalistic (including some clips from Belfast Naturalistic Database, Castaway Reality TV) and induced material (from Belfast Sensitive Artificial Listener, Belfast Spaghetti Data, Belfast Activity Data, EmoTaboo)

50 clips ranging from 5 seconds to 3 minutes (16 clips currently available with ethical clearance, labelled for emotional content, emotional context and some signs of emotion – speech signs for all 16, and gesture and FAPs for one exemplar clip)

Mainly interactive discourse

English and some French and Hebrew clips

Belfast Naturalistic Database (Douglas-Cowie et al. 2000, 2003)

Audio-visual

Wide range

Natural:  10-60 sec long 'clips' taken from television chat shows, current affairs programmes and interviews conducted by research team

125 subjects; 31 male, 94 female

Interactive unscripted discourse


RECOLA (REmote COLlaborative and Affective interactions) Database (Ringeval et al. 2013)

Multimodal: audio (headset and webcam), video (720p webcam), EDA, ECG (Biopac recording unit)

Wide range of socio-affective behaviors labelled by 6 external annotators at 2 levels, using the ANNEMO (ANNotating EMOtions) tool designed at the DIVA group at Université de Fribourg:

(i) Discrete annotations (social information) performed across the first 5 min of each collaborative interaction. There are 5 descriptive categories (7-point Likert scale): agreement, dominance, engagement, performance and rapport.
(ii) Time-continuous emotion annotations of arousal and valence performed separately and also for the first 5 min. of interaction.

Natural: socio-affective behaviours are spontaneously triggered during remote collaborative interactions in dyads; participants had to discuss their findings on a survival task in order to reach an agreement.

34 subjects; 14 male, 20 female

Remote (dyadic) collaborative interactions


Geneva Airport  Lost Luggage Study (Scherer & Ceschi 1997; 2000)


Anger, good humour, indifference, stress, sadness

Natural: unobtrusive videotaping of passengers at Geneva airport lost luggage counter followed up by interviews with passengers

109 subjects

Interactive unscripted discourse

Chung  (Chung 2000)


Joy, neutrality, sadness (distress)

Natural: television interviews in which speakers talk on a range of topics including sad and joyful moments in their lives

77 subjects; 61 Korean speakers, 6 Americans

Interactive unscripted discourse

English and Korean


Audio-visual (+ gesture)

Joy, gratification, anger, irritation, helplessness, pondering, reflecting, surprise, neutral

Human–machine interaction in a Wizard-of-Oz (WOZ) scenario: solving tasks with the system

224 speakers; 4/5 minute sessions

Interactive discourse


Amir et al. (Amir et al, 2000)

Audio + physiological (EMG, GSR, heart rate, temperature, speech)

Anger, disgust, fear, joy, neutrality, sadness

Induced: subjects asked to recall personal experiences involving each of the emotional states

140 subjects

60 Hebrew speakers, 1 Russian speaker

Non interactive, unscripted discourse

Hebrew and Russian

SALAS database (IST-2000-29319, D09)


Wide range of emotions/emotion related states but not very intense

Induced: subjects talk to artificial listener & emotional states are changed by interaction with different personalities of the listener

Pilot study of 20 subjects

Interactive discourse

Subjects unscripted

Machine scripted

 English (Greek version being developed)

ORESTEIA database (McMahon et al. 2003)

Audio + physiological

(some visual data too)

Stress, irritation, shock

Induced: subjects encounter various problems while driving (deliberately positioned obstructions, dangers, annoyances 'on the road')

29 subjects, 90min sessions per subject

Non interactive speech: giving directions, giving answers to mental arithmetic etc


Belfast Boredom database (Cowie et al. 2003)




12 subjects: 30 minutes each

Non interactive speech: naming objects on computer screen


XM2VTSDB multi-modal face database




295 subjects; video

High-quality colour images, 32 kHz 16-bit sound files, video sequences and a 3D model + profiles (one left-profile and one right-profile image per person per session, a total of 2,360 images); scripted: 4 sentences


ISLE project corpora (IST project IST-1999-10647)


+ gesture





Polzin (Polzin, 2000)

Audio-visual (though only the audio channel was used)

Anger, sadness, neutrality (other emotions as well, but in insufficient numbers to be used)

Acted: sentence length segments taken from acted movies

Unspecified number of speakers. Segment counts: 1,586 angry, 1,076 sad, 2,991 neutral



Banse and Scherer (Banse and Scherer 1996)

Audio- visual (visual info used to verify listener judgements of emotion)

Anger (hot), anger (cold), anxiety, boredom, contempt, disgust, elation, fear (panic), happiness, interest, pride, sadness, shame

Acted: actors were given scripted eliciting scenarios for each emotion, then asked to act out the scenario.

12 (6 male, 6 female)

Scripted: 2 semantically neutral sentences (nonsense sentences composed of phonemes from Indo-European languages)


Table 2: Speech databases


Columns: emotional content; emotion elicitation methods; nature of material


TALKAPILLAR (Beller, 2005)

Neutral, happiness, question, positive and negative surprise, anger, fear, disgust, indignation, sadness, boredom

Contextualised acting: actors asked to read semantically neutral sentences in range of emotions, but practised on emotionally loaded sentences beforehand to get in the right mood

1 actor reading 26 semantically neutral sentences for each emotion (each repeated 3 times at different activation levels: low, middle, high)

Non interactive and scripted  


Reading-Leeds database (Greasley et al., 1995;  Roach et al., 1998, Stibbard 2001)

Range of full blown emotions

Natural:  Unscripted interviews on radio/television in which speakers are asked by interviewers to relive emotionally intense experiences

Around 4½ hours of material

Interactive unscripted discourse


France et al. (France et al., 2000)

Depression, suicidal state, neutrality

Natural: therapy sessions & phone conversations. Post therapy evaluation sessions were also used to elicit speech for the control subjects

115 subjects: 48 females 67 males.

Female sample: 10 controls (therapists), 17 dysthymic, 21 major depressed

Male sample: 24 controls (therapists), 21 major depressed, 22 high-risk suicidal

Interactive unscripted discourse


Campbell CREST database, ongoing (Campbell 2002; see also Douglas-Cowie et al. 2003)

Wide range of emotional states and emotion-related attitudes

Natural: volunteers record their domestic and social spoken interactions for extended periods throughout the day

Target: 1,000 hrs over 5 years

Interactive unscripted discourse




Capital Bank Service and Stock Exchange Customer Service (as used by Devillers & Vasilescu 2004)

Mainly negative - fear, anger, stress

Natural: call center human-human interactions

Unspecified (still being labelled)

Interactive unscripted discourse


SYMPAFLY (as used by Batliner et al. 2004b)

 Joyful, neutral, emphatic, surprised, ironic, helpless, touchy, angry, panic

Human machine dialogue system

110 dialogues, 29,200 words (i.e. tokens, not vocabulary)

Naïve users book flights using machine dialogue system


DARPA Communicator corpus (as used by Ang et al. 2002)

See Walker et al. 2001

Frustration, annoyance

Human machine dialogue system

Extracts from recordings of simulated interactions with a call centre, average length about 2.75 words

13,187 utterances in total, of which 1,750 are emotional: 35 unequivocally frustrated, 125 predominantly frustrated, 405 unequivocally frustrated or annoyed, 1,185 predominantly frustrated or annoyed

Users called systems built by various sites and made air travel arrangements over the phone


AIBO (Erlangen database) (Batliner et al. 2004a)

Joyful, surprised, emphatic, helpless, touchy (irritated), angry, motherese, bored, reprimanding, neutral

Human machine: interaction with robot

German: 51 children, 51,393 words (i.e. tokens, not vocabulary); English (Birmingham): 30 children, 5,822 words (i.e. tokens, not vocabulary)

Task directions to robot


Fernandez et al. (Fernandez et al. 2000, 2003)


Induced: subjects give verbal responses to maths problems in simulated driving context

Data reported from 4 subjects 

Unscripted   numerical answers to mathematical questions


Tolkmitt and Scherer  (Tolkmitt and Scherer, 1986) 

Stress (both cognitive & emotional)

Induced: 2 types of stress (cognitive and emotional) were induced through slides. Cognitive stress induced through slides containing logical problems; emotional stress induced through slides of human bodies showing skin disease/accident injuries

60 (33 male, 27 female)

Partially scripted:  subjects made 3 vocal responses to each slide within a 40sec presentation period - a numerical answer followed by 2 short statements. The start of each was scripted and subjects filled in the blank at the end, e.g. ‘Die Antwort ist Alternative …’


Iriondo et al. (Iriondo et al., 2000) 

Desire, disgust, fury, fear, joy, surprise, sadness

Contextualised acting: subjects asked to read passages written with appropriate emotional content

8 subjects reading paragraph-length passages (20–40 sec each)

Non interactive and scripted


Mozziconacci (Mozziconacci, 1998). Note: database recorded at IPO for SOBU project 92EA.

Anger, boredom, fear, disgust, guilt, happiness, haughtiness, indignation, joy, rage, sadness, worry, neutrality

Contextualised acting: actors asked to read semantically neutral sentences in range of emotions, but practised on emotionally loaded sentences beforehand to get in the right mood

3 subjects reading 8 semantically neutral sentences  (each repeated 3 times)

Non interactive and scripted  


McGilloway (McGilloway, 1997; Cowie and Douglas-Cowie, 1996)

Anger, fear, happiness, sadness, neutrality

Contextualised acting: subjects asked to read passages written in appropriate emotional tone and content for each emotional state

40 subjects reading 5 passages each

Non interactive and scripted


Belfast Structured Database, an extension of the McGilloway database above (Douglas-Cowie et al. 2000)

Anger, fear, happiness, sadness, neutrality

Contextualised acting: subjects read 10 McGilloway-style passages and 10 other passages – scripted versions of naturally occurring emotion in the Belfast Naturalistic Database

50 subjects reading 20 passages

Non interactive and scripted


Danish Emotional Speech Database (Engberg et al., 1997)

Anger, happiness, sadness, surprise, neutrality


4 subjects read  2 words, 9 sentences & 2 passages in a range of emotions

Scripted  (material not emotionally coloured)


Groningen, ELRA corpus number S0020


Database only partially oriented to emotion


238 subjects reading 2 short texts



Berlin database (Kienast & Sendlmeier 2000; Paeschke & Sendlmeier 2000)

Anger (hot), boredom, disgust, fear (panic), happiness, sadness (sorrow), neutrality


10 subjects (5 male, 5 female) reading 10 sentences each

Scripted   (material selected to be semantically neutral)


Pereira (Pereira, 2000)

Anger (hot), anger (cold), happiness, sadness, neutrality


2 subjects reading 2 utterances each

Scripted (1 emotionally neutral sentence, one 4-digit number), each repeated


van Bezooijen (van Bezooijen, 1984)

Anger, contempt, disgust, fear, interest, joy, sadness, shame, surprise, neutrality


8 (4 male, 4 female) reading 4 phrases

Scripted  (semantically neutral phrases)


Abelin (Abelin 2000)

Anger, disgust, dominance, fear, joy, sadness, shyness, surprise


1 subject

Scripted  (semantically neutral phrase)


Yacoub et al. (2003)

(data from LDC)

15 emotions: neutral, hot anger, cold anger, happy, sadness, disgust, panic, anxiety, despair, elation, interest, shame, boredom, pride, contempt


2433 utterances from 8 actors



Table 3: Databases of Faces


Columns: emotional content; emotion elicitation methods; nature of material

The AR Face Database

Smile, anger, scream, neutral


154 subjects (82 male, 74 female)

26 pictures per person

1: neutral; 2: smile; 3: anger; 4: scream; 5: left light on; 6: right light on; 7: all side lights on; 8: wearing sunglasses; 9: sunglasses and left light on; 10: sunglasses and right light on; 11: wearing scarf; 12: scarf and left light on; 13: scarf and right light on; 14–26: second session (same conditions as 1–13)

CVL Face Database



114 subjects (108 male, 6 female)

7 pictures per person

Different angles, under uniform illumination, no flash and with projection screen in the background

The Psychological Image Collection at Stirling

Smile, surprise, disgust


Aberdeen: 116 subjects

Nottingham scans: 100

Nott-faces-original: 100

Stirling faces: 36

Contains 7 face databases of which 4 largest are: Aberdeen , Nottingham scans, Nott-faces-original, Stirling faces
Mainly frontal views, some profile, some differences in lighting and expression variation

The Japanese Female Facial Expression (JAFFE) Database

Sadness, happiness, surprise, anger, disgust, fear, neutral


10 subjects 

7 pictures per subject

6 emotion expressions + 1 neutral posed by 10 Japanese female models

CMU Pose, Illumination, and Expression (PIE) Database

Neutral, smile, blinking and talking

Posed for neutral, smile and blinking

2-sec video capture of talking per person

68 subjects

13 different poses, 43 different illumination conditions, and 4 different expressions

Indian Institute of Technology Kanpur Database

Sad, scream, anger, expanded cheeks and exclamation, eyes open-closed, wink


20 subjects 

Varying facial expressions, orientation and occlusion. Orientation ranges from 0° to 20° to both left and right, with similar angle variation for head tilting; head rotations both up and down are also included. All images are taken with and without glasses against a constant background; for occlusion, part of the face is kept hidden, and lighting variations are included.

The Yale Face Database

Sad, sleepy, surprised


15 subjects

One picture per different facial expression or configuration: center-light, w/glasses, happy, left-light, w/no glasses, normal, right-light, sad, sleepy, surprised, and wink

CMU Facial Expression Database (Cohn-Kanade)

Six of the displays were based on descriptions of prototypic emotions (i.e., joy, surprise, anger, fear, disgust, and sadness).


200 subjects

Subjects were instructed by an experimenter to perform a series of 23 facial displays that included single action units (e.g., AU 12, or lip corners pulled obliquely) and combinations of action units (e.g., AU 1+2, or inner and outer brows raised).  Subjects began and ended each display from a neutral face

Caltech Frontal Face DB


27 subjects

450 images in total

Different lighting, expressions, backgrounds

HumanScan BioID Face DB



23  subjects

Contains 20 manual markup points (numbered 0–19): 0 = right eye pupil; 1 = left eye pupil; 2 = right mouth corner; 3 = left mouth corner; 4 = outer end of right eyebrow; 5 = inner end of right eyebrow; 6 = inner end of left eyebrow; 7 = outer end of left eyebrow; 8 = right temple; 9 = outer corner of right eye; 10 = inner corner of right eye; 11 = inner corner of left eye; 12 = outer corner of left eye; 13 = left temple; 14 = tip of nose; 15 = right nostril; 16 = left nostril; 17 = centre point on outer edge of upper lip; 18 = centre point on outer edge of lower lip; 19 = tip of chin

Oulu University Physics-Based Face Database



125  subjects

All frontal images; 16 different camera calibrations and illuminations




20 subjects, 19-36 pictures per person

Range of poses from profile to frontal views

Olivetti Research



40 subjects, 10 pictures per person  

All frontal and slight tilt of head 

The Yale Face Database B



10 subjects

9 poses × 64 illumination conditions

AT&T (formerly called the ORL database)


Smiling / not smiling


  40 subjects

10 images per subject, varying lighting, glasses/no glasses, and aspects of facial expression broadly relevant to emotion – open/closed eyes, smiling/not smiling
