Information and Communication Technology Seminar, Vol. 1 No. 1, August 2005
MULTIMODAL-ELIZA PERCEIVES AND RESPONDS TO EMOTION
S. Fitrianie and L.J.M. Rothkrantz
Man-Machine-Interaction Group, Delft University of Technology
E-mail: {s.fitrianie, l.j.m.rothkrantz}@ewi.tudelft.nl

The research reported here is part of the Interactive Collaborative Information Systems (ICIS) project, supported by the Dutch Ministry of Economic Affairs, grant nr: BSIK03024.
ABSTRACT
A growing number of research programs aim at developing human-computer dialogues that are more like human-human dialogues. We developed a question answering system that can perceive and respond to user emotions. Based on Weizenbaum's famous Eliza, the system can communicate with human users using typed natural language. It is able to reply with text prompts and appropriate facial expressions. An experiment has been conducted to determine how many and what kind of emotional expressions humans produce during conversation.

Keywords: Weizenbaum's Eliza, human-computer dialogue, emotion
1. INTRODUCTION
Emotions play an important role in communication. They are communication and control systems within the brain that mobilize resources to accomplish the goals specified by our motives. Humans convey their emotions and thoughts through verbal and nonverbal behaviors synchronously. Composing linguistic content is probably the only method that can simultaneously convey a speaker's beliefs, intentions, and meta-cognitive information about mental state, along with the speaker's emotional state.
We are used to convey our thought through our conscious or unconscious choice of words. Some
words possess emotive meaning together with their descriptive meaning. The descriptive meaning of this
type of words along with a sentence structure plays a cognitive role in forming beliefs and understanding.
The instantaneous emotional state is directly linked with the displayed expression 0. Emotion expressions
have three major functions: 1 they contribute to the activation and regulation of emotion experiences; 2
they communicate internal states and intentions to others; and 3 they activate emotion in others, a
process that can help account for empathy and altruistic behaviour. The human face in particular
serves not only communicative functions, but they are also the primary channel to express emotion. Each
Each facial expression provides very different information. Seeing faces, interpreting their expressions, and understanding the linguistic content of speech are all part of our development and growth. Many researchers have shown that the capability of communicating with humans using both verbal and nonverbal behaviors makes the interaction more intimate and human-like 000. Using facial displays as a means to communicate has been found to provide natural and compelling computer interfaces 000. The challenge is that facial expressions do not occur randomly, but rather are synchronized to one's own speech or to the speech of others 00.
As a proof of concept, we developed a demonstrator of a multimodal question answering system based on the famous Eliza program 0. The system simulates human-human conversation using typed natural language. It is capable of reasoning about emotions in the natural language input. The system shows a facial expression for each user input as its stimulus response. Subsequently, it gives a natural language reply together with an appropriate facial expression to convey emotional content. Our developed system has a list of facial expressions that correspond to possible emotions.
2. NATURAL LANGUAGE PROCESSING
Like most question answering (QA) systems nowadays, Eliza worked by a simple pattern matching operation and substitution of keywords 0. It used two transformation rules that were associated with certain keywords: (1) a decomposition rule, which serves to decompose an input string according to a pattern; and (2) a reassembly rule, which serves to reassemble a reply sentence. The original approach had three problems 0: (1) lack of anaphoric analysis: it could not use the previous conversation to keep the continuity of the content and to store information about the user; (2) lack of ability to restrict the conversation to its topic; and (3) lack of ability to get the meaning beyond the sentence.
Wallace proposed to use an extended-XML script, called AIML, to control his QA system, A.L.I.C.E 0. AIML has two additional transformation rules: (1) a current conversation topic pattern rule; and (2) a history pattern rule that refers to the system's previous reply. In addition, using XML syntax, we can add tags to retrieve information about users from conversations and use it in subsequent dialogues. The matching operation searches for the best matching input pattern, looking first within the same conversation topic and the same history pattern. In this way, A.L.I.C.E has more possible reply sentences, based on their topic and history, than Eliza. Our developed QA system uses Wallace's pattern matching operation.
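To make the matching operation concrete, the following is a minimal Python sketch of Wallace-style matching as we read it from the description above; the class and function names are our own illustration, not the authors' implementation:

import re

def wildcard_match(pattern, text):
    # Decomposition rule: "*" in a pattern captures an arbitrary fragment.
    regex = "^" + re.escape(pattern).replace(r"\*", "(.*)") + "$"
    return re.match(regex, text.strip(), re.IGNORECASE)

class Category:
    def __init__(self, pattern, template, topic="*", that="*"):
        self.pattern = pattern    # input pattern
        self.template = template  # reassembly template
        self.topic = topic        # current conversation topic constraint
        self.that = that          # history constraint (system's previous reply)

    def match(self, text, topic, that):
        if not wildcard_match(self.topic, topic):
            return None
        if not wildcard_match(self.that, that):
            return None
        return wildcard_match(self.pattern, text)

def reply(categories, text, topic="", that=""):
    text = re.sub(r"[^\w\s]", "", text)  # drop punctuation before matching
    # Search the most specific categories first: constrained topic and
    # history before generic fallbacks.
    ranked = sorted(categories, key=lambda c: (c.topic == "*", c.that == "*"))
    for cat in ranked:
        m = cat.match(text, topic, that)
        if m:
            # Reassembly rule: substitute the captured fragments.
            return cat.template.format(*m.groups())
    return "Please go on."

cats = [Category("WHAT IS YOUR NAME", "My name is Eliza."),
        Category("I HATE *", "Why? Did I do something wrong?")]
print(reply(cats, "What is your name?"))  # -> My name is Eliza.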
3. ADDING NONVERBAL BEHAVIOUR
Our developed system is capable of extracting emotion indications, or emotion eliciting factors, from a dialog. The system reasons over the results to trigger one of the possible displayed expressions. As a reference, we have performed an experiment to determine the list of possible expressions applied by our QA system.
3.1. Dialog Processing
Our prototype extracts emotion-eliciting factors from a dialog using two approaches. First, the system analyzes the choice of words in a string. For this purpose, we developed an emotive lexicon dictionary. Currently, it consists of 347 emotion words merged from 000. Based on 0, the words were depicted into eight octants of valence-arousal (see Table 1). For some ambiguous emotion words, we used a thesaurus to figure out the closeness in semantic meaning of the words to other words within an octant. A parser matches the string against the dictionary and calculates a counter C for "pleasant" and "unpleasant" using the following equation:
$\forall\, l_i \in d_i:\; C_i^t = C_i^{t-1} + I_i \cdot s; \qquad \forall\, j \neq i:\; C_j^t = C_j^{t-1} - I_i$    (3)

where l is a lexicon word and d is the dictionary, i is the active pleasantness category, I_i is the word's arousal degree, s is a summation factor, and j ranges over [pleasant, unpleasant]. The system takes the counter with the highest value.
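As an illustration, a small Python sketch of this counter update follows; the word list, arousal degrees, and summation factor are invented example values, not the authors' actual 347-word lexicon:

LEXICON = {"love": ("pleasant", 0.8),
           "joyful": ("pleasant", 0.6),
           "hate": ("unpleasant", 0.9),
           "sad": ("unpleasant", 0.5)}
S = 1.0  # summation factor s (assumed value)

def update_counters(words, counters):
    for w in words:
        if w in LEXICON:
            i, arousal = LEXICON[w]        # octant i and arousal degree I_i
            counters[i] += arousal * S     # C_i(t) = C_i(t-1) + I_i * s
            for j in counters:             # C_j(t) = C_j(t-1) - I_i, for j != i
                if j != i:
                    counters[j] -= arousal
    # The system takes the counter with the highest value.
    return max(counters, key=counters.get)

counters = {"pleasant": 0.0, "unpleasant": 0.0}
print(update_counters("i hate you".split(), counters))  # -> unpleasant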
<category> <affect name="neutral"> <pattern>WHAT IS YOUR NAME</pattern>
<that></that> <template>
<setconcern>pleasant</setconcern> <setaffect>pleasant</setaffect> My
<set_topic>name</set_topic> is <bot name="name"/>.
</template> </affect> </category>
<topic name="NAME"> <category> <affect name="unpleasant">
<that>MY NAME IS</that> <pattern>YOUR</pattern>
<template> <random> <li> <setconcern>pleasant</setconcern>
I am sorry, but tell me your name. </li>
<li> <setconcern>unpleasant</setconcern> I am sorry, tell me what
happened. </li> </random> </template>
</affect> </category> ...

Figure 1. Example units in the AIML database.
Finally, the system extracts the dialog's emotional situation. For this purpose, we added two labels to the AIML scheme (see Figure 1): (1) a label to indicate the user's emotional situation ("affect"); and (2) a label to indicate the system's emotional situation ("concern"). These labels describe a type of valence (neutral, pleasant, or unpleasant) or a sign of a joke. With these additional tags, the input pattern matching operation searches not only within the same conversation topic and the same history pattern, but also within the same user's emotional situation. In this way, the tags also indicate the conversation's emotional situation.
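A sketch of how this extended lookup might be organized: categories are indexed by topic, history pattern, and the user's affect label, and the search tries the most specific key first. The index layout is our illustration, not the authors' data structure:

def find_template(index, topic, that, affect):
    # Most specific first: same topic + same history + same affect,
    # then progressively more generic fallbacks.
    for key in [(topic, that, affect),
                (topic, that, "*"),
                (topic, "*", "*"),
                ("*", "*", "*")]:
        if key in index:
            return index[key]
    return None

index = {("NAME", "MY NAME IS", "unpleasant"): "I am sorry, tell me what happened.",
         ("NAME", "MY NAME IS", "pleasant"): "I am sorry, but tell me your name."}
print(find_template(index, "NAME", "MY NAME IS", "unpleasant"))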
3.2. Emotion Expression Experiment
How many and what kind of displayed emotional expressions are used in conversation poses a non-trivial question. Many theorists and psychologists have tried to categorize emotion types, e.g. 000. An experiment has been performed to recognize the most expressive facial expressions used in conversations. The experiment also aimed to find out what kinds of objects, events, and actions triggered these expressions.

We recorded four dialogs of two participants. The participants were requested to perform dialogues about different topics and show as many expressions as possible. The video recordings were annotated. As a first step, three independent observers marked the onset and offset of each expression. In the next step, these expressions were labelled according to the context. The agreement rate between the observers in both steps was about 73%.
The experimental results indicated that our participants showed a neutral face most of the time. However, we managed to capture 40 different facial expressions in total; about 20-35 different expressions per participant in each dialog. The results also showed that the expressions depended not only on the choice of words but also on the context of the conversation. A word could mean different things according to the context of the conversation; as a result, the speaker or the listener might display different facial expressions.

Our experimental results were endorsed by an experiment conducted by Desmet 0. He found 41 displayed emotion expressions actually used to appraise a product (Table 1); our experimental results did not include "greedy". Based on 0, he depicted these expressions in two dimensions: degree of "pleasantness" (valence) and "activation" (arousal).
Table 1. Emotions in Eight Octants, modified from 0

No.  Valence-Arousal      Emotion Expressions
1.   Neutral-Excited      Curious, amazed, avaricious, stimulated, concentrated, astonished, eager
2.   Pleasant-Excited     Inspired, desiring, loving
3.   Pleasant-Average     Pleasantly surprised, fascinated, amused, admiring, sociable, yearning, joyful
4.   Pleasant-Calm        Satisfied, softened
5.   Neutral-Calm         Awaiting, deferent
6.   Unpleasant-Calm      Bored, sad, isolated, melancholy, sighing
7.   Unpleasant-Average   Disappointed, contempt, jealous, dissatisfied, disturbed, flabbergasted, cynical
8.   Unpleasant-Excited   Irritated, disgusted, indignant, unpleasantly surprised, frustrated, greedy, alarmed, hostile
3.3. Facial Expression Generation
Based on the findings in the experiment, in this work we adopted the work of 0. The currently developed QA system has 41 static facial expressions, depicted in the eight octants of valence-arousal in Table 1.
User:  What is your name?
Eliza: My name is Eliza.
User:  I hate you.
Eliza: Why? Did I do something wrong?
User:  Your name reminds me of my ex-girlfriend who cheated on me.
Eliza: I am sorry, but tell me your name.

Figure 2. A dialog fragment between a user and Eliza.
Figure 2 displays an example of a dialog fragment between our QA system, Eliza, and its user. When the system receives a user string input, it displays a facial expression as its stimulus response. Another facial expression is displayed to convey the system's reply sentence. The system uses a one-to-one correspondence between facial expressions and emotions. The following section explains how the system reasons about its own emotional state.
4. EMOTION REASONING
Figure 3. Emotion model.
Figure 3 shows the emotion model of our prototype QA system. The system was designed to
have an agreeable personality. We describe each component of the emotion model in the following.
Emotion Eliciting Factors Interpretation

Besides extracting emotion indications from a dialog (see Section 3.1), based on 0, the system also assesses whether its current goal is achieved, whether the situation upholds or violates its principles, and whether its preferences are gained. For this purpose, we defined the system's properties and rules that define its goals, principles, and preferences. For example:

If a user says bad words, then a principle is violated.
If a user was sad and now is happy, then a goal is achieved.
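A minimal Python sketch of how such appraisal rules could be encoded; the word list and the dialog-state fields are invented for illustration, not the authors' rule base:

BAD_WORDS = {"stupid", "idiot"}  # assumed example list of "bad words"

def appraise(state):
    # Derive goal/principle/preference signals from the dialog state.
    signals = {"goal": "neutral", "principle": "neutral", "preference": "neutral"}
    if BAD_WORDS & set(state["user_words"]):
        signals["principle"] = "violated"      # a user says bad words
    if (state["previous_affect"] == "unpleasant"
            and state["current_affect"] == "pleasant"):
        signals["goal"] = "achieved"           # user was sad and now is happy
    return signals

print(appraise({"user_words": ["you", "idiot"],
                "previous_affect": "neutral", "current_affect": "neutral"}))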
Stimulus Response

We defined rules for the system's stimulus response based on the emotion eliciting factors in the user's input and the system's current mood. An example of these rules is:

If input pleasantness C is pleasant-calm
and affect is not unpleasant and goal is achieved
and preference is neutral and principle is neutral
and system current mood is happy and system emotion activation is calm
then system response is satisfied
Cognitive Processing

The cognitive processing involves creating a reply sentence and a response that conveys the reply. To determine the response, we also defined rules based on the system's mood and the emotion eliciting factors in both the user's input and the system's reply. For example:

If input C is unpleasant-excited and affect is unpleasant
and goal is not achieved and preference is neutral
and principle is neutral and system mood is happy
and system emotion activation is calm
and reply C is unpleasant-excited and concern is unpleasant
then system response is alarmed
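Both the stimulus response rules and the cognitive processing rules can be read as conjunctions of conditions over the same set of facts. The following sketch shows one possible encoding, with the two example rules above as data; keys suffixed with "_not" express negated conditions (the encoding is ours, not the authors'):

RULES = [
    # Stimulus response rule from the example above.
    ({"input_C": "pleasant-calm", "affect_not": "unpleasant",
      "goal": "achieved", "preference": "neutral", "principle": "neutral",
      "mood": "happy", "activation": "calm"},
     "satisfied"),
    # Cognitive processing rule: adds conditions on the reply side.
    ({"input_C": "unpleasant-excited", "affect": "unpleasant",
      "goal": "not achieved", "preference": "neutral", "principle": "neutral",
      "mood": "happy", "activation": "calm",
      "reply_C": "unpleasant-excited", "concern": "unpleasant"},
     "alarmed"),
]

def holds(facts, key, value):
    if key.endswith("_not"):
        return facts.get(key[:-4]) != value  # negated condition
    return facts.get(key) == value

def select_response(facts):
    for conditions, response in RULES:
        if all(holds(facts, k, v) for k, v in conditions.items()):
            return response
    return "neutral"  # fallback when no rule fires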
Mood

To model the system's mood, an emotion that lasts longer, it is necessary to observe the intensity of the system's emotional state during the conversation. To simplify, our prototype uses six affective thermometers corresponding to Ekman's six universal emotions: happiness, sadness, anger, surprise, disgust, and fear 0. Their values change according to the result of the cognitive processing. If an expression is active, the system checks its correspondence with the universal emotions based on Table 2. It updates all thermometers T using the following equation:
$T_i^t = T_i^{t-1} + I_i \cdot s; \qquad \forall\, j \neq i:\; T_j^t = T_j^{t-1} - \mathrm{distance}[j, i]$
where i is the active universal emotion type, s is a summation factor, I_i is the emotion expression's arousal degree, and j ranges over all universal emotion types in Table 2. The distance between two universal emotions follows the work of Hendrix and Ruttkay (see Table 3) 0. The emotion type with the highest thermometer value is considered the system's current mood. The mood and its value, interpreted as the emotion's activation (calm, average, or excited), are used in both knowledge bases to reason about the system's emotional state.
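A sketch of this thermometer update, using the distances of Table 3; the arousal degree, summation factor, and activation thresholds are assumed example values, and since Table 3 gives no distances for fear, 0.0 is used as a default here:

DISTANCE = {("happiness", "surprise"): 3.195, ("happiness", "anger"): 2.637,
            ("happiness", "disgust"): 1.926, ("happiness", "sadness"): 2.554,
            ("surprise", "anger"): 3.436, ("surprise", "disgust"): 2.298,
            ("surprise", "sadness"): 2.084, ("anger", "disgust"): 1.506,
            ("anger", "sadness"): 1.645, ("disgust", "sadness"): 1.040}

def dist(a, b):
    if a == b:
        return 0.0
    # Table 3 is symmetric; fear is absent there, so default to 0.0.
    return DISTANCE.get((a, b), DISTANCE.get((b, a), 0.0))

def update_mood(T, active, arousal, s=1.0):
    T[active] += arousal * s            # T_i(t) = T_i(t-1) + I_i * s
    for j in T:
        if j != active:
            T[j] -= dist(j, active)     # T_j(t) = T_j(t-1) - distance[j, i]
    return max(T, key=T.get)            # highest thermometer = current mood

def activation(value):
    # Map a thermometer value to an activation level (thresholds assumed).
    return "calm" if value < 1.0 else "average" if value < 2.0 else "excited"

T = dict.fromkeys(["happiness", "sadness", "anger",
                   "surprise", "disgust", "fear"], 0.0)
mood = update_mood(T, "happiness", arousal=0.8)
print(mood, activation(T[mood]))  # -> happiness calm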
Table 2. Universal Emotions - Emotion Expressions

Happy:    Inspired, desiring, loving, fascinated, amused, admiring, sociable, yearning, joyful, satisfied, softened
Sad:      Disappointed, contempt, jealous, dissatisfied, disturbed, flabbergasted, cynical, bored, sad, isolated, melancholy, sighing
Surprise: Pleasantly surprised, amazed, astonished
Disgust:  Disgusted, greedy
Anger:    Irritated, indignant, hostile
Fear:     Unpleasantly surprised, frustrated, alarmed
Neutral:  Curious, avaricious, stimulated, concentrated, eager, awaiting, deferent
Table 3. Distance values between emotions 0

            Happiness  Surprise  Anger  Disgust  Sadness
Happiness       0       3.195    2.637   1.926    2.554
Surprise                  0      3.436   2.298    2.084
Anger                              0     1.506    1.645
Disgust                                    0      1.040
Sadness                                             0
5. CONCLUSION