


Motivation

Communication plays an important role in our society. It takes different forms, but face-to-face conversation remains the most important one. Human-computer interfaces should therefore consider mimicking face-to-face communication in order to achieve a natural interaction with a human user.

As pointed out by Raudsepp in [48], 7% of the force of a communication is verbal, 38% is vocal and 55% lies in body language. The verbal part of a conversation is its content. The vocal part is the paralanguage, that is the tone of voice, intonation, pauses and sighs. Finally, body language consists of posture, the distance maintained from the speaker, eye contact, gestures and facial expression. The importance of eye contact in a conversation suggests that a human-computer interface should display a face on the screen so that the user can make eye contact.

Gestures have been extensively studied by the machine vision community so that computers can understand humans through their natural means of expression. The results of these studies provide an elegant alternative to conventional computer input devices, especially for interaction in 3D environments.

More recently, facial expressions have been studied. Classifying or assessing facial expressions from a video sequence allows a computer to understand how the user feels and to react appropriately to his or her emotional state. Indeed, even though it is often difficult for a human to detect an expression on another person's face, the face reflects most of our emotions. Most studies in facial expression classification concentrate on extracting the six basic emotions from images or image sequences: happiness, sadness, surprise, fear, anger and disgust. A trained human can distinguish these emotions from facial cues with an average error of 13% [46].

Building a human-computer interface based on visual cues requires several stages. The first stage of an ideal human-computer interface is a tracking system able to locate the user's face whenever it is required. Tracking is a genuinely challenging problem because of the variability of the expressions a face can show. Furthermore, in order to provide useful information to the subsequent stages, the tracker has to be precise and robust. Facial hair, glasses, occlusions and sensor noise all make this task difficult.

The second stage of a human-computer interface is the analysis of the sequence of the user's face in order to extract information such as the facial expression or the direction in which the user is looking. This information can then be combined to deduce the state of the user and to decide how the computer should react. Finally, in a third stage, the computer has to synthesise a virtual face that appears to react in accordance with the decision made in the previous stage.
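These three stages can be viewed as a simple per-frame pipeline. The following Python sketch is purely illustrative: all function names, data fields and the trivial "mirror the user's expression" policy are assumptions made for exposition, not part of the system described here.

```python
from dataclasses import dataclass

# Hypothetical container for what the analysis stage extracts from one frame.
@dataclass
class FaceObservation:
    location: tuple    # (x, y) position of the tracked face in the image
    expression: str    # one of the six basic emotions
    gaze: tuple        # estimated gaze direction

def track_face(frame):
    """Stage 1: locate the user's face in the current frame (stubbed)."""
    return (120, 80)

def analyse_face(frame, location):
    """Stage 2: extract expression and gaze direction (stubbed)."""
    return "happiness", (0.0, 0.0)

def synthesise_response(observation):
    """Stage 3: decide what the virtual face should display.

    A trivial placeholder policy: mirror the user's expression.
    """
    return observation.expression

def process_frame(frame):
    """Run one frame through all three stages in order."""
    location = track_face(frame)
    expression, gaze = analyse_face(frame, location)
    observation = FaceObservation(location, expression, gaze)
    return synthesise_response(observation)
```

The sketch only fixes the interfaces between the stages; each stub stands in for a substantial subsystem (tracking, expression analysis, face synthesis) in a real implementation.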

Unfortunately, such an ideal human-computer interface is still in its infancy. Achieving such a complex task requires a good model of face sequences, and this is what we concentrate on.



franck 2006-10-16