Video sequence V3: dialog



The last video sequence we used to assess our behaviour model shows a person engaged in a dialogue. Because the person was speaking with someone else, the video includes talking, listening, laughing and so on, and therefore covers a range of expressions. The conversation was not scripted, so it remained spontaneous and natural.

This video was acquired with a JVC GR-DV2000 digital video camera and contains 88044 frames. Figure 4.13 shows some frames extracted from it. The full sequence is in the file examples/V3/V3_orig.m1v on the accompanying CD-ROM.

**Figure 4.13:** Frames extracted from the video V3.

We used 20000 frames from the video sequence V3 to train our behaviour model. Since the video was recorded at 25 frames per second, those 20000 frames correspond to 13 minutes and 20 seconds. A training sequence of this length is sufficient to train the behaviour model properly because it contains many repeated expressions. Most videos used in other work in this area are much shorter (a few seconds).
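The duration quoted above follows directly from the frame count and the frame rate. The short Python sketch below reproduces that arithmetic and applies it to the full 88044-frame sequence as well. The helper name `frames_to_hms` is purely illustrative, and the sketch assumes the whole recording runs at the stated 25 frames per second.

```python
FPS = 25  # frame rate stated in the text


def frames_to_hms(n_frames, fps=FPS):
    """Convert a frame count to (hours, minutes, seconds) at the given frame rate."""
    total_seconds = n_frames / fps
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return int(hours), int(minutes), seconds


training = frames_to_hms(20000)  # training subset of V3
full = frames_to_hms(88044)      # whole sequence V3 (assumed 25 fps throughout)

print("training:", training)
print("full sequence:", full)
print("fraction used for training:", 20000 / 88044)
```

For the training subset this gives 0 hours, 13 minutes and 20 seconds, matching the figure in the text.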

We marked up 168 frames of video V3 by hand, using 96 points as depicted in figure 4.14. The face was successfully tracked through most of the video sequence. The active appearance model failed to model the face correctly on a small part of the sequence, where the face was not entirely in the frame because the person had moved out of the camera's field of view.
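The text does not say how the badly tracked frames were identified. As a purely illustrative sketch, one simple sanity check is to flag any frame whose landmarks fall outside the image bounds. The frame size of 720×576 pixels, the function name `fully_in_frame` and the two synthetic landmark sets are all assumptions made for this example, not values taken from the experiments.

```python
import numpy as np

# Assumed frame size in pixels; the text does not give the actual resolution.
FRAME_WIDTH, FRAME_HEIGHT = 720, 576


def fully_in_frame(landmarks, width=FRAME_WIDTH, height=FRAME_HEIGHT):
    """Return True if every (x, y) landmark lies inside the image bounds."""
    x, y = landmarks[:, 0], landmarks[:, 1]
    return bool(np.all((x >= 0) & (x < width) & (y >= 0) & (y < height)))


# Two hypothetical frames with 96 landmarks each (synthetic coordinates).
centred = np.column_stack([np.linspace(300, 420, 96), np.linspace(200, 380, 96)])
drifting = centred + np.array([350.0, 0.0])  # face sliding past the right edge

print(fully_in_frame(centred), fully_in_frame(drifting))
```

A check of this kind only catches landmarks that land outside the image; it would not detect a tracker that stays inside the frame but fits the face poorly.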

**Figure 4.14:** Hand labelling of a frame from video V3.

The first mode of variation of the appearance model extracted from video V3 is shown in figure 4.15. The relative coordinates of the 96 points, together with the pose, the scale and the position of the face, have been reduced to 14 parameters for each frame.
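To make the idea of reducing each frame to a few parameters more concrete, here is a minimal sketch using principal component analysis, the technique appearance models are generally built on. It is not the procedure used for V3: the data are synthetic, only point coordinates are modelled (texture, pose, scale and position are left out), and the names `fit_pca`, `shapes` and `params`, as well as the choice of ±2 standard deviations for displaying a mode, are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for tracked data: one row per frame,
# 96 landmarks -> 192 (x, y) coordinates. Real data would come from the tracker.
n_frames, n_points, n_modes = 500, 96, 14
latent = rng.normal(size=(n_frames, n_modes)) * np.linspace(5.0, 0.5, n_modes)
basis = rng.normal(size=(n_modes, 2 * n_points))
noise = 0.01 * rng.normal(size=(n_frames, 2 * n_points))
shapes = 100.0 + latent @ basis + noise


def fit_pca(data, k):
    """Return the mean, the top-k modes (as rows) and their variances."""
    mean = data.mean(axis=0)
    # SVD of the centred data: rows of vt are the principal directions.
    _, s, vt = np.linalg.svd(data - mean, full_matrices=False)
    variances = s**2 / (data.shape[0] - 1)
    return mean, vt[:k], variances[:k]


mean, modes, variances = fit_pca(shapes, n_modes)

# Each frame is now described by 14 numbers instead of 192.
params = (shapes - mean) @ modes.T
reconstructed = mean + params @ modes

# First mode of variation: move along mode 0 by +/- 2 standard deviations.
sd1 = np.sqrt(variances[0])
mode1_minus = mean - 2.0 * sd1 * modes[0]
mode1_plus = mean + 2.0 * sd1 * modes[0]

print(params.shape, float(np.abs(shapes - reconstructed).max()))
```

Plotting `mode1_minus`, `mean` and `mode1_plus` side by side would give a picture in the spirit of a "first mode of variation" figure, although for shape only.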

**Figure 4.15:** First mode of variation of the appearance model extracted from video V3.

A sequence synthesised from those parameters can be seen in the video file examples/V3/V3_track.m1v on the accompanying CD-ROM. Frames extracted from this file are shown in figure 4.16.

**Figure 4.16:** Frames extracted from the video V3 after tracking.


franck 2006-10-01