Accompanying CD-ROM
The CD-ROM inside the back cover of this thesis contains several video sequences referenced in the text. They are referred to by their filenames on the CD-ROM:
tracked/V1_graph.m1v, video V1 along with a plot of the tracked parameters,
tracked/aam_orig.m1v, video comparing an original video sequence with the resynthesised frames from the tracked parameters,
examples/V1/V1_orig.m1v, the original video sequence V1,
examples/V1/V1_track.m1v, video V1 resynthesised from tracked parameters,
examples/V1/V1_arp.m1v, generated from video V1 using the autoregressive process,
examples/V1/V1_wor.m1v, generated from video V1 using our model without residuals (greedy algorithm),
examples/V1/V1_wr.m1v, generated from video V1 using our model and a linear residual model (greedy algorithm),
examples/V1/V1_ncwor.m1v, generated from video V1 using our model without residuals (normalised cuts algorithm),
examples/V1/V1_ncwr.m1v, generated from video V1 using our model and a linear residual model (normalised cuts algorithm),
examples/V2/V2_orig.m1v, the original video sequence V2,
examples/V2/V2_track.m1v, video V2 resynthesised from tracked parameters,
examples/V2/V2_arp.m1v, generated from video V2 using the autoregressive process,
examples/V2/V2_wor.m1v, generated from video V2 using our model without residuals (greedy algorithm),
examples/V2/V2_wr.m1v, generated from video V2 using our model and a linear residual model (greedy algorithm),
examples/V3/V3_orig.m1v, the original video sequence V3,
examples/V3/V3_track.m1v, video V3 resynthesised from tracked parameters,
examples/V3/V3_arp.m1v, generated from video V3 using the autoregressive process,
examples/V3/V3_wor.m1v, generated from video V3 using our model without residuals (greedy algorithm),
examples/V3/V3_wr.m1v, generated from video V3 using our model and a linear residual model (greedy algorithm);
exp/m1v/q01.m1v to exp/m1v/q52.m1v are the videos used for the psychophysical experiment (in mpeg format);
exp/gif/q01.gif to exp/gif/q52.gif are the videos used for the psychophysical experiment (in animated gif format).
index.htm.
Abstract
Statistical appearance models represent objects in images in terms of their shape and texture. Such models have been applied successfully in a wide range of applications. Nevertheless, the appearance model does not capture video sequences of animated deformable objects.
The aim of this thesis is to add a temporal dimension to the appearance model in order to properly represent movement in video sequences. We apply this extended model to the study of facial behaviour.
The method uses a statistical framework learnt from a training video sequence. The series of parameters extracted from the sequence is modelled by a set of pathlets in parameter space. A higher-level model learns how to organise the pathlets into meaningful sequences representing facial expressions or typical movements of the head.
A measure of the quality of generated video sequences is derived. This measure shows that our model outperforms an alternative based on autoregressive processes. A forced-choice psychophysical experiment confirms this conclusion.
Acknowledgements
I would like to thank my supervisor, Dr Tim Cootes, for his advice, encouragement, guidance and support over the past four years.
I would like to thank all the members of the Imaging Science and Biomedical Engineering group for providing such a good work environment.
I would like to thank my parents for supporting me.
Finally, I would like to thank all the people who volunteered for the psychophysical experiment set up to assess the framework presented in this thesis. I would particularly like to thank Lilian, Nicolas, Fabrice, Juana, Gilles, Vivek, Jun, Alexandre, Arnaud, Bruno, Laurent, Fernand, Marie-Jeanne, Paul, José, Kostas, Domitille, Xavier, Sylvie, David, Kolawole, Roy, John, Mike, Patrick and Panachit for their help and their patience.
Publications
Some of the work described in this thesis has also appeared in: