The main aim of machine vision is to understand images the way a human would. By extension, it also aims to understand image sequences. In the first section of this chapter, we review how researchers have tackled the difficult problem of tracking faces in a video sequence. In the second section, we review how behaviour has been modelled so far, and the few attempts that have been made to model interactions between different actors in a video sequence.