


Improvements to the tracker

There are several possible ways of improving this tracker. In its current form, the tracker does not use temporal information. The fact that two consecutive frames look similar can reduce the computational cost of the tracker: the face in the second frame can be found by searching a sub-region of the image, centered on the previously detected face. The size of this sub-region should be larger than the size of the face.
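This sub-region search can be sketched as follows (a minimal sketch; the box format and the `scale` margin are assumptions, not the thesis code):

```python
def search_window(prev_face, img_w, img_h, scale=1.5):
    """Sub-region to search in the next frame, centered on the last detection.

    prev_face: (x, y, w, h) of the previously detected face.
    scale: how much larger than the face the window is (assumed value).
    Returns (x0, y0, x1, y1), clamped to the image bounds.
    """
    x, y, w, h = prev_face
    cx, cy = x + w / 2, y + h / 2          # center of the previous face
    ww, wh = w * scale, h * scale          # enlarged window dimensions
    x0 = max(0, int(cx - ww / 2))
    y0 = max(0, int(cy - wh / 2))
    x1 = min(img_w, int(cx + ww / 2))
    y1 = min(img_h, int(cy + wh / 2))
    return x0, y0, x1, y1
```

Restricting the search to this window is what makes the local mode described below cheaper than a full-image search.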

We used a hysteresis cycle to incorporate this sub-region search into the tracker. The algorithm works as follows:

1)
Search for a face in the whole image using eye features. Compute the probability of the best match.
2)
If the probability of the best match is above a predefined threshold $t_h$, go to step 3. Otherwise, load the next frame and go back to step 1.
3)
Compute a bounding box around the face given the size of the aligned mean shape of the AAM.
4)
Scale this box by a predetermined factor $\lambda$.
5)
Load the next frame.
6)
Search for eye features in the new box and compute the probability of the best match.
7)
If the probability of the best match is below a predefined threshold $t_l$, load the next frame and go back to step 1. Otherwise, go back to step 3.

The tracker thus has two modes: a global search and a local search. It switches between them according to the probability of the best match found in the current frame. Figure 4.3 shows how the hysteresis cycle works in our case.

Figure 4.3: Hysteresis cycle used for locating the eyes. When the probability of the best match falls below $t_l$, a global search is performed; when it rises above $t_h$, a local search is performed.
\begin{figure}\begin{center}
\epsfbox{hysteresis.ps}
\end{center}
\end{figure}
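The seven steps above can be sketched as a small two-state machine (a sketch under assumptions: the search functions, the box format, and the stub probabilities are placeholders, not the actual tracker code):

```python
def scale_box(box, lam):
    """Enlarge a box (x0, y0, x1, y1) by a factor lam about its center (step 4)."""
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    hw, hh = (x1 - x0) * lam / 2, (y1 - y0) * lam / 2
    return (cx - hw, cy - hh, cx + hw, cy + hh)

def track(frames, global_search, local_search, t_h, t_l, lam=1.1):
    """Two-mode eye tracker with a hysteresis cycle.

    global_search(frame) -> (prob, box): best eye match over the whole image.
    local_search(frame, box) -> (prob, box): best eye match inside `box`.
    Yields, for each frame, the mode that was used and the box found.
    """
    mode, box = "global", None
    for frame in frames:
        used = mode
        if mode == "global":
            prob, box = global_search(frame)                      # step 1
            if prob > t_h:                                        # step 2
                mode = "local"       # confident match: go local next frame
        else:
            prob, box = local_search(frame, scale_box(box, lam))  # steps 3-6
            if prob < t_l:                                        # step 7
                mode = "global"      # match lost: fall back to global search
        yield used, box
```

Note the hysteresis: once in local mode, the tracker stays there until the match probability drops below $t_l$, which is lower than the $t_h$ needed to enter it, so borderline matches do not cause rapid mode flapping.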

Experiments have shown that a factor $\lambda=1.1$ is a good choice. If $\lambda$ is too small, the local search often fails and the global search has to find the eyes again, which slows the tracker down. If $\lambda$ is too large, the bounding box used to look for the eyes is too big, so the search is slower and the probability of false positives increases. The choice of $\lambda$ is therefore important.
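The cost side of this trade-off can be made concrete: the local search scans an area proportional to $\lambda^2$ times the face box (illustrative arithmetic only, not a measurement from the tracker):

```python
def relative_search_area(lam):
    """Area of the scaled search box relative to the face box itself.

    The local search scans roughly lam**2 times the face area:
    lam = 1.1 costs about 21% more than the face box alone, while
    lam = 1.5 would already more than double the scanned area.
    """
    return lam ** 2
```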

Unfortunately, the algorithm currently relies on constants that have to be determined manually. Indeed, $t_h$ and $t_l$ depend on many parameters, such as the lighting conditions or the distance between the face and the camera. It would also be useful to determine the size of the bounding box automatically. A model of the motion of faces acting under similar conditions may provide a good estimate of this bounding box.

Another improvement would be to choose a better classification algorithm for the detection of the eyes. A preliminary experiment with a support vector machine (SVM) [11] on eye data extracted from the training set has been carried out. The results suggest that detection can be improved considerably with this classification method.



franck 2006-10-16