Next: Conclusion Up: The psychophysical experiment Previous: Statistics   Index




Results of the psychophysical experiment

Of the 112 people who volunteered, only 43 completed all the questions. Technical problems and the length of the experiment deterred the rest. In the following we present an analysis based on the 43 complete surveys.

Figure 8.9 shows a graph of the time taken to answer the questions. Three types of response can be seen on this graph. They are separated by two dotted lines, which correspond to the end of the video played once and played twice.

Figure 8.9: Time taken to answer the questions in the experiment. One dotted line corresponds to the time taken by the video. The other one corresponds to twice that time.

Since most answers belong to the last two categories, it appears that the people who took the time to complete the experiment did so carefully.
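The classification of answers described above can be sketched as follows. This is only an illustration: it assumes that the three types of response correspond to the three regions delimited by the two dotted lines, and the function name, the category labels, the clip duration and the answer times are all hypothetical rather than taken from the experiment.

```python
from collections import Counter


def classify_response(answer_time: float, video_duration: float) -> str:
    """Return the region of the time graph in which an answer falls.

    Assumption: the regions are bounded by one and two video durations.
    """
    if answer_time < video_duration:
        return "before_end_of_first_viewing"
    if answer_time < 2 * video_duration:
        return "between_first_and_second_viewing"
    return "after_second_viewing"


# Hypothetical answer times (seconds) for a hypothetical 10-second clip.
example_times = [4.0, 12.5, 18.0, 25.0, 31.0]
counts = Counter(
    classify_response(t, video_duration=10.0) for t in example_times
)
```

Counting the answers per category in this way is one possible route to the observation that most answers fall in the last two categories.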

When comparing two video clips that were both extracted from the original sequence, people tend to choose the video on the right rather than the video on the left: the left video was selected 72 times and the right video 99 times. This is a significant bias towards the right (the probability being 0.9535).
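One way to check a side bias of this kind is an exact two-sided binomial test against a fair coin. The sketch below is an assumption about how such a figure could be obtained, not a reproduction of the procedure from the Statistics section; the function name is hypothetical. With 72 left and 99 right selections it should give a value of one minus the p-value close to, though not necessarily identical to, the reported probability.

```python
from math import comb


def two_sided_binomial_p(k: int, n: int) -> float:
    """Exact two-sided p-value for k successes in n fair coin flips."""
    # With p = 0.5 the distribution is symmetric, so the two-sided
    # p-value is twice the probability of the larger tail.
    k_hi = max(k, n - k)
    tail = sum(comb(n, i) for i in range(k_hi, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


left, right = 72, 99  # counts reported in the text
p_value = two_sided_binomial_p(right, left + right)
confidence = 1 - p_value  # comparable to the "probability" quoted above
```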

The bias towards one side is also present if we use all the questions. This bias is the main reason why we have kept only the answers from people who completed the whole experiment. The side on which the generated videos appear must be balanced to compensate for the bias introduced when a volunteer chooses at random, and only when the experiment is completed is each pair of models displayed an equal number of times on each side.
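The balancing of sides can be illustrated with a small sketch. It assumes a schedule in which every pair of models is shown the same number of times in each left/right order; the function name, the `repeats_per_side` parameter and the shuffling are hypothetical and are not taken from the actual experiment software.

```python
import random
from itertools import combinations

MODELS = ["Original", "WR", "WOR", "ARP"]


def balanced_schedule(models, repeats_per_side=1, seed=None):
    """Build (left, right) questions with each pair equally often per side."""
    questions = []
    for a, b in combinations(models, 2):
        for _ in range(repeats_per_side):
            questions.append((a, b))  # a on the left, b on the right
            questions.append((b, a))  # same pair with the sides swapped
    # Shuffle so that the order of questions does not reveal the pairing.
    random.Random(seed).shuffle(questions)
    return questions


schedule = balanced_schedule(MODELS, repeats_per_side=2, seed=0)
```

A survey that stops part-way through such a schedule has, in general, seen some pairs more often on one side than on the other, which is why incomplete surveys cannot be relied on to cancel the side bias.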

The results of the experiment are given in Table 8.4. The table shows the number of answers given for each pair of models, over all the videos and all the volunteers. For every model, people were able to distinguish the original video sequence from the generated ones, so there is still room to improve each of the models. This agrees with the results of Hack, who ran a similar experiment [41] and found that none of the models tested so far in his experiment could confound the volunteers.

The results of our experiment also show that our model performs better than the autoregressive process when the linear model of residuals is used to smooth the output. We cannot draw any conclusion for the other pairs of models, since their results are not significant (cases annotated with a $\star$ in Table 8.4).


Table 8.4: Summary of the answers to the psychophysical experiment. WR (respectively WOR) is our model with (respectively without) a linear residual model. ARP is the autoregressive process. The results are reported for the whole set of questions. Each cell gives the number of answers that selected the model in the column as more realistic than the model in the row. A star ($\star$) marks non-significant results.
|          | Original  | WR                | WOR               | ARP               |
|----------|-----------|-------------------|-------------------|-------------------|
| Original |           | 121 (35%)         | 125 (36%)         | 86 (25%)          |
| WR       | 223 (65%) |                   | 168 (49%) $\star$ | 124 (36%)         |
| WOR      | 219 (64%) | 176 (51%) $\star$ |                   | 158 (46%) $\star$ |
| ARP      | 258 (75%) | 220 (64%)         | 186 (54%) $\star$ |                   |
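The significance markers in Table 8.4 can be checked approximately with a two-sided test of each pair against a fair coin. The sketch below uses a continuity-corrected normal approximation and a 5% threshold; both are assumptions made for illustration, since the test actually used is the one defined in the Statistics section. The function and variable names are hypothetical; the counts are copied from the table.

```python
from math import erf, sqrt


def two_sided_p_normal(wins_a: int, wins_b: int) -> float:
    """Two-sided p-value for wins_a vs wins_b under a fair-coin null."""
    n = wins_a + wins_b
    # Continuity-corrected z statistic for a binomial with p = 0.5.
    z = max(0.0, abs(wins_a - n / 2) - 0.5) / sqrt(n / 4)
    phi = 0.5 * (1 + erf(z / sqrt(2)))  # standard normal CDF at z
    return 2 * (1 - phi)


# (model A, model B): (answers preferring A, answers preferring B).
TABLE_8_4 = {
    ("Original", "WR"): (223, 121),
    ("Original", "WOR"): (219, 125),
    ("Original", "ARP"): (258, 86),
    ("WR", "WOR"): (176, 168),
    ("WR", "ARP"): (220, 124),
    ("WOR", "ARP"): (186, 158),
}

ALPHA = 0.05  # assumed significance level
significant = {
    pair: two_sided_p_normal(*counts) < ALPHA
    for pair, counts in TABLE_8_4.items()
}
```

Under these assumptions, the pairs flagged as not significant are WR against WOR and WOR against ARP, which matches the starred cells of the table.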


Table 8.5 shows the results for video V2 (expressions). The results are similar. In addition, for video V2, our model is significantly better than the autoregressive process even when we do not use the linear model of residuals.


Table 8.5: Summary of the answers to the psychophysical experiment for video V2. WR (respectively WOR) is our model with (respectively without) a linear residual model. ARP is the autoregressive process. Each cell gives the number of answers that selected the model in the column as more realistic than the model in the row. A star ($\star$) marks non-significant results.
|          | Original  | WR               | WOR              | ARP      |
|----------|-----------|------------------|------------------|----------|
| Original |           | 58 (34%)         | 66 (38%)         | 25 (15%) |
| WR       | 114 (66%) |                  | 91 (53%) $\star$ | 44 (26%) |
| WOR      | 106 (62%) | 81 (47%) $\star$ |                  | 69 (40%) |
| ARP      | 147 (85%) | 128 (75%)        | 103 (60%)        |          |


Table 8.6 shows the results for video V3 (dialog). This time, the volunteers only succeeded in telling the original videos apart from the generated ones. There are no significant differences between any other pairs of models. However, the results suggest that our model does not perform worse than the autoregressive process when we use the linear model of residuals.


Table 8.6: Summary of the answers to the psychophysical experiment for video V3. WR (respectively WOR) is our model with (respectively without) a linear residual model. ARP is the autoregressive process. Each cell gives the number of answers that selected the model in the column as more realistic than the model in the row. A star ($\star$) marks non-significant results.
|          | Original  | WR               | WOR              | ARP              |
|----------|-----------|------------------|------------------|------------------|
| Original |           | 63 (37%)         | 59 (34%)         | 61 (35%)         |
| WR       | 109 (64%) |                  | 77 (45%) $\star$ | 80 (47%) $\star$ |
| WOR      | 113 (66%) | 95 (55%) $\star$ |                  | 89 (52%) $\star$ |
| ARP      | 111 (65%) | 92 (53%) $\star$ | 83 (48%) $\star$ |                  |
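As a consistency check on the three tables, the counts for videos V2 and V3 can be added cell by cell and compared with the overall counts of Table 8.4. The sketch below copies the counts from the tables; the variable names are hypothetical, and the fact that the two per-video tables add up to the overall table is an observation drawn from the numbers rather than a statement made in the text.

```python
MODELS = ["Original", "WR", "WOR", "ARP"]

# Rows and columns follow MODELS; None marks the empty diagonal.
ALL_VIDEOS = [  # Table 8.4
    [None, 121, 125, 86],
    [223, None, 168, 124],
    [219, 176, None, 158],
    [258, 220, 186, None],
]
VIDEO_V2 = [  # Table 8.5
    [None, 58, 66, 25],
    [114, None, 91, 44],
    [106, 81, None, 69],
    [147, 128, 103, None],
]
VIDEO_V3 = [  # Table 8.6
    [None, 63, 59, 61],
    [109, None, 77, 80],
    [113, 95, None, 89],
    [111, 92, 83, None],
]

off_diagonal = [(i, j) for i in range(4) for j in range(4) if i != j]

# Each overall cell should be the sum of the two per-video cells.
sums_match = all(
    VIDEO_V2[i][j] + VIDEO_V3[i][j] == ALL_VIDEOS[i][j]
    for i, j in off_diagonal
)

# Total number of answers collected for each unordered pair of models.
pair_totals = {
    (MODELS[i], MODELS[j]): ALL_VIDEOS[i][j] + ALL_VIDEOS[j][i]
    for i, j in off_diagonal
    if i < j
}
```

Every pair receives the same total number of answers, which is what a balanced design over the 43 complete surveys would produce.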




franck 2006-10-01