We wish to compare the pathlets in space only. Differences in timing are modelled by the spatiotemporal model, so the similarity measure need only assess the difference in shape between two pathlets.
Points on different pathlets might have different speeds, so the points on two similar pathlets might not correspond to each other directly even though both pathlets describe the same curve in space. This problem of matching pathlets is similar to the problem of matching phonemes in speech recognition: a phoneme can be pronounced at different speeds, so the same point in time will not correspond to the same part of the phoneme.
Dynamic time warping is commonly used to match phonemes [50]. It is a sequence alignment algorithm that non-linearly warps the timing of two sequences to find the optimal match between them (see appendix A).
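To make the idea concrete, the following is a minimal sketch of the textbook dynamic time warping recurrence applied to two pathlets. It is illustrative only and may differ from the formulation in appendix A: the function name `dtw_distance`, the representation of a pathlet as a list of 2D points, and the use of Euclidean distance as the local cost are all assumptions made for this example.

```python
import math


def dtw_distance(a, b):
    """Accumulated cost of the best alignment between two point sequences.

    a, b: non-empty lists of (x, y) tuples (assumed pathlet representation).
    The local cost is the Euclidean distance between points (an assumption).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the best accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.hypot(a[i - 1][0] - b[j - 1][0], a[i - 1][1] - b[j - 1][1])
            # A point may match a new point on both sequences (diagonal step),
            # or one sequence may "wait" while the other advances.
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    return cost[n][m]


# The same straight line, traversed at a constant and at a varying speed.
steady = [(0, 0), (1, 0), (2, 0), (3, 0)]
uneven = [(0, 0), (0, 0), (1, 0), (2, 0), (2, 0), (3, 0)]
print(dtw_distance(steady, uneven))
```

In this example the two sequences have different lengths and their points do not correspond index by index, yet the warped alignment finds a zero-cost match because both trace the same curve in space.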