The Vision and Media Lab
Human Pose Estimation using Motion Exemplars




In this work, we estimate the body configuration of a person in a monocular video. We use a motion correlation technique to measure motion similarity at various space-time locations between the input video and stored video templates. The correlation is computed at coarse-to-fine scales around the joint positions, which makes it easier to handle variation in size and motion between subjects. These observations are used to predict the conditional state distributions for both exemplars and joint positions. The graphical model that represents the relations between joints across a sequence of frames is too complicated for inference to be practical. To overcome this problem, we represent the body configuration at every frame with an exemplar. In the images below, the first row shows the input video and the second row shows the best-matching sequence of exemplars found in the training data.
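To make the coarse-to-fine correlation idea concrete, here is a minimal sketch in Python. It is not the correlation measure used in this work: it swaps in plain normalized cross-correlation of frame differences as a crude stand-in for motion similarity, and the function names, the 2x2 averaging pyramid, and the toy "moving bar" volumes are all assumptions made for illustration.

```python
from math import sqrt

def temporal_diff(volume):
    """Frame-to-frame intensity differences, a crude stand-in for motion."""
    return [
        [[b - a for a, b in zip(row0, row1)] for row0, row1 in zip(f0, f1)]
        for f0, f1 in zip(volume, volume[1:])
    ]

def downsample(volume):
    """Halve spatial resolution by averaging 2x2 blocks (assumes even sizes)."""
    out = []
    for frame in volume:
        h, w = len(frame), len(frame[0])
        out.append([
            [(frame[y][x] + frame[y][x + 1]
              + frame[y + 1][x] + frame[y + 1][x + 1]) / 4.0
             for x in range(0, w - 1, 2)]
            for y in range(0, h - 1, 2)
        ])
    return out

def flatten(volume):
    """Turn a space-time volume (frames of rows) into one flat vector."""
    return [v for frame in volume for row in frame for v in row]

def ncc(a, b):
    """Normalized cross-correlation of two equal-length vectors, in [-1, 1]."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    da = [x - ma for x in a]
    db = [x - mb for x in b]
    denom = sqrt(sum(x * x for x in da) * sum(x * x for x in db))
    if denom == 0:
        return 0.0  # no motion in one of the patches
    return sum(x * y for x, y in zip(da, db)) / denom

def motion_similarity(video_patch, template_patch, levels=2):
    """Return one similarity score per scale, coarsest scale first."""
    pyramid = [(video_patch, template_patch)]
    for _ in range(levels - 1):
        v, t = pyramid[-1]
        pyramid.append((downsample(v), downsample(t)))
    return [
        ncc(flatten(temporal_diff(v)), flatten(temporal_diff(t)))
        for v, t in reversed(pyramid)
    ]

def moving_bar(direction):
    """Toy 4-frame, 4x4 volume: a vertical bar moving right (+1) or left (-1)."""
    frames = []
    for t in range(4):
        col = t if direction == 1 else 3 - t
        frames.append([[1.0 if x == col else 0.0 for x in range(4)]
                       for _ in range(4)])
    return frames

same = motion_similarity(moving_bar(1), moving_bar(1))
opposite = motion_similarity(moving_bar(1), moving_bar(-1))
```

In this toy setup, a patch compared against a template with the same motion scores higher at every scale than one compared against a template moving the opposite way. The coarse scale tolerates small spatial misalignment, while the fine scale is more selective.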



Joint position estimation is then solved using Gibbs sampling and gradient ascent.
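As a rough illustration of how Gibbs sampling and gradient ascent can be combined, the sketch below samples joint positions over a coarse set of candidates and then refines the best sample with gradient ascent. Everything specific here is an assumption and not taken from this work: the 1-D joint coordinates, the quadratic score with a unary observation term and a pairwise limb-offset term, the values in OBS and LIMB, and the sample-then-refine ordering.

```python
import math
import random

OBS = [0.0, 1.2, 1.9]  # hypothetical noisy observations of three joints (1-D)
LIMB = 1.0             # hypothetical preferred offset between neighbouring joints

def log_score(x):
    """Unnormalized log-probability: observation fit plus limb-offset prior."""
    unary = sum(-(xi - oi) ** 2 / 2.0 for xi, oi in zip(x, OBS))
    pair = sum(-((x[i + 1] - x[i]) - LIMB) ** 2 / 2.0
               for i in range(len(x) - 1))
    return unary + pair

def gradient(x):
    """Analytic gradient of log_score with respect to each joint position."""
    g = [-(xi - oi) for xi, oi in zip(x, OBS)]
    for i in range(len(x) - 1):
        r = (x[i + 1] - x[i]) - LIMB
        g[i] += r
        g[i + 1] -= r
    return g

def gibbs(candidates, sweeps, rng):
    """Resample one joint at a time from its conditional over the candidates."""
    x = [rng.choice(candidates) for _ in OBS]
    best, best_score = list(x), log_score(x)
    for _ in range(sweeps):
        for i in range(len(x)):
            scores = []
            for c in candidates:
                x[i] = c
                scores.append(log_score(x))
            m = max(scores)  # subtract the max for numerical stability
            weights = [math.exp(s - m) for s in scores]
            x[i] = rng.choices(candidates, weights=weights)[0]
            s = log_score(x)
            if s > best_score:
                best, best_score = list(x), s
    return best

def gradient_ascent(x, step=0.1, iters=200):
    """Refine continuous joint positions by following the gradient uphill."""
    x = list(x)
    for _ in range(iters):
        x = [xi + step * gi for xi, gi in zip(x, gradient(x))]
    return x

rng = random.Random(0)
candidates = [i * 0.5 for i in range(-2, 7)]  # coarse grid from -1.0 to 3.0
coarse = gibbs(candidates, sweeps=50, rng=rng)
refined = gradient_ascent(coarse)
```

The sampling stage explores a coarse grid and is less likely to get stuck in a poor configuration, while the gradient stage moves the best sample off the grid to a nearby local maximum of the score.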


Sample code for running the method of Shechtman and Irani (CVPR 2005) is available.

Some videos of results are available: side view fast walk, side view incline, side view slow walk, and 45° view fast walk.

 Alireza Fathi and Greg Mori. Human Pose Estimation using Motion Exemplars. IEEE International Conference on Computer Vision, 2007. [pdf]