This paper addresses the problem of localizing an accelerometer in the view of a stationary camera as a first step towards multi-modal activity recognition. The problem is challenging because accelerometers are visually occluded, they measure proper acceleration (including the effects of gravity), and their orientation is unknown and changes over time relative to the camera viewpoint. Accelerometers are localized by matching acceleration estimated along visual point trajectories to accelerometer data. Trajectories are constructed by point feature tracking (KLT) and by grid sampling from a dense flow field. We also construct 3D trajectories using visual depth information. The similarity between accelerometer data and a trajectory is computed by counting the number of frames in which the norms of the accelerations in both sequences exceed a threshold. For quantitative evaluation, we collected a challenging dataset consisting of video and accelerometer data of a person preparing a mixed salad with accelerometer-equipped kitchen utensils. Trajectories from dense optical flow yielded higher localization accuracy than trajectories from point feature tracking.
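The frame-counting similarity described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The function names (`acceleration_norms`, `count_similarity`, `localize`), the use of second finite differences to estimate visual acceleration, and the use of separate thresholds for the visual and sensor signals are all assumptions made for illustration. The sketch also assumes that the accelerometer norms have already been resampled to the video frame rate and aligned with the interior frames of each trajectory; it does not address gravity compensation.

```python
import math


def acceleration_norms(points, dt):
    """Norm of the second finite difference at each interior frame.

    `points` is a list of 2D or 3D coordinates, one per frame (assumed layout).
    """
    norms = []
    for prev, cur, nxt in zip(points, points[1:], points[2:]):
        acc = [(a - 2.0 * b + c) / (dt * dt) for a, b, c in zip(prev, cur, nxt)]
        norms.append(math.sqrt(sum(x * x for x in acc)))
    return norms


def count_similarity(visual_norms, sensor_norms, visual_thresh, sensor_thresh):
    """Count frames in which both acceleration norms exceed their threshold."""
    return sum(
        1
        for v, s in zip(visual_norms, sensor_norms)
        if v > visual_thresh and s > sensor_thresh
    )


def localize(trajectories, sensor_norms, dt, visual_thresh, sensor_thresh):
    """Return the index of the best-matching trajectory and all scores."""
    scores = [
        count_similarity(
            acceleration_norms(t, dt), sensor_norms, visual_thresh, sensor_thresh
        )
        for t in trajectories
    ]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Hypothetical toy data: one static point and one oscillating point.
still = [(0.0, 0.0)] * 6
moving = [(0.0, 0.0), (1.0, 0.0), (0.0, 0.0), (1.0, 0.0), (0.0, 0.0), (1.0, 0.0)]
sensor = [0.0, 3.0, 3.0, 0.0]  # one value per interior frame (assumed alignment)

best, scores = localize([still, moving], sensor, 1.0, 1.0, 1.0)
```

Because the score only compares norms against thresholds, it does not depend on the accelerometer's unknown orientation, which is one plausible reason for preferring it over a direct per-axis comparison.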
|Title of host publication||Proceedings of the 2012 9th Conference on Computer and Robot Vision, CRV 2012|
|Number of pages||8|
|Publication status||Published - 1 Jan 2012|
Multi-Modal Recognition of Manipulation Activities through Visual Accelerometer Tracking, Relational Histograms, and User-Adaptation
Author: Stein, S., 2014
Supervisor: McKenna, S. (Supervisor)
Student thesis: Doctoral Thesis › Doctor of Philosophy