Recognising Complex Activities with Histograms of Relative Tracklets

Sebastian Stein (Lead / Corresponding author), Stephen McKenna (Lead / Corresponding author)

Research output: Contribution to journal › Article


Abstract

One approach to the recognition of complex human activities is to use feature descriptors that encode visual interactions by describing properties of local visual features with respect to trajectories of tracked objects. We explore an example of such an approach in which dense tracklets are described relative to multiple reference trajectories, providing a rich representation of complex interactions between objects of which only a subset can be tracked. Specifically, we report experiments in which reference trajectories are provided by tracking inertial sensors in a food preparation scenario. Additionally, we provide baseline results for HOG, HOF and MBH, and combine these features with others for multi-modal recognition. The proposed histograms of relative tracklets (RETLETS) showed better activity recognition performance than dense tracklets, HOG, HOF, MBH, or their combination. Our comparative evaluation of features from accelerometers and video highlighted a performance gap between visual and accelerometer-based motion features and showed a substantial performance gain when combining features from these sensor modalities. A considerable further performance gain was observed in combination with RETLETS and reference tracklet features.
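
As a rough illustration of the idea in the abstract, the Python sketch below describes tracklets relative to reference trajectories and pools them into one histogram per reference. It is not the authors' implementation. The function names (`relative_tracklet`, `bin_index`, `relative_histogram`), the polar binning of mean offsets, the number of angle bins and the distance edges in pixels are all assumptions made for illustration, and every tracklet is assumed to be frame-aligned with each reference trajectory.

```python
import math


def relative_tracklet(tracklet, reference):
    """Express each tracklet point as an offset from the reference
    position in the same frame (assumes frame-aligned sequences)."""
    return [(x - rx, y - ry) for (x, y), (rx, ry) in zip(tracklet, reference)]


def bin_index(offset, n_angle_bins, dist_edges):
    """Map a 2D offset to a polar bin: direction crossed with distance band.
    This binning scheme is a hypothetical choice, not taken from the paper."""
    dx, dy = offset
    angle = math.atan2(dy, dx) % (2 * math.pi)
    a = min(int(angle / (2 * math.pi) * n_angle_bins), n_angle_bins - 1)
    dist = math.hypot(dx, dy)
    d = sum(dist >= edge for edge in dist_edges)  # index of the distance band
    return d * n_angle_bins + a


def relative_histogram(tracklets, references, n_angle_bins=8,
                       dist_edges=(50.0, 150.0)):
    """Build one normalised histogram per reference trajectory and
    concatenate them into a single feature vector."""
    n_bins = n_angle_bins * (len(dist_edges) + 1)
    feature = []
    for reference in references:
        counts = [0.0] * n_bins
        for tracklet in tracklets:
            rel = relative_tracklet(tracklet, reference)
            mean = (sum(p[0] for p in rel) / len(rel),
                    sum(p[1] for p in rel) / len(rel))
            counts[bin_index(mean, n_angle_bins, dist_edges)] += 1.0
        total = sum(counts) or 1.0  # avoid division by zero with no tracklets
        feature.extend(c / total for c in counts)
    return feature


# Toy data: two short tracklets and one stationary reference at the origin.
tracklets = [[(10, 0), (12, 0)], [(0, 200), (0, 202)]]
references = [[(0, 0), (0, 0)]]
hist = relative_histogram(tracklets, references)
```

With several references, for example one per tracked sensor, the per-reference histograms are simply concatenated, so the feature length grows linearly with the number of reference trajectories.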
Original language: English
Pages (from-to): 82-93
Number of pages: 19
Journal: Computer Vision and Image Understanding
Volume: 154
Early online date: 1 Sep 2016
DOIs: 10.1016/j.cviu.2016.08.012
Publication status: Published - Jan 2017

Fingerprint

Trajectories
Accelerometers
Sensors
Experiments

Keywords

  • activity recognition
  • relative tracklets
  • sensor fusion
  • food preparation

Cite this

@article{6a9ffa208e914df1a053029f2ee3a510,
title = "Recognising Complex Activities with Histograms of Relative Tracklets",
abstract = "One approach to the recognition of complex human activities is to use feature descriptors that encode visual interactions by describing properties of local visual features with respect to trajectories of tracked objects. We explore an example of such an approach in which dense tracklets are described relative to multiple reference trajectories, providing a rich representation of complex interactions between objects of which only a subset can be tracked. Specifically, we report experiments in which reference trajectories are provided by tracking inertial sensors in a food preparation scenario. Additionally, we provide baseline results for HOG, HOF and MBH, and combine these features with others for multi-modal recognition. The proposed histograms of relative tracklets (RETLETS) showed better activity recognition performance than dense tracklets, HOG, HOF, MBH, or their combination. Our comparative evaluation of features from accelerometers and video highlighted a performance gap between visual and accelerometer-based motion features and showed a substantial performance gain when combining features from these sensor modalities. A considerable further performance gain was observed in combination with RETLETS and reference tracklet features.",
keywords = "activity recognition, relative tracklets, sensor fusion, food preparation",
author = "Sebastian Stein and Stephen McKenna",
note = "Funding: RCUK grants EP/G066019/1 and EP/K037293/1.",
year = "2017",
month = "1",
doi = "10.1016/j.cviu.2016.08.012",
language = "English",
volume = "154",
pages = "82--93",
journal = "Computer Vision and Image Understanding",
issn = "1077-3142",
publisher = "Elsevier",

}

Recognising Complex Activities with Histograms of Relative Tracklets. / Stein, Sebastian (Lead / Corresponding author); McKenna, Stephen (Lead / Corresponding author).

In: Computer Vision and Image Understanding, Vol. 154, 01.2017, p. 82-93.


TY - JOUR

T1 - Recognising Complex Activities with Histograms of Relative Tracklets

AU - Stein, Sebastian

AU - McKenna, Stephen

N1 - Funding: RCUK grants EP/G066019/1 and EP/K037293/1.

PY - 2017/1

Y1 - 2017/1

N2 - One approach to the recognition of complex human activities is to use feature descriptors that encode visual interactions by describing properties of local visual features with respect to trajectories of tracked objects. We explore an example of such an approach in which dense tracklets are described relative to multiple reference trajectories, providing a rich representation of complex interactions between objects of which only a subset can be tracked. Specifically, we report experiments in which reference trajectories are provided by tracking inertial sensors in a food preparation scenario. Additionally, we provide baseline results for HOG, HOF and MBH, and combine these features with others for multi-modal recognition. The proposed histograms of relative tracklets (RETLETS) showed better activity recognition performance than dense tracklets, HOG, HOF, MBH, or their combination. Our comparative evaluation of features from accelerometers and video highlighted a performance gap between visual and accelerometer-based motion features and showed a substantial performance gain when combining features from these sensor modalities. A considerable further performance gain was observed in combination with RETLETS and reference tracklet features.

AB - One approach to the recognition of complex human activities is to use feature descriptors that encode visual interactions by describing properties of local visual features with respect to trajectories of tracked objects. We explore an example of such an approach in which dense tracklets are described relative to multiple reference trajectories, providing a rich representation of complex interactions between objects of which only a subset can be tracked. Specifically, we report experiments in which reference trajectories are provided by tracking inertial sensors in a food preparation scenario. Additionally, we provide baseline results for HOG, HOF and MBH, and combine these features with others for multi-modal recognition. The proposed histograms of relative tracklets (RETLETS) showed better activity recognition performance than dense tracklets, HOG, HOF, MBH, or their combination. Our comparative evaluation of features from accelerometers and video highlighted a performance gap between visual and accelerometer-based motion features and showed a substantial performance gain when combining features from these sensor modalities. A considerable further performance gain was observed in combination with RETLETS and reference tracklet features.

KW - activity recognition

KW - relative tracklets

KW - sensor fusion

KW - food preparation

U2 - 10.1016/j.cviu.2016.08.012

DO - 10.1016/j.cviu.2016.08.012

M3 - Article

VL - 154

SP - 82

EP - 93

JO - Computer Vision and Image Understanding

JF - Computer Vision and Image Understanding

SN - 1077-3142

ER -