Multimodal Egocentric Analysis of Focused Interactions

Sophia Bano, Tamas Suveges, Jianguo Zhang, Stephen McKenna (Lead / Corresponding author)

Research output: Contribution to journal › Article › peer-review

14 Citations (Scopus)
302 Downloads (Pure)


Continuous detection of social interactions from wearable sensor data streams has a range of potential applications in domains including health and social care, security, and assistive technology. We contribute an annotated, multimodal dataset capturing such interactions using video, audio, GPS and inertial sensing. We present methods for automatic detection and temporal segmentation of focused interactions using support vector machines and recurrent neural networks with features extracted from both audio and video streams. Focused interaction occurs when co-present individuals, having mutual focus of attention, interact by first establishing face-to-face engagement and direct conversation. We describe an evaluation protocol including framewise, extended framewise and event-based measures and provide empirical evidence that fusion of visual face track scores with audio voice activity scores provides an effective combination. The methods, contributed dataset and protocol together provide a benchmark for future research on this problem. The dataset is available at
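The abstract's key finding, that fusing visual face-track scores with audio voice-activity scores works well, can be illustrated with a minimal late-fusion sketch: weighted averaging of per-frame scores, thresholding, and grouping active frames into interaction events. The weight, threshold, and score inputs below are illustrative assumptions, not the paper's actual features or classifiers.

```python
import numpy as np

def fuse_and_segment(face_scores, vad_scores, weight=0.5, thresh=0.5):
    """Weighted late fusion of per-frame scores, then segmentation
    into contiguous events.

    face_scores, vad_scores: per-frame confidences in [0, 1]
    (hypothetical inputs; the paper uses SVM/RNN classifiers on
    audio-visual features).
    Returns a list of (start, end) frame intervals, end exclusive.
    """
    fused = weight * np.asarray(face_scores) + (1 - weight) * np.asarray(vad_scores)
    active = fused >= thresh
    events, start = [], None
    for i, a in enumerate(active):
        if a and start is None:
            start = i            # event begins
        elif not a and start is not None:
            events.append((start, i))  # event ends
            start = None
    if start is not None:        # close an event running to the end
        events.append((start, len(active)))
    return events
```

Event-based evaluation measures, as mentioned in the abstract, would then compare such predicted intervals against annotated ground-truth intervals.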
Original language: English
Pages (from-to): 37493-37505
Number of pages: 13
Journal: IEEE Access
Early online date: 25 Jun 2018
Publication status: Published - 25 Jun 2018


Keywords

  • Social interaction
  • egocentric sensing
  • multimodal analysis
  • temporal segmentation

ASJC Scopus subject areas

  • General Computer Science
  • General Materials Science
  • General Engineering


