Unsupervised Mapping and Semantic User Localisation from First-Person Monocular Video

Tamas Suveges, Stephen McKenna (Lead / Corresponding author)

Research output: Contribution to journal › Article › peer-review


Abstract

We propose an unsupervised probabilistic framework for learning a human-centred representation of a person’s environment from first-person video. Specifically, non-geometric maps modelled as hierarchies of probabilistic place graphs and view graphs are learned. Place graphs model a user’s patterns of transition between physical locations, whereas view graphs capture an aspect of user behaviour within those locations. Furthermore, we describe an implementation in which the notion of place is divided into stations and the routes that interconnect them. Stations typically correspond to rooms or areas where a user spends time. Visits to stations are temporally segmented based on qualitative visual motion. We describe how to learn maps online in an unsupervised manner, and how to localise the user within these maps. We report experiments on two datasets, including a comparison of performance with and without view graphs, and demonstrate that online mapping outperforms offline clustering.
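The place graphs described above model a user's transitions between physical locations probabilistically. As a minimal illustrative sketch (not the authors' implementation, and omitting the view-graph level, hierarchy, and visual segmentation), one can maintain transition counts between station labels online and derive maximum-likelihood transition probabilities from them; all names here are hypothetical:

```python
from collections import defaultdict

class PlaceGraph:
    """Online maximum-likelihood transition model between discrete places.

    Hypothetical sketch: edges carry counts of observed transitions
    between station labels; probabilities are derived on demand, so the
    model can be updated online as new visits are segmented.
    """

    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))
        self.prev = None  # most recently visited station

    def observe(self, station):
        """Record a visit, updating the edge from the previous station."""
        if self.prev is not None and station != self.prev:
            self.counts[self.prev][station] += 1
        self.prev = station

    def transition_prob(self, src, dst):
        """Estimated P(dst | src) from observed transitions (0 if unseen)."""
        total = sum(self.counts[src].values())
        return self.counts[src][dst] / total if total else 0.0

# Example: a short sequence of segmented station visits.
g = PlaceGraph()
for s in ["kitchen", "hall", "office", "hall", "kitchen", "hall", "office"]:
    g.observe(s)
print(round(g.transition_prob("hall", "office"), 2))  # → 0.67
```

Such a transition model supports localisation in the obvious way: given the current station estimate, the outgoing edge probabilities act as a prior over where the user will be observed next.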

Original language: English
Article number: 110923
Journal: Pattern Recognition
Early online date: 22 Aug 2024
DOIs
Publication status: E-pub ahead of print - 22 Aug 2024

Keywords

  • Egocentric (first-person) vision
  • Unsupervised learning
  • Mapping and localisation

