Abstract
This paper explores modality-temporal mutual information for RGB-D action recognition. To learn time-varying information and multi-modal features jointly, we propose a novel deep bilinear learning framework. The framework is built from bilinear blocks, each consisting of two linear pooling layers that pool the input cube features along the modality and temporal directions separately. To capture rich modality-temporal information and facilitate deep bilinear learning, we present a new action feature, the modality-temporal cube, a tensor structure that characterizes RGB-D actions from a comprehensive perspective. The method is extensively tested on two public datasets under four different evaluation settings, and the results show that it outperforms state-of-the-art approaches.
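As a rough illustration of pooling a cube along the modality and temporal directions, the following minimal NumPy sketch applies one linear map per direction. It is not the authors' implementation. The function name `bilinear_block`, the weight names, the (modality, time, feature) axis order, and all sizes are assumptions made for this example, and the random weights stand in for parameters that would be learned in practice. Any nonlinearity, normalization, or stacking of blocks is omitted.

```python
import numpy as np


def bilinear_block(cube, w_modality, w_temporal):
    """Pool a (M, T, D) cube along its modality and temporal axes.

    Hypothetical shapes: w_modality is (M_out, M), w_temporal is (T_out, T).
    """
    # Modality pooling: linearly mix the M modality slices into M_out slices.
    pooled_m = np.einsum("pm,mtd->ptd", w_modality, cube)
    # Temporal pooling: linearly mix the T time steps into T_out steps.
    pooled_mt = np.einsum("qt,ptd->pqd", w_temporal, pooled_m)
    return pooled_mt


# Illustrative sizes only: 3 modalities, 8 time steps, 16-dim features.
rng = np.random.default_rng(0)
cube = rng.standard_normal((3, 8, 16))
w_m = rng.standard_normal((2, 3))  # placeholder for learned modality weights
w_t = rng.standard_normal((4, 8))  # placeholder for learned temporal weights

out = bilinear_block(cube, w_m, w_t)
print(out.shape)  # (2, 4, 16)
```

Because both maps are linear and act on different axes, the order in which they are applied does not change the result in this sketch. The feature axis is left untouched.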
Original language | English |
---|---|
Title of host publication | ECCV 2018 |
Subtitle of host publication | Computer Vision - ECCV 2018 |
Editors | Vittorio Ferrari, Cristian Sminchisescu, Martial Hebert, Yair Weiss |
Place of Publication | Switzerland |
Publisher | Springer |
Pages | 346-362 |
Number of pages | 17 |
Volume | 11211 |
ISBN (Electronic) | 9783030012342 |
ISBN (Print) | 9783030012335 |
Publication status | Published - 2018 |
Event | European Conference on Computer Vision 2018, Munich, Germany. Duration: 8 Sept 2018 → 14 Sept 2018. https://eccv2018.org/ |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 11211 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | European Conference on Computer Vision 2018 |
---|---|
Abbreviated title | ECCV 2018 |
Country/Territory | Germany |
City | Munich |
Period | 8/09/18 → 14/09/18 |
Keywords
- Cube
- Deep bilinear
- Feature learning
- RGB-D action
ASJC Scopus subject areas
- Theoretical Computer Science
- General Computer Science