Multi-view learning-based data proliferator for boosting classification using highly imbalanced classes

Olfa Graa, Islem Rekik (Lead / Corresponding author)

Research output: Contribution to journal › Article

Abstract

Background: Multi-view data representation learning explores the relationship between the views and provides rich complementary information that can improve computer-aided diagnosis. Specifically, existing machine learning methods devised to automate neurological disorder diagnosis using brain data have provided new insights into how a particular disorder such as autism spectrum disorder (ASD) alters the brain construct. However, the performance of machine learning methods depends heavily on the size of the training samples from both classes. In a real-world clinical setting, such medical data is very expensive and challenging to collect, and it might (i) suffer from several limitations such as imbalanced classes and (ii) have a non-heterogeneous distribution when derived from multi-view brain representations.

New method: To the best of our knowledge, the problem of imbalanced and multi-view data classification remains unexplored in the field of network neuroscience. To fill this gap, we propose a Multi-View LEArning-based data Proliferator (MV-LEAP) that enables the classification of imbalanced multi-view representations. MV-LEAP comprises two key steps. First, a manifold learning-based proliferator, which generates synthetic data for each view, is developed to handle imbalanced data. Second, a multi-view manifold data alignment leveraging tensor canonical correlation analysis is proposed to map all original and proliferated (i.e., synthesized) views into a shared subspace where their distributions are aligned for the target classification task.
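
The first step can be illustrated with a minimal sketch. The abstract does not specify how the proliferator works, so the code below swaps in a simple SMOTE-style neighbor interpolation as a stand-in; it is not the authors' method. The function name `proliferate_views`, the parameter `k`, and the choice to find neighbors on the concatenated views (so that each synthetic subject stays paired across views) are all assumptions made for illustration. The second step, tensor canonical correlation analysis, is not shown.

```python
import numpy as np


def proliferate_views(views, n_new, k=3, seed=0):
    """Hypothetical SMOTE-style stand-in for a per-view data proliferator.

    views: list of arrays, each (n_subjects, n_features_v), with rows
           aligned by subject and all subjects from the minority class.
    Returns a list of arrays, each (n_new, n_features_v).
    """
    rng = np.random.default_rng(seed)
    # Assumption: neighbors are found on the concatenated views so that a
    # synthetic subject uses the same pair of real subjects in every view.
    stacked = np.hstack(views)
    n = stacked.shape[0]
    k = min(k, n - 1)

    # Pairwise Euclidean distances; a subject is never its own neighbor.
    dists = np.linalg.norm(stacked[:, None, :] - stacked[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)
    neighbours = np.argsort(dists, axis=1)[:, :k]

    # For each synthetic subject: a random real subject, one of its k
    # nearest neighbors, and an interpolation weight in [0, 1).
    seeds = rng.integers(0, n, size=n_new)
    picks = neighbours[seeds, rng.integers(0, k, size=n_new)]
    lam = rng.random((n_new, 1))

    # Interpolate each view with the same subject pair and weight.
    return [v[seeds] + lam * (v[picks] - v[seeds]) for v in views]


# Toy usage: 5 minority subjects observed in two views of different size.
toy_rng = np.random.default_rng(42)
minority = [toy_rng.normal(size=(5, 4)), toy_rng.normal(size=(5, 6))]
synthetic = proliferate_views(minority, n_new=15)
balanced = [np.vstack([v, s]) for v, s in zip(minority, synthetic)]
```

Each synthetic row is a convex combination of two real subjects, so it stays within the range of the observed minority data in every view. A pipeline following the abstract would then pass the original and synthetic views to an alignment step before classification.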

Results: We evaluated our method on multi-view ASD vs. normal control (NC) connectomic datasets with imbalanced classes.

Conclusion: Overall, MV-LEAP achieved the best classification results in comparison with baseline data synthesis methods.

Original language: English
Article number: 108344
Pages (from-to): 1-12
Number of pages: 12
Journal: Journal of Neuroscience Methods
Volume: 327
Early online date: 14 Aug 2019
Publication status: Published - 1 Nov 2019

Keywords

  • Imbalanced classification
  • Multi-view data
  • Manifold learning
  • Data proliferator
  • Brain network synthesis
  • Connectomic data distribution alignment
  • Tensor canonical correlation analysis
