Machine learning and data mining frameworks for predicting drug response in cancer

An overview and a novel in silico screening process based on association rule mining

Konstantinos Vougas, Theodore Sakelaropoulos, Athanassios Kotsinas, George-Romanos P Foukas, Andreas Ntargaras, Filippos Koinis, Alexander Polyzos, Vassilis Myrianthopoulos, Hua Zhou, Sonali Narang, Vassilis Georgoulias, Leonidas Alexopoulos, Iannis Aifantis, Paul A Townsend, Petros Sfikakis, Rebecca Fitzgerald, Dimitris Thanos, Jiri Bartek, Russell Petty, Aristotelis Tsirigos (Lead / Corresponding author), Vassilis G. Gorgoulis

Research output: Contribution to journal > Review article

Abstract

A major challenge in cancer treatment is predicting the clinical response to anti-cancer drugs on a personalized basis. The success of such a task largely depends on the ability to develop computational resources that integrate big "omic" data into effective drug-response models. Machine learning is both an expanding and an evolving computational field that holds promise to cover such needs. Here we provide a focused overview of: 1) the various supervised and unsupervised algorithms used specifically in drug response prediction applications, 2) the strategies employed to develop these algorithms into applicable models, 3) data resources that are fed into these frameworks and 4) pitfalls and challenges in maximizing model performance. In this context we also describe a novel in silico screening process, based on Association Rule Mining, for identifying genes as candidate drivers of drug response and compare it with relevant data mining frameworks, for which we generated a web application freely available at: https://compbio.nyumc.org/drugs/. This pipeline explores large sample spaces with high efficiency, while being able to detect low-frequency events and evaluate statistical significance even in the multidimensional space, presenting the results in the form of easily interpretable rules. We conclude with future prospects and challenges of applying machine-learning-based drug response prediction in precision medicine.
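To make the idea of rule-based screening concrete, the following is a minimal sketch of association rule mining on binary features, written in Python. It is not the authors' pipeline: the toy cell-line data, the feature names (`GENE_A_mut`, `GENE_B_mut`, `DRUG_X_sensitive`), the function names and the thresholds are all hypothetical, and the statistical-significance evaluation mentioned in the abstract is omitted. It only shows how support, confidence and lift can rank rules of the form "gene state => drug response".

```python
from itertools import combinations

# Hypothetical toy data: each set lists the binary features observed in one cell line.
CELL_LINES = [
    {"GENE_A_mut", "DRUG_X_sensitive"},
    {"GENE_A_mut", "DRUG_X_sensitive"},
    {"GENE_A_mut", "GENE_B_mut", "DRUG_X_sensitive"},
    {"GENE_B_mut"},
    {"GENE_B_mut"},
    {"GENE_A_mut"},
    set(),
    {"DRUG_X_sensitive"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def mine_rules(transactions, consequent, min_support=0.1,
               min_confidence=0.6, max_len=2):
    """Enumerate rules `antecedent => consequent` and score them.

    Returns tuples (antecedent, consequent, support, confidence, lift),
    sorted by decreasing lift. Thresholds are illustrative assumptions.
    """
    items = sorted(set().union(*transactions) - {consequent})
    consequent_support = support({consequent}, transactions)
    rules = []
    for size in range(1, max_len + 1):
        for antecedent in combinations(items, size):
            antecedent_set = set(antecedent)
            s_antecedent = support(antecedent_set, transactions)
            s_rule = support(antecedent_set | {consequent}, transactions)
            # Skip antecedents that never occur and rules that are too rare.
            if s_antecedent == 0 or s_rule < min_support:
                continue
            confidence = s_rule / s_antecedent
            if confidence < min_confidence:
                continue
            # Lift > 1 means the antecedent enriches for the consequent.
            lift = confidence / consequent_support
            rules.append((antecedent, consequent, s_rule, confidence, lift))
    return sorted(rules, key=lambda rule: rule[4], reverse=True)


if __name__ == "__main__":
    for antecedent, consequent, s, c, l in mine_rules(CELL_LINES, "DRUG_X_sensitive"):
        print(f"{' & '.join(antecedent)} => {consequent}: "
              f"support={s:.3f}, confidence={c:.2f}, lift={l:.2f}")
```

A low `min_support` is what lets a sketch like this surface rare events, at the cost of more candidate rules; a real screen would add a significance test and multiple-testing correction before reporting any rule.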

Original language: English
Article number: 107395
Number of pages: 28
Journal: Pharmacology & Therapeutics
DOIs: https://doi.org/10.1016/j.pharmthera.2019.107395
Publication status: E-pub ahead of print - 30 Jul 2019

Fingerprint

Data Mining
Computer Simulation
Pharmaceutical Preparations
Neoplasms
Precision Medicine
Machine Learning
Genes

Keywords

  • Drug Response Prediction
  • Precision Medicine
  • Data mining
  • Machine Learning
  • Association Rule Mining

Cite this

Vougas, Konstantinos ; Sakelaropoulos, Theodore ; Kotsinas, Athanassios ; Foukas, George-Romanos P ; Ntargaras, Andreas ; Koinis, Filippos ; Polyzos, Alexander ; Myrianthopoulos, Vassilis ; Zhou, Hua ; Narang, Sonali ; Georgoulias, Vassilis ; Alexopoulos, Leonidas ; Aifantis, Iannis ; Townsend, Paul A ; Sfikakis, Petros ; Fitzgerald, Rebecca ; Thanos, Dimitris ; Bartek, Jiri ; Petty, Russell ; Tsirigos, Aristotelis ; Gorgoulis, Vassilis G. / Machine learning and data mining frameworks for predicting drug response in cancer : An overview and a novel in silico screening process based on association rule mining. In: Pharmacology & Therapeutics. 2019.
@article{f3ca70aada53424fa91834fd589254c2,
title = "Machine learning and data mining frameworks for predicting drug response in cancer: An overview and a novel in silico screening process based on association rule mining",
abstract = "A major challenge in cancer treatment is predicting the clinical response to anti-cancer drugs on a personalized basis. The success of such a task largely depends on the ability to develop computational resources that integrate big {"}omic{"} data into effective drug-response models. Machine learning is both an expanding and an evolving computational field that holds promise to cover such needs. Here we provide a focused overview of: 1) the various supervised and unsupervised algorithms used specifically in drug response prediction applications, 2) the strategies employed to develop these algorithms into applicable models, 3) data resources that are fed into these frameworks and 4) pitfalls and challenges in maximizing model performance. In this context we also describe a novel in silico screening process, based on Association Rule Mining, for identifying genes as candidate drivers of drug response and compare it with relevant data mining frameworks, for which we generated a web application freely available at: https://compbio.nyumc.org/drugs/. This pipeline explores large sample spaces with high efficiency, while being able to detect low-frequency events and evaluate statistical significance even in the multidimensional space, presenting the results in the form of easily interpretable rules. We conclude with future prospects and challenges of applying machine-learning-based drug response prediction in precision medicine.",
keywords = "Drug Response Prediction, Precision Medicine, Data mining, Machine Learning, Association Rule Mining",
author = "Konstantinos Vougas and Theodore Sakelaropoulos and Athanassios Kotsinas and Foukas, {George-Romanos P} and Andreas Ntargaras and Filippos Koinis and Alexander Polyzos and Vassilis Myrianthopoulos and Hua Zhou and Sonali Narang and Vassilis Georgoulias and Leonidas Alexopoulos and Iannis Aifantis and Townsend, {Paul A} and Petros Sfikakis and Rebecca Fitzgerald and Dimitris Thanos and Jiri Bartek and Russell Petty and Aristotelis Tsirigos and Gorgoulis, {Vassilis G.}",
note = "Financial support was from the European Union's Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No. 722729 (SYNTRAIN); the Welfare Foundation for Social \& Cultural Sciences (KIKPE), Greece; Pentagon Biotechnology Ltd, UK; DeepMed IO Ltd, UK and NKUA-SARG grants No 70/3/9816, 70/3/12128. Dr. Tsirigos and the NYU Applied Bioinformatics Laboratories (ABL) are partially supported by the Cancer Center Support Grant P30CA016087 at the Laura and Isaac Perlmutter Cancer Center (A.T.).",
year = "2019",
month = "7",
day = "30",
doi = "10.1016/j.pharmthera.2019.107395",
language = "English",
journal = "Pharmacology \& Therapeutics",
issn = "0163-7258",
publisher = "Elsevier",

}

Vougas, K, Sakelaropoulos, T, Kotsinas, A, Foukas, G-RP, Ntargaras, A, Koinis, F, Polyzos, A, Myrianthopoulos, V, Zhou, H, Narang, S, Georgoulias, V, Alexopoulos, L, Aifantis, I, Townsend, PA, Sfikakis, P, Fitzgerald, R, Thanos, D, Bartek, J, Petty, R, Tsirigos, A & Gorgoulis, VG 2019, 'Machine learning and data mining frameworks for predicting drug response in cancer: An overview and a novel in silico screening process based on association rule mining', Pharmacology & Therapeutics. https://doi.org/10.1016/j.pharmthera.2019.107395

Machine learning and data mining frameworks for predicting drug response in cancer : An overview and a novel in silico screening process based on association rule mining. / Vougas, Konstantinos; Sakelaropoulos, Theodore; Kotsinas, Athanassios; Foukas, George-Romanos P; Ntargaras, Andreas; Koinis, Filippos; Polyzos, Alexander; Myrianthopoulos, Vassilis; Zhou, Hua; Narang, Sonali; Georgoulias, Vassilis; Alexopoulos, Leonidas; Aifantis, Iannis; Townsend, Paul A; Sfikakis, Petros; Fitzgerald, Rebecca; Thanos, Dimitris; Bartek, Jiri; Petty, Russell; Tsirigos, Aristotelis (Lead / Corresponding author); Gorgoulis, Vassilis G.

In: Pharmacology & Therapeutics, 30.07.2019.

Research output: Contribution to journal > Review article

TY - JOUR

T1 - Machine learning and data mining frameworks for predicting drug response in cancer

T2 - An overview and a novel in silico screening process based on association rule mining

AU - Vougas, Konstantinos

AU - Sakelaropoulos, Theodore

AU - Kotsinas, Athanassios

AU - Foukas, George-Romanos P

AU - Ntargaras, Andreas

AU - Koinis, Filippos

AU - Polyzos, Alexander

AU - Myrianthopoulos, Vassilis

AU - Zhou, Hua

AU - Narang, Sonali

AU - Georgoulias, Vassilis

AU - Alexopoulos, Leonidas

AU - Aifantis, Iannis

AU - Townsend, Paul A

AU - Sfikakis, Petros

AU - Fitzgerald, Rebecca

AU - Thanos, Dimitris

AU - Bartek, Jiri

AU - Petty, Russell

AU - Tsirigos, Aristotelis

AU - Gorgoulis, Vassilis G.

N1 - Financial support was from the European Union's Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No. 722729 (SYNTRAIN); the Welfare Foundation for Social & Cultural Sciences (KIKPE), Greece; Pentagon Biotechnology Ltd, UK; DeepMed IO Ltd, UK and NKUA-SARG grants No 70/3/9816, 70/3/12128. Dr. Tsirigos and the NYU Applied Bioinformatics Laboratories (ABL) are partially supported by the Cancer Center Support Grant P30CA016087 at the Laura and Isaac Perlmutter Cancer Center (A.T.).

PY - 2019/7/30

Y1 - 2019/7/30

N2 - A major challenge in cancer treatment is predicting the clinical response to anti-cancer drugs on a personalized basis. The success of such a task largely depends on the ability to develop computational resources that integrate big "omic" data into effective drug-response models. Machine learning is both an expanding and an evolving computational field that holds promise to cover such needs. Here we provide a focused overview of: 1) the various supervised and unsupervised algorithms used specifically in drug response prediction applications, 2) the strategies employed to develop these algorithms into applicable models, 3) data resources that are fed into these frameworks and 4) pitfalls and challenges in maximizing model performance. In this context we also describe a novel in silico screening process, based on Association Rule Mining, for identifying genes as candidate drivers of drug response and compare it with relevant data mining frameworks, for which we generated a web application freely available at: https://compbio.nyumc.org/drugs/. This pipeline explores large sample spaces with high efficiency, while being able to detect low-frequency events and evaluate statistical significance even in the multidimensional space, presenting the results in the form of easily interpretable rules. We conclude with future prospects and challenges of applying machine-learning-based drug response prediction in precision medicine.

AB - A major challenge in cancer treatment is predicting the clinical response to anti-cancer drugs on a personalized basis. The success of such a task largely depends on the ability to develop computational resources that integrate big "omic" data into effective drug-response models. Machine learning is both an expanding and an evolving computational field that holds promise to cover such needs. Here we provide a focused overview of: 1) the various supervised and unsupervised algorithms used specifically in drug response prediction applications, 2) the strategies employed to develop these algorithms into applicable models, 3) data resources that are fed into these frameworks and 4) pitfalls and challenges in maximizing model performance. In this context we also describe a novel in silico screening process, based on Association Rule Mining, for identifying genes as candidate drivers of drug response and compare it with relevant data mining frameworks, for which we generated a web application freely available at: https://compbio.nyumc.org/drugs/. This pipeline explores large sample spaces with high efficiency, while being able to detect low-frequency events and evaluate statistical significance even in the multidimensional space, presenting the results in the form of easily interpretable rules. We conclude with future prospects and challenges of applying machine-learning-based drug response prediction in precision medicine.

KW - Drug Response Prediction

KW - Precision Medicine

KW - Data mining

KW - Machine Learning

KW - Association Rule Mining

U2 - 10.1016/j.pharmthera.2019.107395

DO - 10.1016/j.pharmthera.2019.107395

M3 - Review article

JO - Pharmacology & Therapeutics

JF - Pharmacology & Therapeutics

SN - 0163-7258

M1 - 107395

ER -