A two-step feature selection procedure for relevant markers of Squamous Cell Lung Carcinoma using different survival models

Atanu Bhattacharjee, Samudranil Basak, Pragya Kumari

Research output: Contribution to journal › Article › peer-review

Abstract

Lung Squamous Cell Carcinoma has a potentially unbounded number of gene expression markers, which yields high-dimensional data with a large number of features. Selecting the relevant markers for analysis is thus of utmost importance. In our study, we aimed to select a subset of prominent and significant features from 31,918 gene expression features. The selected features are then analyzed with the Cox Proportional Hazards (CPH) model to determine how each marker affects a patient's survival estimates. We employ a two-step process to select the subset of markers. The first step uses an L1-regularized Cox PH model. The selected markers are then screened a second time by fitting a univariate Cox PH model to each biomarker and retaining those with a Wald test p-value below 0.05. Once the final selection is made, we estimate hazard ratios and confidence intervals using both maximum likelihood estimation (MLE) and a Bayesian approach, with the CPH model and, as an alternative, the Accelerated Failure Time (AFT) model. A forest plot is also provided as a graphical summary of the meta-analysis done in the study. With the proposed selection procedure, we found a suitable subset among the large number of available variables. The selected features have been analyzed and their validity confirmed using survival models.
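The second screening step and the hazard ratio estimate can be illustrated with a minimal sketch. It is not the authors' code: it assumes the L1-regularized step has already reduced the markers to a short list, uses simulated data, and fits each univariate Cox PH model with a hand-written Newton-Raphson routine (Breslow handling of ties). The names `cox_univariate`, `wald_screen`, `gene_a`, `gene_b`, and all simulation constants are hypothetical.

```python
import math
import random


def cox_univariate(times, events, x, iters=50, tol=1e-10):
    """Fit a one-covariate Cox PH model by Newton-Raphson.

    Returns (beta, se): the log hazard ratio and its standard error.
    Assumes at least one event and a non-constant covariate.
    """
    order = sorted(range(len(times)), key=lambda i: times[i], reverse=True)
    beta, info = 0.0, 1.0
    for _ in range(iters):
        s0 = s1 = s2 = 0.0   # risk-set sums of w, w*x, w*x^2 with w = exp(beta*x)
        grad = info = 0.0    # score and observed information at the current beta
        k = 0
        while k < len(order):
            j = k
            # Walk from the latest time backwards so the running sums always
            # cover every subject still at risk at the current time.
            while j < len(order) and times[order[j]] == times[order[k]]:
                xi = x[order[j]]
                w = math.exp(beta * xi)
                s0, s1, s2 = s0 + w, s1 + w * xi, s2 + w * xi * xi
                j += 1
            for i in order[k:j]:
                if events[i]:
                    mean = s1 / s0
                    grad += x[i] - mean
                    info += s2 / s0 - mean * mean
            k = j
        step = grad / info
        beta += step
        if abs(step) < tol:
            break
    return beta, 1.0 / math.sqrt(info)


def wald_screen(times, events, markers, alpha=0.05):
    """Keep markers whose univariate Wald p-value is below alpha."""
    kept = {}
    for name, x in markers.items():
        beta, se = cox_univariate(times, events, x)
        z = beta / se
        p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
        if p < alpha:
            half = 1.959964 * se                # 95% Wald interval, log scale
            kept[name] = {
                "hr": math.exp(beta),
                "ci": (math.exp(beta - half), math.exp(beta + half)),
                "p": p,
            }
    return kept


# Simulated example (all values are assumptions): gene_a raises the hazard,
# gene_b is pure noise, and censoring times are exponential.
rng = random.Random(0)
n = 300
markers = {
    "gene_a": [rng.gauss(0, 1) for _ in range(n)],
    "gene_b": [rng.gauss(0, 1) for _ in range(n)],
}
times, events = [], []
for i in range(n):
    t_event = rng.expovariate(0.1 * math.exp(0.8 * markers["gene_a"][i]))
    t_cens = rng.expovariate(0.05)
    times.append(min(t_event, t_cens))
    events.append(t_event <= t_cens)

selected = wald_screen(times, events, markers)
for name, r in selected.items():
    print(name, round(r["hr"], 2), tuple(round(v, 2) for v in r["ci"]))
```

The interval is built on the log hazard ratio scale and then exponentiated, which keeps both bounds positive. In practice an established survival library would normally replace the hand-written fit; the sketch only makes the Wald screening logic explicit.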

Original language: English
Article number: 100168
Number of pages: 7
Journal: Healthcare Analytics
Volume: 3
Publication status: Published - Nov 2023

Keywords

  • Accelerated Failure Time Model
  • Cox Proportional Hazard Model
  • Feature selection
  • High-dimensional
  • Lasso Cox Model
  • Lung Cancer

ASJC Scopus subject areas

  • Analytical Chemistry
  • Health Informatics
