Abstract
MOTIVATION: Genome-wide association studies (GWAS) have enabled large-scale analysis of the role of genetic variants in human disease. Despite impressive methodological advances, subsequent clinical interpretation and application remain challenging when GWAS suffer from a lack of statistical power. In recent years, however, the use of information diffusion algorithms with molecular networks has led to fruitful insights into disease genes.
RESULTS: We present an overview of the design choices and pitfalls that prove crucial in the application of network propagation methods to GWAS summary statistics. We highlight general trends from the literature and present benchmark experiments that expand on these insights, using three diseases and five molecular networks as a case study. We verify that, if the GWAS summary statistics are of sufficient quality, gene-level scores based on GWAS P-values offer advantages over a set of 'seed' disease genes that are not weighted by their associated P-values. Beyond that, the size and the density of the networks prove to be important factors to consider. Finally, we explore several ensemble methods and show that combining multiple networks may improve the network propagation approach.
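To make the contrast between unweighted seed genes and P-value-based gene-level scores concrete, the following is a minimal sketch using random walk with restart, one common network propagation scheme. The abstract does not state which algorithm, thresholds or score transformation the authors use, so the function `propagate`, the restart value, the `-log10(P)` weighting, the genome-wide significance cut-off and the five-gene toy network with its P-values are all illustrative assumptions rather than the paper's method or data.

```python
import numpy as np


def propagate(adjacency, initial_scores, restart=0.5, tol=1e-10, max_iter=1000):
    """Random walk with restart (an assumed propagation scheme, not the paper's)."""
    A = np.asarray(adjacency, dtype=float)
    degree = A.sum(axis=0)
    degree[degree == 0] = 1.0          # isolated genes: avoid division by zero
    W = A / degree                     # column-normalised transition matrix
    p0 = np.asarray(initial_scores, dtype=float)
    p0 = p0 / p0.sum()                 # initial scores as a probability vector
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1.0 - restart) * (W @ p) + restart * p0
        converged = np.abs(p_next - p).sum() < tol
        p = p_next
        if converged:
            break
    return p


# Hypothetical toy network: five genes connected in a chain (0-1-2-3-4).
adjacency = np.array([
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
])

# Hypothetical gene-level P-values; gene 2 is suggestive but not significant.
p_values = np.array([1e-8, 0.5, 0.01, 0.5, 0.5])

# Option 1: unweighted seed set (assumed cut-off of 5e-8).
binary_seeds = (p_values < 5e-8).astype(float)

# Option 2: every gene weighted by its P-value (assumed -log10 transform).
weighted_seeds = -np.log10(p_values)

binary_scores = propagate(adjacency, binary_seeds)
weighted_scores = propagate(adjacency, weighted_seeds)

print("binary seeds:  ", np.round(binary_scores, 3))
print("weighted seeds:", np.round(weighted_scores, 3))
```

In this toy setting the sub-threshold signal at gene 2 is discarded by the binary seed set but still contributes under the weighted scheme, which is the kind of difference the benchmark in the paper examines on real data.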
Original language | English |
---|---|
Article number | bbae014 |
Pages (from-to) | 1-13 |
Number of pages | 13 |
Journal | Briefings in Bioinformatics |
Volume | 25 |
Issue number | 2 |
Early online date | 10 Feb 2024 |
Publication status | Published - Mar 2024 |
Keywords
- GWAS
- disease gene
- molecular network
- network propagation
ASJC Scopus subject areas
- Information Systems
- Molecular Biology