Automating literature search and integrating meta-GWAS and eQTL data to uncover variants of interest in IPF
The first part of this project concerned the development of an automated literature-searching tool, LitSpy. LitSpy addresses an established research need: reliably searching biomedical literature for multiple targets (e.g. a list of genes from a large ‘omics study) with good precision and recall. LitSpy outperforms other freely available literature-searching methods with regard to recall, precision, and reliability. The full code for LitSpy is available at https://github.com/ecroot/LitSpy, and the Python package is available at https://pypi.org/project/litspy/.
In the second part of this project, meta-GWAS and eQTL data were integrated to uncover variants of interest in idiopathic pulmonary fibrosis (IPF), a lung disease with poor prognosis, increasing diagnosis rates, and limited treatment options, which together constitute a substantial research need. A recent meta-GWAS (Allen et al., 2020) identified fourteen genomic signals associated with IPF risk, three of which were novel. Here, these fourteen risk signals were colocalised with eQTL data from three different sources to identify genes whose differential expression was likely to be driven by the same variants that drive IPF risk. Overall, 24 genes were identified.
Analysing these 24 genes with LitSpy indicated that some were well-established IPF risk-associated genes, whereas others were novel genes of interest in the context of lung and/or IPF research.
History
Supervisor(s)
Athanasios Didangelos; Louise Wain; Richard Badge
Date of award
2024-01-16
Author affiliation
Department of Genetics and Genome Biology
Awarding institution
University of Leicester
Qualification level
- Doctoral
Qualification name
- PhD