Posted on 2024-07-04, 11:41. Authored by KA Fawcett, G Demidov, N Shrine, ML Paynton, S Ossowski, I Sayers, LV Wain, EJ Hollox
Background: The role of copy number variants (CNVs) in susceptibility to asthma is not well understood. This is, in part, due to the difficulty of accurately measuring CNVs in large enough sample sizes to detect associations. The recent availability of whole-exome sequencing (WES) in large biobank studies provides an unprecedented opportunity to study the role of CNVs in asthma.
Methods: We called common CNVs in 49,953 individuals in the first release of UK Biobank WES using ClinCNV software. CNVs were tested for association with asthma in a stage 1 analysis comprising 7098 asthma cases and 36,578 controls from the first release of sequencing data. Nominally-associated CNVs were then meta-analysed in stage 2 with an additional 17,280 asthma cases and 115,562 controls from the second release of UK Biobank exome sequencing, followed by validation and fine-mapping.
Results: Five of 189 CNVs were associated with asthma in stage 2, including a deletion overlapping the HLA-DQA1 and HLA-DQB1 genes, a duplication of CHROMR/PRKRA, deletions within MUC22 and TAP2, and a duplication in FBRSL1. The HLA-DQA1, HLA-DQB1, MUC22 and TAP2 genes all reside within the human leukocyte antigen (HLA) region on chromosome 6. In silico analyses demonstrated that the deletion overlapping HLA-DQA1 and HLA-DQB1 is likely to be an artefact arising from under-mapping of reads from non-reference HLA haplotypes, and that the CHROMR/PRKRA and FBRSL1 duplications represent presence/absence of pseudogenes within the HLA region. Bayesian fine-mapping of the HLA region suggested that there are two independent asthma association signals. The variants with the largest posterior inclusion probability in the two credible sets were an amino acid change in HLA-DQB1 (glutamine to histidine at residue 253) and a multi-allelic amino acid change in HLA-DRB1 (presence/absence of serine, glycine or leucine at residue 11).
Conclusions: At least two independent loci characterised by amino acid changes in the HLA-DQA1, HLA-DQB1 and HLA-DRB1 genes are likely to account for association of SNPs and CNVs in this region with asthma. The high divergence of haplotypes in the HLA can give rise to spurious CNVs, providing an important, cautionary tale for future large-scale analyses of sequencing data.
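The Methods above describe meta-analysing stage 1 and stage 2 association results. As an illustration only, the sketch below shows one common way to combine two stages: fixed-effect inverse-variance weighting of log odds ratios. The record does not state which meta-analysis method the authors used, so the method, the function name `fixed_effect_meta` and the input values are all assumptions; the numbers are invented and are not results from the study.

```python
import math


def fixed_effect_meta(estimates):
    """Combine (beta, se) pairs by inverse-variance weighting.

    Hypothetical helper for illustration; not the authors' pipeline.
    Returns the pooled effect, its standard error, the z-score and
    a two-sided p-value under a standard normal approximation.
    """
    # Each study is weighted by the inverse of its variance.
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p


# Invented log odds ratios and standard errors for a single CNV.
stage1 = (0.10, 0.05)
stage2 = (0.08, 0.03)

beta, se, z, p = fixed_effect_meta([stage1, stage2])
print(f"pooled beta={beta:.4f}, se={se:.4f}, z={z:.2f}, p={p:.3g}")
```

Because the weights are inverse variances, the larger stage (smaller standard error) dominates the pooled estimate, which is why a second, larger release can confirm or overturn a nominal stage 1 signal.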
Funding
Asthma UK Fellowship (AUK-CDA-2019–414)
GSK / British Lung Foundation Chair in Respiratory Research (C17-1)
Citation
Fawcett, K.A., Demidov, G., Shrine, N. et al. Exome-wide analysis of copy number variation shows association of the human leukocyte antigen region with asthma in UK Biobank. BMC Med Genomics 15, 119 (2022). https://doi.org/10.1186/s12920-022-01268-y
Author affiliation
College of Life Sciences, Population Health Sciences
All data (summary statistics) generated or analysed during this study are included in the published article and its supplementary information files. The raw data that support the findings of this study were used under approved UK Biobank project 56607. Restrictions apply to the availability of these data, so they are not publicly available, but they can be requested from UK Biobank (see https://www.ukbiobank.ac.uk/enable-your-research for the application procedure). The publicly available datasets analysed during the current study are the 1000 Genomes phase 3 structural variant dataset (http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase3/integrated_sv_map/supporting/GRCh38_positions/ALL.wgs.mergedSV.v8.20130502.svs.genotypes.GRCh38.vcf.gz) and the NCBI database of human genomic structural variation, dbVar (https://www.ncbi.nlm.nih.gov/dbvar), under accessions nstd162 and nstd152. We also used the following databases: the NCBI Gene database (https://www.ncbi.nlm.nih.gov/gene/), the Broad CNV Browser (http://www.broadinstitute.org/software/genomestrip/mcnv_supplementary_data), the Database of Genomic Variants (http://dgv.tcag.ca/dgv/app/home) and the Genome Aggregation Database, gnomAD (https://gnomad.broadinstitute.org/).