University of Leicester

Investigating the association of environmental exposures and all-cause mortality in the UK Biobank using sparse principal component analysis

journal contribution
posted on 2022-07-01, 14:23 authored by Mohammad Mamouei, Yajie Zhu, Milad Nazarzadeh, Abdelaali Hassaine, Gholamreza Salimi-Khorshidi, Yutong Cai, Kazem Rahimi

Multicollinearity refers to the presence of collinearity between multiple variables and renders the results of statistical inference erroneous (Type II error). This is particularly important in environmental health research, where multicollinearity can hinder inference. To address this, correlated variables are often excluded from the analysis, limiting the discovery of new associations. An alternative approach to address this problem is the use of principal component analysis (PCA). This method combines and projects a group of correlated variables onto a new orthogonal space. While this resolves the multicollinearity problem, it poses another challenge in relation to the interpretability of results. Standard hypothesis testing methods can be used to evaluate the association of projected predictors, called principal components, with the outcomes of interest. However, there is no established way to trace the significance of principal components back to individual variables. To address this problem, we investigated the use of sparse principal component analysis (SPCA), which enforces a parsimonious projection. We hypothesise that this parsimony could facilitate the interpretability of findings. To this end, we investigated the association of 20 environmental predictors with all-cause mortality, adjusting for demographic, socioeconomic, physiological, and behavioural factors. The study was conducted in a cohort of 379,690 individuals in the UK. During an average follow-up of 8.05 years (3,055,166 total person-years), 14,996 deaths were observed. We used Cox regression models to estimate the hazard ratio (HR) and 95% confidence intervals (CI). The Cox models were fitted to the standardised environmental predictors (a) without any transformation, (b) transformed with PCA, and (c) transformed with SPCA. The comparison of findings underlined the potential of SPCA for conducting inference in scenarios where multicollinearity can increase the risk of Type II error.
Our analysis revealed a significant association between average noise pollution and increased risk of all-cause mortality. Specifically, those in the upper deciles of noise exposure have a 5 to 10% increased risk of all-cause mortality compared to the lowest decile.
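The contrast between a dense and a parsimonious projection can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic (three correlated variables plus two unrelated ones), the function names are hypothetical, and the sparse loadings come from a simple soft-thresholded power iteration, which is a stand-in for, and not necessarily the same as, the SPCA algorithm used in the paper. No Cox model is fitted here.

```python
import math
import random


def standardise(columns):
    """Scale each column to zero mean and unit (population) variance."""
    out = []
    for col in columns:
        mean = sum(col) / len(col)
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / len(col))
        out.append([(x - mean) / sd for x in col])
    return out


def correlation_matrix(columns):
    """Correlation matrix of the given columns."""
    z = standardise(columns)
    n = len(z[0])
    return [[sum(a * b for a, b in zip(ci, cj)) / n for cj in z] for ci in z]


def unit(v):
    """Rescale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("threshold removed every loading; lower it")
    return [x / norm for x in v]


def soft_threshold(x, threshold):
    """Shrink x towards zero by threshold; small values become exactly 0."""
    if abs(x) <= threshold:
        return 0.0
    return x - threshold if x > 0 else x + threshold


def leading_loadings(corr, threshold=0.0, iterations=200):
    """Loadings of the first component by power iteration.

    threshold=0.0 gives the ordinary (dense) principal component.
    threshold>0 soft-thresholds the loadings at every step, an
    illustrative way to obtain a sparse component.
    """
    p = len(corr)
    w = unit([1.0] * p)
    for _ in range(iterations):
        v = unit([sum(corr[i][j] * w[j] for j in range(p)) for i in range(p)])
        w = unit([soft_threshold(x, threshold) for x in v])
    return w


# Synthetic data (assumption): variables 0-2 share a latent factor,
# variables 3-4 are independent noise.
rng = random.Random(0)
n = 2000
latent = [rng.gauss(0, 1) for _ in range(n)]
columns = [[z + 0.5 * rng.gauss(0, 1) for z in latent] for _ in range(3)]
columns += [[rng.gauss(0, 1) for _ in range(n)] for _ in range(2)]

corr = correlation_matrix(columns)
dense = leading_loadings(corr)
sparse = leading_loadings(corr, threshold=0.2)

print("dense loadings: ", [round(x, 3) for x in dense])
print("sparse loadings:", [round(x, 3) for x in sparse])
```

In the dense component every variable receives some weight, so a significant component cannot be attributed to particular variables. In the sparse component the unrelated variables have loadings of exactly zero, so an association found for that component points only to the correlated group.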

Funding

PEAK Urban programme, funded by UKRI’s Global Challenges Research Fund (Grant Number: ES/P011055/1)

History

Citation

Mamouei, M., Zhu, Y., Nazarzadeh, M. et al. Investigating the association of environmental exposures and all-cause mortality in the UK Biobank using sparse principal component analysis. Sci Rep 12, 9239 (2022). https://doi.org/10.1038/s41598-022-13362-3

Author affiliation

Centre for Environmental Health and Sustainability

Version

  • VoR (Version of Record)

Published in

Scientific Reports

Volume

12

Pagination

9239

Publisher

Springer Science and Business Media LLC

eissn

2045-2322

Acceptance date

2022-05-13

Copyright date

2022

Available date

2022-07-01

Language

en
