University of Leicester

Accuracy of Gene Scores when Pruning Markers by Linkage Disequilibrium.

journal contribution
posted on 2019-10-22, 11:50 authored by Frank Dudbridge, Paul J. Newcombe
OBJECTIVE: Gene scores are often used to model the combined effects of genetic variants. When variants are in linkage disequilibrium, it is common to prune all variants except the most strongly associated. This avoids duplicating information but discards information when variants have independent effects. However, joint modelling of correlated variants increases the sampling error in the gene score. In recent applications, joint modelling has offered only small improvements in accuracy over pruning. We aimed to quantify the relationship between pruning and joint modelling in relation to sample size.

METHODS: We derived the coefficient of determination R² for a gene score constructed from pruned markers, and for one constructed from correlated markers with jointly estimated effects.

RESULTS: Pruned scores tend to have slightly lower R² than jointly modelled scores, but the differences are small at sample sizes up to 100,000. If the proportion of correlated variants is high, joint modelling can obtain modest improvements asymptotically.

CONCLUSIONS: The small gains observed to date from joint modelling can be explained by sample size. As studies become larger, joint modelling will be useful for traits affected by many correlated variants, but the improvements may remain small. Pruning remains a useful heuristic for current studies.
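
The trade-off described in the abstract can be explored with a small simulation. The sketch below is not the authors' analytical derivation. It assumes standardised, normally distributed markers arranged in independent blocks of two, and its function names, parameters (`n_blocks`, `r`, `noise_sd`) and effect sizes are hypothetical choices made only for illustration. It compares the out-of-sample R² of a pruned score with that of a score whose effects are estimated jointly within each block.

```python
import math
import random


def draw_sample(n, effects, r, noise_sd, rng):
    """Simulate n individuals; each block holds two markers with correlation r."""
    X, y = [], []
    for _ in range(n):
        row, total = [], 0.0
        for b1, b2 in effects:
            x1 = rng.gauss(0, 1)
            x2 = r * x1 + math.sqrt(1 - r * r) * rng.gauss(0, 1)
            row.append((x1, x2))
            total += b1 * x1 + b2 * x2
        X.append(row)
        y.append(total + rng.gauss(0, noise_sd))
    return X, y


def cov(a, b):
    """Population covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / len(a)


def r_squared(score, y):
    """Squared correlation between a gene score and the trait."""
    return cov(score, y) ** 2 / (cov(score, score) * cov(y, y))


def fit_weights(X, y):
    """Return per-block (w1, w2) weights for the pruned and the joint score."""
    pruned, joint = [], []
    for k in range(len(X[0])):
        x1 = [row[k][0] for row in X]
        x2 = [row[k][1] for row in X]
        s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
        c1, c2 = cov(x1, y), cov(x2, y)
        # Pruning: keep only the marker with the larger marginal correlation,
        # weighted by its marginal (single-marker) effect estimate.
        if abs(c1) / math.sqrt(s11) >= abs(c2) / math.sqrt(s22):
            pruned.append((c1 / s11, 0.0))
        else:
            pruned.append((0.0, c2 / s22))
        # Joint modelling (assumption: blocks are treated independently):
        # solve the 2x2 normal equations within the block.
        det = s11 * s22 - s12 * s12
        joint.append(((s22 * c1 - s12 * c2) / det,
                      (s11 * c2 - s12 * c1) / det))
    return pruned, joint


def score(X, weights):
    """Weighted sum of marker values for every individual."""
    return [sum(w1 * x1 + w2 * x2 for (x1, x2), (w1, w2) in zip(row, weights))
            for row in X]


def compare(n_train, n_test=2000, n_blocks=10, r=0.5, noise_sd=1.0, seed=1):
    """Return (pruned R², joint R²) measured in an independent test sample."""
    rng = random.Random(seed)
    # Hypothetical effect sizes: both markers in every block affect the trait.
    effects = [(rng.gauss(0, 0.1), rng.gauss(0, 0.1)) for _ in range(n_blocks)]
    X_train, y_train = draw_sample(n_train, effects, r, noise_sd, rng)
    X_test, y_test = draw_sample(n_test, effects, r, noise_sd, rng)
    pruned, joint = fit_weights(X_train, y_train)
    return (r_squared(score(X_test, pruned), y_test),
            r_squared(score(X_test, joint), y_test))


if __name__ == "__main__":
    for n in (200, 2000):
        print(n, compare(n))
```

Calling `compare` with increasing `n_train` can be used to explore how the gap between the two scores changes with training sample size under these simplified assumptions. The results depend on the chosen effect sizes and correlation, and are not a substitute for the derivation in the paper.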


This work was funded by the MRC (MR/K006215/1).



Human Heredity, 2015, 80 (4), pp. 178-186

Author affiliation

College of Life Sciences, School of Medicine, Department of Health Sciences


  • VoR (Version of Record)

Published in

Human Heredity


Karger Publishers


