University of Leicester
Replicating a COVID-19 study in a national England database to assess the generalisability of research with regional electronic health record data

journal contribution
posted on 2025-09-08, 10:27 authored by Richard Williams, David Jenkins, Thomas Bolton, Adrian Heald, Mehrdad Mizani, Matthew Sperrin, Niels Peek
Objectives: To assess the degree to which we can replicate a study between a regional and a national database of electronic health record data in the UK. The original study examined the risk factors associated with hospitalisation following COVID-19 infection in people with diabetes.

Design: A replication of a retrospective cohort study.

Setting: Observational electronic health record data from primary and secondary care sources in the UK. The original study used data from a large, urbanised region (Greater Manchester Care Record, Greater Manchester, UK; 2.8 million patients). This replication study used a national database covering the whole of England, UK (NHS England’s Secure Data Environment service for England, accessed via the BHF Data Science Centre’s CVD-COVID-UK/COVID-IMPACT Consortium; 54 million patients).

Participants: Individuals with a diagnosis of type 1 diabetes or type 2 diabetes prior to a positive COVID-19 test result. The matched controls (3:1) were individuals who had a positive COVID-19 test result, but who did not have a diagnosis of diabetes on the date of that test result. Matching was done on age at COVID-19 diagnosis, sex and approximate date of COVID-19 test.

Primary and secondary outcome measures: Hospitalisation within 28 days of a positive COVID-19 test.

Results: Many of the effect sizes did not differ to a statistically significant degree between the regional and national studies, but some did. Where effect sizes were statistically significant in the regional study, they remained significant in the national study, and the effect was in the same direction and of similar magnitude.

Conclusions: There is some evidence that the findings from studies in smaller regional datasets can be extrapolated to a larger, national setting. However, there were some differences, and therefore replication studies remain an essential part of healthcare research.
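The matching step described under Participants can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that "3:1" means up to three controls per person with diabetes, that matching is greedy and without replacement, and that "approximate" means within a fixed window. The `Person` record, the `match_controls` function, and the tolerances `max_age_gap` and `max_date_gap_days` are all hypothetical names and values chosen for illustration; the record does not state them.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Person:
    person_id: str
    age: int          # age at positive COVID-19 test
    sex: str
    test_date: date   # date of positive COVID-19 test


def match_controls(cases, controls, ratio=3, max_age_gap=2, max_date_gap_days=14):
    """Greedily assign up to `ratio` unused controls to each case.

    The tolerances are illustrative assumptions, not values from the study.
    """
    available = list(controls)
    matches = {}
    for case in cases:
        # Keep controls of the same sex within the age and date windows.
        eligible = [
            c for c in available
            if c.sex == case.sex
            and abs(c.age - case.age) <= max_age_gap
            and abs((c.test_date - case.test_date).days) <= max_date_gap_days
        ]
        # Prefer the closest controls on age, then on test date.
        eligible.sort(key=lambda c: (abs(c.age - case.age),
                                     abs((c.test_date - case.test_date).days)))
        chosen = eligible[:ratio]
        # Matching without replacement: a control is used at most once.
        for c in chosen:
            available.remove(c)
        matches[case.person_id] = [c.person_id for c in chosen]
    return matches
```

Greedy matching is order-dependent, so a real analysis might shuffle cases or use an optimal matching routine instead; the sketch only shows the shape of the constraint (same sex, similar age, similar test date).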

History

Author affiliation

College of Life Sciences Medical Sciences

Version

  • VoR (Version of Record)

Published in

BMJ Open

Volume

15

Issue

4

Pagination

e093080

Publisher

BMJ

issn

2044-6055

eissn

2044-6055

Copyright date

2025

Available date

2025-09-08

Spatial coverage

England

Language

en

Deposited by

Professor Anna Hansell

Deposit date

2025-08-21

Data Access Statement

Data may be obtained from a third party and are not publicly available. The data used in this study are available in NHS England’s SDE service for England, but as restrictions apply they are not publicly available (https://digital.nhs.uk/coronavirus/coronavirus-data-services-updates/trusted-research-environment-service-for-england). The CVD-COVID-UK/COVID-IMPACT programme led by the BHF Data Science Centre (https://bhfdatasciencecentre.org) received approval to access data in NHS England’s SDE service for England from the Independent Group Advising on the Release of Data (IGARD) (https://digital.nhs.uk/about-nhs-digital/corporate-information-and-documents/independent-group-advising-on-the-release-of-data) via an application made in the Data Access Request Service (DARS) Online system (ref. DARS-NIC-381078-Y9C5K) (https://digital.nhs.uk/services/data-access-request-service-dars/dars-products-and-services). The CVD-COVID-UK/COVID-IMPACT Approvals & Oversight Board (https://bhfdatasciencecentre.org/areas/cvd-covid-uk-covid-impact/) subsequently granted approval to this project to access the data within NHS England’s SDE service for England. The de-identified data used in this study were made available to accredited researchers only.