Validation of the Hospital Episode Statistics Outpatient Dataset in England.
Thorn JC., Turner E., Hounsome L., Walsh E., Donovan JL., Verne J., Neal DE., Hamdy FC., Martin RM., Noble SM.
OBJECTIVES: The Hospital Episode Statistics (HES) dataset is a source of administrative 'big data' with potential for costing purposes in economic evaluations alongside clinical trials. This study assesses the validity of coverage in the HES outpatient dataset. METHODS: Men who died of, or with, prostate cancer were selected from a prostate-cancer screening trial (CAP, Cluster randomised triAl of PSA testing for Prostate cancer). Details of visits that took place after 1/4/2003 to hospital outpatient departments for conditions related to prostate cancer were extracted from medical records (MR); these appointments were sought in the HES outpatient dataset based on date. The matching procedure was repeated for periods before and after 1/4/2008, when the HES outpatient dataset was accredited as a national statistic. RESULTS: 4922 outpatient appointments were extracted from MR for 370 men. 4088 appointments recorded in MR were identified in the HES outpatient dataset (83.1%; 95% confidence interval [CI] 82.0-84.1). For appointments occurring prior to 1/4/2008, 2195/2755 (79.7%; 95% CI 78.2-81.2) matches were observed, while 1893/2167 (87.4%; 95% CI 86.0-88.9) appointments occurring after 1/4/2008 were identified (p for difference <0.001). 215/370 men (58.1%) had at least one appointment in the MR review that was unmatched in HES, 155 men (41.9%) had all their appointments identified, and 20 men (5.4%) had no appointments identified in HES. CONCLUSIONS: The HES outpatient dataset appears reasonably valid for research, particularly following accreditation. The dataset may be a suitable alternative to collecting MR data from hospital notes within a trial, although caution should be exercised with data collected prior to accreditation.