
All Times EDT

Abstract Details

Activity Number: 90 - Dealing with Error-Prone Electronic Health Record Data via Validation Sampling
Type: Invited
Date/Time: Monday, August 8, 2022 : 8:30 AM to 10:20 AM
Sponsor: Biometrics Section
Abstract #320421
Title: Advantages of Multi-Wave, Multi-Frame Sampling Designs for Analysis of Error-Prone Data from Electronic Health Records
Author(s): Pamela Shaw* and Thomas Lumley and Bryan Shepherd
Companies: Kaiser Permanente Washington Health Research Institute and University of Auckland and Vanderbilt University Medical Center
Keywords: Complex-survey design; EHR; calibration; measurement error; optimal design
Abstract:

Large epidemiologic studies often rely on data sources that are error prone, such as those reliant on electronic health records. Data errors in even a single covariate can bias multiple regression coefficients, including the coefficients of precisely measured variables. Error-prone outcome variables can be an additional source of bias, particularly when that error is related to other regression variables. Validation of a subsample of records can be a practical way to obtain data regarding the nature of the errors, which can inform statistical adjustment methods to avoid error-induced biases. Design-based estimation methods are attractive in settings where errors in multiple variables may be too complex to model reliably. Efficiency of these estimators can be improved by sampling more informative subjects into the validation subset. This talk will present strategies to improve the efficiency of design-based estimators, including generalized raking, multi-wave sampling, and the application of the multi-frame approach of Metcalf and Scott (2009) to accommodate multiple outcomes of interest. Concepts are demonstrated with numerical studies and application to real data.


Authors who are presenting talks have a * after their name.

Back to the full JSM 2022 program