METHODS FOR EPIDEMIOLOGIC DATA WITH MISSING VALUES
Overview
The long-term objective of the research in this resubmission of a FIRST Award proposal is to provide investigators with new methodology for handling missing data in research applications. The aims of this FIRST Award are 1) to develop and extend statistical methodology for propensity score estimation when predictors contain missing data and 2) to apply this methodology to a variety of applied data sets. To address these aims, several goals are proposed: developing and extending methodology for estimating propensity scores when predictors contain missing data; developing methodology that allows predictors to contain missing data that may not be "missing at random"; developing diagnostics to assess the validity and fit of competing models; developing user-friendly software for implementing this methodology; and applying these methods to four real data applications. These applications are: 1) using data from the Framingham Heart Study to estimate risk appraisal functions for predicting outcomes such as cardiovascular disease or stroke for individuals with missing risk factors; 2) using data from a diabetes registry of over 120,000 members, provided by the Division of Research at Kaiser Permanente, Northern California, to develop techniques that aid in identifying diabetic persons who are at high risk of developing diabetes-related complications when risk factor information is missing; 3) using data from the Postmenopausal Estrogen/Progestin Intervention (PEPI) clinical trial to fit models that estimate propensity scores, which represent the probability of medication adherence conditional on predictors that may contain missing data, and then using these propensity scores to find adjusted estimates of the effects of hormone therapy on cardiovascular disease risk factors, bone mineral density, and other symptoms; and 4) using data from the Genetic Epidemiology of Adenomatous Polyps study to fit models estimating the relationship of specific genes to outcomes while accounting for missing risk factors. The results of this research will make important contributions to medical, epidemiological, and statistical research. Methodological and applied publications are anticipated as statistical methodology concerning missing values is developed and extended. In addition, substantive medical and epidemiological questions will be answered by applying this new methodology to the applied data sets provided.