Imputation-Powered Inference for Missing Covariates
Miss Junting Duan
Ph.D. Candidate in Management Science and Engineering
Department of Management Science and Engineering
Stanford University
Missing covariate data is a prevalent problem in empirical research. We provide a novel framework for handling missing covariate data for estimation and inference in downstream tasks. Our general framework provides an automatic and easy-to-use pipeline for empirical researchers: First, missing values are imputed using virtually any imputation method under general observation patterns. Second, we automatically correct for the imputation bias and adaptively weight the imputed values according to their quality. Third, we use all available data, including imputed observations, to obtain more precise point estimates for the downstream task with valid confidence intervals. Our approach ensures valid inference while improving statistical efficiency by leveraging all available data. We establish the asymptotic normality of the proposed estimator under general missing data patterns and a broad class of imputation methods. Through simulations, we demonstrate the superior performance of our approach over natural benchmarks, as it achieves both lower bias and variance while being robust to imputation quality. In a comprehensive empirical study of the dependence of equity markets on carbon emissions, we show that properly accounting for missing emissions data yields no evidence of correlation between stock returns and emissions directly produced by companies, but a negative correlation with value chain emissions.
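
The three steps above can be illustrated with a small simulation. What follows is a rough sketch, not the estimator proposed in the talk: it swaps in a simple control-variate correction in the spirit of prediction-powered inference, and assumes a single covariate that is missing completely at random. The function names (slope, pieces, corrected_slope), the noisy-proxy "imputation", and the bootstrap choice of the weight lam are all assumptions made for illustration.

```python
import numpy as np

def slope(x, y):
    """OLS slope of y on x (intercept included via centering)."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))

def pieces(y, x, x_imp, observed):
    """Complete-case slope, plus the imputed-data slope on all rows minus
    the imputed-data slope on complete rows. Under MCAR the second piece
    has mean close to zero, so adding it does not shift the target."""
    theta_cc = slope(x[observed], y[observed])
    diff = slope(x_imp, y) - slope(x_imp[observed], y[observed])
    return theta_cc, diff

def corrected_slope(y, x, x_imp, observed, n_boot=200, seed=0):
    """Hypothetical helper: theta_cc + lam * diff, with lam picked by a
    row bootstrap to minimize variance (clipped to [0, 1])."""
    rng = np.random.default_rng(seed)
    n = len(y)
    theta_cc, diff = pieces(y, x, x_imp, observed)
    boot = np.empty((n_boot, 2))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        boot[b] = pieces(y[idx], x[idx], x_imp[idx], observed[idx])
    c = np.cov(boot[:, 0], boot[:, 1])
    # Variance-minimizing weight; it shrinks toward 0 when imputations are poor.
    lam = float(np.clip(-c[0, 1] / c[1, 1], 0.0, 1.0))
    theta_hat = theta_cc + lam * diff
    se = float(np.std(boot[:, 0] + lam * boot[:, 1], ddof=1))
    return theta_hat, se, lam

# Simulated data (all values here are assumptions for the illustration).
rng = np.random.default_rng(1)
n, beta = 20_000, 2.0
x_true = rng.normal(size=n)
y = beta * x_true + rng.normal(size=n)
x_imp = x_true + rng.normal(size=n)      # deliberately noisy imputation
observed = rng.random(n) < 0.3           # covariate seen on about 30% of rows
x = np.where(observed, x_true, np.nan)   # NaN marks a missing covariate

# Benchmark: plug imputed values into the gaps and run OLS as usual.
naive = slope(np.where(observed, x, x_imp), y)

theta_hat, se, lam = corrected_slope(y, x, x_imp, observed)
ci = (theta_hat - 1.96 * se, theta_hat + 1.96 * se)
print(f"naive={naive:.3f}  corrected={theta_hat:.3f}  lam={lam:.2f}  ci={ci}")
```

In this toy setup the plug-in benchmark is attenuated because the imputed values carry measurement noise, while the corrected estimate stays centered on the true slope and still borrows information from the rows with imputed covariates. Handling general observation patterns, as described in the abstract, would require more than this MCAR sketch.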
















