Thinking Causally in High Dimensions
Abstract: With increasingly large observational databases available in epidemiological and medical studies, we seek to understand how the classical potential outcomes framework and its attendant causal inference procedures can be applied in this setting. While there has been extensive work on statistical methods for high-dimensional data, we argue that certain aspects of causal inference make the problem more challenging. In this talk, we will describe three non-intuitive findings:
- The ‘treatment positivity’ assumption from causal inference becomes less innocuous in higher dimensions.
- Margin theory from machine learning can be used in high-dimensional causal problems.
- Gradient boosting yields a powerful tool for causal effect estimation.
This is joint work with Efrén Cruz-Cortés (Penn State University), Kevin Josey (Harvard School of Public Health), Elizabeth Juárez-Colunga (University of Colorado) and Fan Yang (University of Colorado).
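
To give some intuition for the first finding, here is a toy simulation of how treatment positivity, the requirement that 0 < P(T = 1 | X = x) < 1 for every covariate value x, can erode as the number of covariates grows. It is a minimal sketch and not the analysis from the talk: the Gaussian covariates, the logistic propensity model, the common coefficient of 0.5, the 0.05 cutoff, and the function name `extreme_propensity_fraction` are all assumptions made for illustration.

```python
import numpy as np

def extreme_propensity_fraction(p, n=20000, coef=0.5, eps=0.05, seed=0):
    """Share of simulated units whose true propensity score lies outside [eps, 1 - eps].

    Assumed toy model (not from the talk): X ~ N(0, I_p) and
    P(T = 1 | X) = sigmoid(coef * sum_j X_j).
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))        # n units, p covariates
    logits = X @ np.full(p, coef)          # linear predictor; its variance grows with p
    e = 1.0 / (1.0 + np.exp(-logits))      # true propensity scores
    return float(np.mean((e < eps) | (e > 1 - eps)))

# Compare a low-, moderate-, and high-dimensional covariate set.
fractions = {p: extreme_propensity_fraction(p) for p in (2, 10, 100)}
for p, frac in fractions.items():
    print(f"p={p:>3}: share of units with propensity outside [0.05, 0.95] = {frac:.3f}")
```

Under these assumptions each added covariate contributes a fixed amount of variance to the linear predictor, so the propensity scores pile up near 0 and 1 as p increases, and a growing share of units has almost no comparable counterpart in the other treatment arm.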