Efficient Statistical Inference for Population Variable Importance Using Shapley Values

Jul 12, 2020



The true importance of a variable in a prediction task provides useful knowledge about the underlying data-generating mechanism and can help in deciding which measurements to collect in subsequent experiments. Existing approaches often define population variable importance as the difference between the oracle prediction risk with and without the feature. However, these measures are difficult to interpret for correlated features, which can be assigned low importance even if they are highly predictive. To address this, we propose defining population variable importance using the Shapley value instead, which averages the predictive value of a feature relative to all possible feature subsets. Given n training observations, we present a computationally tractable statistical inference procedure that estimates the Shapley Population Variable Importance Measure (SPVIM) at an asymptotically optimal rate using only m = Θ(n) randomly sampled feature subsets. We derive the asymptotic distribution of this estimator to construct valid confidence intervals and hypothesis tests. Finally, we analyze the importance of lab measurements for predicting in-hospital mortality and find that our procedure (i) is significantly faster than existing sampling-based approaches and (ii) gives more consistent estimates across different modeling procedures.
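The contrast between the "with and without the feature" definition and the Shapley-value definition can be illustrated with a small sketch. This is not the SPVIM estimator from the talk: it computes exact Shapley values by enumerating every subset, which is only feasible for a handful of features, and it uses a made-up predictiveness function. The names `shapley_values`, `leave_one_out`, and `toy_value` are hypothetical, and the toy setup (features 0 and 1 are perfectly redundant, feature 2 is noise) is an assumption chosen for illustration.

```python
from itertools import combinations
from math import comb


def shapley_values(p, value):
    """Exact Shapley value of each of p features for a set function `value`.

    `value` maps a frozenset of feature indices to a number (here standing in
    for the predictiveness achievable using only those features).
    """
    phi = []
    for j in range(p):
        others = [k for k in range(p) if k != j]
        total = 0.0
        for size in range(p):
            # Standard Shapley weight for subsets of this size: 1 / (p * C(p-1, size)).
            weight = 1.0 / (p * comb(p - 1, size))
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal gain from adding feature j to subset s.
                total += weight * (value(s | {j}) - value(s))
        phi.append(total)
    return phi


def leave_one_out(p, value):
    """Importance as the drop in value when a single feature is removed."""
    full = frozenset(range(p))
    return [value(full) - value(full - {j}) for j in range(p)]


def toy_value(subset):
    # Hypothetical predictiveness: features 0 and 1 carry the same signal,
    # feature 2 carries none.
    return 1.0 if (0 in subset or 1 in subset) else 0.0


loo = leave_one_out(3, toy_value)
shap = shapley_values(3, toy_value)
print("leave-one-out:", loo)
print("Shapley:      ", shap)
```

In this toy case the leave-one-out measure assigns zero importance to both redundant features, since removing either one leaves the other in place, while the Shapley values split the credit so that each receives about one half and the noise feature receives zero. The enumeration visits every subset for every feature, so its cost grows exponentially in the number of features, which is the burden that sampling feature subsets is meant to avoid.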



About ICML 2020

The International Conference on Machine Learning (ICML) is the premier gathering of professionals dedicated to the advancement of the branch of artificial intelligence known as machine learning. ICML is globally renowned for presenting and publishing cutting-edge research on all aspects of machine learning used in closely related areas like artificial intelligence, statistics and data science, as well as important application areas such as machine vision, computational biology, speech recognition, and robotics. ICML is one of the fastest growing artificial intelligence conferences in the world. Participants at ICML span a wide range of backgrounds, from academic and industrial researchers, to entrepreneurs and engineers, to graduate students and postdocs.
