Mathematics, Vol. 13, Pages 1581: The Shapley Value in Data Science: Advances in Computation, Extensions, and Applications


Mathematics doi: 10.3390/math13101581

Authors:
Lei Qin
Yingqiu Zhu
Shaonan Liu
Xingjian Zhang
Yining Zhao

The Shapley value is a fundamental concept in data science, providing a principled framework for fair resource allocation, feature importance quantification, and improved interpretability of complex models. Its theoretical foundation rests on four axiomatic properties, which underpin its widespread application. Because exact calculation is computationally expensive, we discuss model-agnostic approximation techniques, such as Random Order Value, Least Squares Value, and Multilinear Extension Sampling, as well as specialized fast algorithms for linear, tree-based, and deep learning models. Recent extensions, such as Distributional Shapley and Weighted Shapley, have broadened its applications to data valuation, reinforcement learning, feature interaction analysis, and multi-party cooperation. Practical effectiveness has been demonstrated in health care, finance, industry, and the digital economy, and promising future directions include incorporating these techniques into emerging fields such as data asset pricing and trading.
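
To make the computational issue concrete, the following is a minimal Python sketch, not taken from the paper, that contrasts exact computation (averaging marginal contributions over all n! player orderings) with a Monte Carlo estimate over randomly sampled orderings, the basic idea behind random-order approximations. The function names, the toy game `toy_value`, and the sample count are illustrative assumptions.

```python
import itertools
import math
import random


def _add_marginals(order, value, phi):
    """Walk one ordering and add each player's marginal contribution to phi."""
    coalition = frozenset()
    prev = value(coalition)
    for p in order:
        coalition = coalition | {p}
        cur = value(coalition)
        phi[p] += cur - prev
        prev = cur


def exact_shapley(players, value):
    """Exact Shapley values: average over all n! orderings (feasible only for small n)."""
    phi = {p: 0.0 for p in players}
    for order in itertools.permutations(players):
        _add_marginals(order, value, phi)
    n_fact = math.factorial(len(players))
    return {p: total / n_fact for p, total in phi.items()}


def sampled_shapley(players, value, n_samples=2000, seed=0):
    """Monte Carlo estimate: average over n_samples uniformly random orderings."""
    rng = random.Random(seed)
    phi = {p: 0.0 for p in players}
    order = list(players)
    for _ in range(n_samples):
        rng.shuffle(order)
        _add_marginals(order, value, phi)
    return {p: total / n_samples for p, total in phi.items()}


def toy_value(coalition):
    """Hypothetical game: 'a' and 'b' are worth 10 only together; 'c' adds 2 alone."""
    v = 0.0
    if "a" in coalition and "b" in coalition:
        v += 10.0
    if "c" in coalition:
        v += 2.0
    return v


players = ["a", "b", "c"]
exact = exact_shapley(players, toy_value)
approx = sampled_shapley(players, toy_value)
print("exact: ", exact)
print("approx:", approx)
```

In this toy game the exact values are 5 for `a`, 5 for `b`, and 2 for `c`, which sum to the value of the full coalition (the efficiency property). The sampled estimate also sums to that total, because each sampled ordering distributes the full coalition value among the players, while the individual estimates for `a` and `b` fluctuate around 5 depending on the seed and sample count.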



