Related papers: shapiq: Shapley Interactions for Machine Learning

shapiq: Shapley Interactions for Machine Learning

URL: http://arxiv.org/abs/2410.01649v1
Date: Wed, 2 Oct 2024 15:16:53 GMT
Title: shapiq: Shapley Interactions for Machine Learning
Authors: Maximilian Muschalik, Hubert Baniecki, Fabian Fumagalli, Patrick Kolpaczki, Barbara Hammer, Eyke Hüllermeier,
Abstract summary: We introduce shapiq, an open-source Python package that unifies state-of-the-art algorithms to efficiently compute Shapley Value (SV) and Shapley Interactions (SIs) For practitioners, shapiq is able to explain and visualize any-order feature interactions in predictions of models, including vision transformers, language models, as well as XGBoost and LightGBM with TreeShap-IQ.
Score: 21.939393765684827
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Originally rooted in game theory, the Shapley Value (SV) has recently become an important tool in machine learning research. Perhaps most notably, it is used for feature attribution and data valuation in explainable artificial intelligence. Shapley Interactions (SIs) naturally extend the SV and address its limitations by assigning joint contributions to groups of entities, which enhance understanding of black box machine learning models. Due to the exponential complexity of computing SVs and SIs, various methods have been proposed that exploit structural assumptions or yield probabilistic estimates given limited resources. In this work, we introduce shapiq, an open-source Python package that unifies state-of-the-art algorithms to efficiently compute SVs and any-order SIs in an application-agnostic framework. Moreover, it includes a benchmarking suite containing 11 machine learning applications of SIs with pre-computed games and ground-truth values to systematically assess computational performance across domains. For practitioners, shapiq is able to explain and visualize any-order feature interactions in predictions of models, including vision transformers, language models, as well as XGBoost and LightGBM with TreeSHAP-IQ. With shapiq, we extend shap beyond feature attributions and consolidate the application of SVs and SIs in machine learning that facilitates future research. The source code and documentation are available at https://github.com/mmschlk/shapiq.

Related papers

Exact Computation of Any-Order Shapley Interactions for Graph Neural Networks [53.10674067060148]
Shapley Interactions (SIs) quantify node contributions and interactions among multiple nodes. By exploiting the GNN architecture, we show that the structure of interactions in node embeddings are preserved for graph prediction. We introduce GraphSHAP-IQ, an efficient approach to compute any-order SIs exactly.
arXiv Detail & Related papers (2025-01-28T13:37:44Z)
Medical artificial intelligence toolbox (MAIT): an explainable machine learning framework for binary classification, survival modelling, and regression analyses [0.0]
Medical Artificial Intelligence Toolbox (MAIT) is an explainable, open-source Python pipeline for developing and evaluating binary classification, regression, and survival models. MAIT addresses key challenges (e.g., high dimensionality, class imbalance, mixed variable types, and missingness) while promoting transparency in reporting. We provide detailed tutorials on GitHub, using four open-access data sets, to demonstrate how MAIT can be used to improve implementation and interpretation of ML models in medical research.
arXiv Detail & Related papers (2025-01-08T14:51:36Z)
NNsight and NDIF: Democratizing Access to Open-Weight Foundation Model Internals [58.83169560132308]
We introduce NNsight and NDIF, technologies that work in tandem to enable scientific study of the representations and computations learned by very large neural networks.
arXiv Detail & Related papers (2024-07-18T17:59:01Z)
Coupling Machine Learning with Ontology for Robotics Applications [0.0]
The lack of availability of prior knowledge in dynamic scenarios is without doubt a major barrier for scalable machine intelligence. My view of the interaction between the two tiers intelligence is based on the idea that when knowledge is not readily available at the knowledge base tier, more knowledge can be extracted from the other tier.
arXiv Detail & Related papers (2024-06-08T23:38:03Z)
Zero-knowledge Proof Meets Machine Learning in Verifiability: A Survey [19.70499936572449]
High-quality models rely not only on efficient optimization algorithms but also on the training and learning processes built upon vast amounts of data and computational power. Due to various challenges such as limited computational resources and data privacy concerns, users in need of models often cannot train machine learning models locally. This paper presents a comprehensive survey of zero-knowledge proof-based verifiable machine learning (ZKP-VML) technology.
arXiv Detail & Related papers (2023-10-23T12:15:23Z)
A didactic approach to quantum machine learning with a single qubit [68.8204255655161]
We focus on the case of learning with a single qubit, using data re-uploading techniques. We implement the different proposed formulations in toy and real-world datasets using the qiskit quantum computing SDK.
arXiv Detail & Related papers (2022-11-23T18:25:32Z)
Advancing Reacting Flow Simulations with Data-Driven Models [50.9598607067535]
Key to effective use of machine learning tools in multi-physics problems is to couple them to physical and computer models. The present chapter reviews some of the open opportunities for the application of data-driven reduced-order modeling of combustion systems.
arXiv Detail & Related papers (2022-09-05T16:48:34Z)
Inducing Gaussian Process Networks [80.40892394020797]
We propose inducing Gaussian process networks (IGN), a simple framework for simultaneously learning the feature space as well as the inducing points. The inducing points, in particular, are learned directly in the feature space, enabling a seamless representation of complex structured domains. We report on experimental results for real-world data sets showing that IGNs provide significant advances over state-of-the-art methods.
arXiv Detail & Related papers (2022-04-21T05:27:09Z)
DMCNet: Diversified Model Combination Network for Understanding Engagement from Video Screengrabs [0.4397520291340695]
Engagement plays a major role in developing intelligent educational interfaces. Non-deep learning models are based on the combination of popular algorithms such as Histogram of Oriented Gradient (HOG), Support Vector Machine (SVM), Scale Invariant Feature Transform (SIFT) and Speeded Up Robust Features (SURF) The deep learning methods include Densely Connected Convolutional Networks (DenseNet-121), Residual Network (ResNet-18) and MobileNetV1.
arXiv Detail & Related papers (2022-04-13T15:24:38Z)
SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor framework and script language engines. This approach however does not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production grade systems.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)
KILT: a Benchmark for Knowledge Intensive Language Tasks [102.33046195554886]
We present a benchmark for knowledge-intensive language tasks (KILT) All tasks in KILT are grounded in the same snapshot of Wikipedia. We find that a shared dense vector index coupled with a seq2seq model is a strong baseline.
arXiv Detail & Related papers (2020-09-04T15:32:19Z)
A Theory of Usable Information Under Computational Constraints [103.5901638681034]
We propose a new framework for reasoning about information in complex systems. Our foundation is based on a variational extension of Shannon's information theory. We show that by incorporating computational constraints, $mathcalV$-information can be reliably estimated from data.
arXiv Detail & Related papers (2020-02-25T06:09:30Z)
PHOTONAI -- A Python API for Rapid Machine Learning Model Development [2.414341608751139]
PHOTONAI is a high-level Python API designed to simplify and accelerate machine learning model development. It functions as a unifying framework allowing the user to easily access and combine algorithms from different toolboxes into custom algorithm sequences.
arXiv Detail & Related papers (2020-02-13T10:33:05Z)

This list is automatically generated from the titles and abstracts of the papers in this site.