Grouping Shapley Value Feature Importances of Random Forests for
explainable Yield Prediction
- URL: http://arxiv.org/abs/2304.07111v1
- Date: Fri, 14 Apr 2023 13:03:33 GMT
- Title: Grouping Shapley Value Feature Importances of Random Forests for
explainable Yield Prediction
- Authors: Florian Huber, Hannes Engler, Anna Kicherer, Katja Herzog, Reinhard
Töpfer, Volker Steinhage
- Abstract summary: We explain the concept of Shapley values directly computed for groups of features and introduce an algorithm to compute them efficiently on tree structures.
We provide a blueprint for designing swarm plots that combine many local explanations for global understanding.
- Score: 0.8543936047647136
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Explainability in yield prediction helps us fully explore the potential of
machine learning models that are already able to achieve high accuracy for a
variety of yield prediction scenarios. The data included for the prediction of
yields are intricate and the models are often difficult to understand. However,
understanding the models can be simplified by using natural groupings of the
input features. Grouping can be achieved, for example, by the time the features
are captured or by the sensor used to do so. The state-of-the-art for
interpreting machine learning models is currently defined by the game-theoretic
approach of Shapley values. To handle groups of features, the calculated
Shapley values are typically added together, ignoring the theoretical
limitations of this approach. We explain the concept of Shapley values directly
computed for predefined groups of features and introduce an algorithm to
compute them efficiently on tree structures. We provide a blueprint for
designing swarm plots that combine many local explanations for global
understanding. Extensive evaluation of two different yield prediction problems
shows the worth of our approach and demonstrates how we can enable a better
understanding of yield prediction models in the future, ultimately leading to
mutual enrichment of research and application.
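The distinction the abstract draws (adding per-feature Shapley values together versus computing Shapley values with the predefined groups themselves as players) can be made concrete with a small brute-force sketch. Everything below is a hypothetical illustration, not the paper's algorithm: the toy model, the group names `sensor_a`/`sensor_b`, and the single-baseline value function stand in for the trained random forest and the efficient tree-structure computation the paper proposes. With an interaction that crosses a group boundary, the two attributions disagree even though both satisfy efficiency:

```python
from itertools import combinations
from math import factorial

# Hypothetical stand-in for a trained model: includes a three-way
# interaction (x1 * x2 * x3) that crosses the two groups defined below.
def model(x):
    return 2.0 * x[0] + 1.0 * x[1] + 0.5 * x[1] * x[2] * x[3]

baseline = [0.0, 0.0, 0.0, 0.0]   # reference point, read as "feature absent"
instance = [1.0, 2.0, 3.0, 4.0]   # the local prediction to explain

def exact_shapley(partition):
    """Exact Shapley values for a game whose players are the blocks of
    `partition` (a mapping of player name -> list of feature indices)."""
    names = list(partition)
    n = len(names)

    def value(coalition):
        # Features of "present" players take the instance value, the
        # rest keep the baseline value (a crude single-baseline game).
        x = list(baseline)
        for name in coalition:
            for i in partition[name]:
                x[i] = instance[i]
        return model(x)

    phi = {}
    for player in names:
        others = [p for p in names if p != player]
        total = 0.0
        for k in range(n):
            for coal in combinations(others, k):
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += w * (value(coal + (player,)) - value(coal))
        phi[player] = total
    return phi

# Shapley values computed directly for the groups as players ...
group_phi = exact_shapley({"sensor_a": [0, 1], "sensor_b": [2, 3]})
# ... versus per-feature Shapley values summed per group afterwards.
feat_phi = exact_shapley({f"x{i}": [i] for i in range(4)})
summed = {"sensor_a": feat_phi["x0"] + feat_phi["x1"],
          "sensor_b": feat_phi["x2"] + feat_phi["x3"]}
```

For this toy game the group-level attribution gives `sensor_a` 10 and `sensor_b` 6, while summing per-feature values gives 8 and 8; both decompositions sum to the full prediction difference of 16, which is why the discrepancy is easy to overlook.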
Related papers
- Variational Shapley Network: A Probabilistic Approach to Self-Explaining
Shapley values with Uncertainty Quantification [2.6699011287124366]
Shapley values have emerged as a foundational tool in machine learning (ML) for elucidating model decision-making processes.
We introduce a novel, self-explaining method that simplifies the computation of Shapley values significantly, requiring only a single forward pass.
arXiv Detail & Related papers (2024-02-06T18:09:05Z)
- Evaluating and Explaining Large Language Models for Code Using Syntactic Structures [74.93762031957883]
This paper introduces ASTxplainer, an explainability method specific to Large Language Models for code.
At its core, ASTxplainer provides an automated method for aligning token predictions with AST nodes.
We perform an empirical evaluation on 12 popular LLMs for code using a curated dataset of the most popular GitHub projects.
arXiv Detail & Related papers (2023-08-07T18:50:57Z)
- TsSHAP: Robust model agnostic feature-based explainability for time series forecasting [6.004928390125367]
We propose a feature-based explainability algorithm, TsSHAP, that can explain the forecast of any black-box forecasting model.
We formalize the notion of local, semi-local, and global explanations in the context of time series forecasting.
arXiv Detail & Related papers (2023-03-22T05:14:36Z)
- Distributional Gradient Boosting Machines [77.34726150561087]
Our framework is based on XGBoost and LightGBM.
We show that our framework achieves state-of-the-art forecast accuracy.
arXiv Detail & Related papers (2022-04-02T06:32:19Z)
- Exact Shapley Values for Local and Model-True Explanations of Decision Tree Ensembles [0.0]
We consider the application of Shapley values for explaining decision tree ensembles.
We present a novel approach to Shapley value-based feature attribution that can be applied to random forests and boosted decision trees.
arXiv Detail & Related papers (2021-12-16T20:16:02Z)
- Instance-Based Neural Dependency Parsing [56.63500180843504]
We develop neural models that possess an interpretable inference process for dependency parsing.
Our models adopt instance-based inference, where dependency edges are extracted and labeled by comparing them to edges in a training set.
arXiv Detail & Related papers (2021-09-28T05:30:52Z)
- Complex Event Forecasting with Prediction Suffix Trees: Extended Technical Report [70.7321040534471]
Complex Event Recognition (CER) systems have become popular in the past two decades due to their ability to "instantly" detect patterns on real-time streams of events.
There is a lack of methods for forecasting when a pattern might occur before such an occurrence is actually detected by a CER engine.
We present a formal framework that attempts to address the issue of Complex Event Forecasting.
arXiv Detail & Related papers (2021-09-01T09:52:31Z)
- An Interpretable Probabilistic Model for Short-Term Solar Power Forecasting Using Natural Gradient Boosting [0.0]
We propose a two-stage probabilistic forecasting framework able to generate highly accurate, reliable, and sharp forecasts.
The framework offers full transparency on both the point forecasts and the prediction intervals (PIs).
To highlight the performance and the applicability of the proposed framework, real data from two PV parks located in Southern Germany are employed.
arXiv Detail & Related papers (2021-08-05T12:59:38Z)
- Explaining a Series of Models by Propagating Local Feature Attributions [9.66840768820136]
Pipelines involving several machine learning models improve performance in many domains but are difficult to understand.
We introduce a framework to propagate local feature attributions through complex pipelines of models based on a connection to the Shapley value.
Our framework enables us to draw higher-level conclusions based on groups of gene expression features for Alzheimer's and breast cancer histologic grade prediction.
arXiv Detail & Related papers (2021-04-30T22:20:58Z)
- Goal-directed Generation of Discrete Structures with Conditional Generative Models [85.51463588099556]
We introduce a novel approach to directly optimize a reinforcement learning objective, maximizing an expected reward.
We test our methodology on two tasks: generating molecules with user-defined properties and identifying short Python expressions that evaluate to a given target value.
arXiv Detail & Related papers (2020-10-05T20:03:13Z)
- Deducing neighborhoods of classes from a fitted model [68.8204255655161]
This article presents a new kind of interpretable machine learning method.
It can help to understand the partitioning of the feature space into predicted classes in a classification model using quantile shifts.
In essence, real data points (or specific points of interest) are used, and the changes in the prediction after slightly raising or lowering specific features are observed.
arXiv Detail & Related papers (2020-09-11T16:35:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.