Demystifying Functional Random Forests: Novel Explainability Tools for Model Transparency in High-Dimensional Spaces
- URL: http://arxiv.org/abs/2408.12288v1
- Date: Thu, 22 Aug 2024 10:52:32 GMT
- Title: Demystifying Functional Random Forests: Novel Explainability Tools for Model Transparency in High-Dimensional Spaces
- Authors: Fabrizio Maturo, Annamaria Porreca
- Abstract summary: This paper introduces a novel suite of explainability tools to illuminate the inner mechanisms of Functional Random Forests (FRF).
These tools collectively enhance the transparency of FRF models by providing a detailed analysis of how individual FPCs contribute to model predictions.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The advent of big data has raised significant challenges in analysing high-dimensional datasets across various domains such as medicine, ecology, and economics. Functional Data Analysis (FDA) has proven to be a robust framework for addressing these challenges, enabling the transformation of high-dimensional data into functional forms that capture intricate temporal and spatial patterns. However, despite advancements in functional classification methods and the very high performance achieved by combining FDA and ensemble methods, a critical gap persists in the literature concerning the transparency and interpretability of black-box models, e.g., Functional Random Forests (FRF). In response to this need, this paper introduces a novel suite of explainability tools to illuminate the inner mechanisms of FRF. We propose using Functional Partial Dependence Plots (FPDPs), Functional Principal Component (FPC) Probability Heatmaps, various model-specific and model-agnostic FPC importance metrics, and the FPC Internal-External Importance and Explained Variance Bubble Plot. These tools collectively enhance the transparency of FRF models by providing a detailed analysis of how individual FPCs contribute to model predictions. By applying these methods to an ECG dataset, we demonstrate the effectiveness of these tools in revealing critical patterns and improving the explainability of FRF.
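The paper's FPDP implementation is not public; below is a minimal sketch of the idea, assuming curves discretized on a common grid, plain PCA as the FPCA estimator, and synthetic curves in place of the ECG data.

```python
# Minimal FPDP sketch: project curves onto functional principal components,
# fit a random forest on the FPC scores, then trace how the average
# predicted probability responds to one FPC score.  Synthetic curves
# stand in for the paper's ECG data.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 100)                                  # common grid
freqs = rng.uniform(1, 3, size=200)
X_curves = np.sin(2 * np.pi * freqs[:, None] * t) \
           + 0.1 * rng.standard_normal((200, t.size))
y = (freqs > 2).astype(int)                                 # toy labels

# FPCA on discretized curves (plain PCA is the standard grid-based estimator)
scores = PCA(n_components=5).fit_transform(X_curves)        # FPC scores
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(scores, y)

# Partial dependence of P(y=1) on the first FPC: clamp its score for every
# sample, average the predictions, and sweep the clamped value over a grid.
grid = np.linspace(scores[:, 0].min(), scores[:, 0].max(), 25)
for g in grid:
    S = scores.copy()
    S[:, 0] = g
    print(f"FPC1 score {g:+6.2f} -> mean P(class 1) = "
          f"{forest.predict_proba(S)[:, 1].mean():.3f}")
```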
Related papers
- Conditional Feature Importance with Generative Modeling Using Adversarial Random Forests [1.0208529247755187]
In explainable artificial intelligence (XAI), conditional feature importance assesses the impact of a feature on a prediction model's performance.
Recent advancements in generative modeling can facilitate measuring conditional feature importance.
This paper proposes cARFi, a method for measuring conditional feature importance through feature values sampled from ARF-estimated conditional distributions.
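ARF-based conditional sampling is not in standard libraries; the sketch below shows the general recipe (importance = performance drop when a feature is redrawn from its conditional distribution given the rest), with a kNN resampler as a hypothetical stand-in for the ARF sampler.

```python
# Sketch of conditional feature importance: resample one feature from an
# approximate conditional distribution given the remaining features, and
# measure the drop in model accuracy.  A kNN resampler stands in for the
# adversarial-random-forest sampler used by cARFi.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=500, n_features=6, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)
base_acc = model.score(X, y)

def conditional_importance(j, n_neighbors=10):
    """Accuracy drop when feature j is redrawn conditionally on the rest."""
    rest = np.delete(X, j, axis=1)
    nn = NearestNeighbors(n_neighbors=n_neighbors).fit(rest)
    _, idx = nn.kneighbors(rest)
    picks = idx[np.arange(len(X)), rng.integers(0, n_neighbors, len(X))]
    X_tilde = X.copy()
    X_tilde[:, j] = X[picks, j]        # draw feature j from a similar context
    return base_acc - model.score(X_tilde, y)

for j in range(X.shape[1]):
    print(f"feature {j}: conditional importance {conditional_importance(j):+.3f}")
```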
arXiv Detail & Related papers (2025-01-19T21:34:54Z)
- Influence Functions for Scalable Data Attribution in Diffusion Models [52.92223039302037]
Diffusion models have led to significant advancements in generative modelling.
Yet their widespread adoption poses challenges regarding data attribution and interpretability.
We develop an influence functions framework to address these challenges.
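The summary does not give the estimator; for orientation, the classical influence-function approximation such frameworks build on (the paper's scalable variant for diffusion models may differ in detail):

```latex
% Influence of training point z on the loss at test point z' (Koh & Liang form);
% H is the empirical Hessian of the training loss at the fitted parameters.
\mathcal{I}(z, z') = -\,\nabla_\theta L(z', \hat\theta)^{\top} H_{\hat\theta}^{-1}\, \nabla_\theta L(z, \hat\theta),
\qquad
H_{\hat\theta} = \frac{1}{n}\sum_{i=1}^{n} \nabla_\theta^2 L(z_i, \hat\theta)
```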
arXiv Detail & Related papers (2024-10-17T17:59:02Z)
- Random Survival Forest for Censored Functional Data [0.0]
This paper introduces a Random Survival Forest (RSF) method for functional data.
The focus is specifically on defining a new functional data structure, the Censored Functional Data (CFD).
This approach allows for precise modelling of functional survival trajectories, leading to improved interpretation and prediction of survival dynamics across different groups.
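The paper's CFD structure is not reproduced here; a minimal sketch under the assumption that censored functional data reduce to (curve, event, time) triples, with scikit-survival's RandomSurvivalForest on FPC scores standing in for the functional RSF.

```python
# Sketch: survival forest on functional data via FPC scores.  Censored
# functional data are approximated as (curve, event, time) triples;
# scikit-survival's RandomSurvivalForest stands in for the paper's method.
import numpy as np
from sklearn.decomposition import PCA
from sksurv.ensemble import RandomSurvivalForest
from sksurv.util import Surv

rng = np.random.default_rng(1)
t_grid = np.linspace(0, 1, 50)
curves = rng.standard_normal((300, 1)) * np.sin(2 * np.pi * t_grid) \
         + 0.1 * rng.standard_normal((300, t_grid.size))
times = rng.exponential(scale=1 + curves[:, :25].mean(axis=1) ** 2, size=300)
events = rng.random(300) < 0.7                     # ~30% right-censoring

scores = PCA(n_components=4).fit_transform(curves)  # FPC scores as features
y = Surv.from_arrays(event=events, time=times)

rsf = RandomSurvivalForest(n_estimators=200, random_state=1).fit(scores, y)
print("concordance on training scores:", rsf.score(scores, y))
```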
arXiv Detail & Related papers (2024-07-22T02:54:06Z)
- Prospector Heads: Generalized Feature Attribution for Large Models & Data [82.02696069543454]
We introduce prospector heads, an efficient and interpretable alternative to explanation-based attribution methods.
We demonstrate how prospector heads enable improved interpretation and discovery of class-specific patterns in input data.
arXiv Detail & Related papers (2024-02-18T23:01:28Z)
- Cumulative Distribution Function based General Temporal Point Processes [49.758080415846884]
The CuFun model represents a novel approach to TPPs that revolves around the Cumulative Distribution Function (CDF).
Our approach addresses several critical issues inherent in traditional TPP modeling.
Our contributions encompass the introduction of a pioneering CDF-based TPP model and the development of a methodology for incorporating past event information into future event prediction.
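The CuFun architecture is not given in this summary; the sketch below illustrates only the core mechanism (parameterise the inter-event-time CDF directly and obtain the density by differentiation), using an exponential-mixture CDF without history conditioning as a deliberately simplified stand-in.

```python
# Sketch of a CDF-based temporal point process: model the CDF of the next
# inter-event time as a mixture of exponentials and train by maximising the
# likelihood f(t) = dF/dt at the observed gaps.  This toy ignores history
# conditioning, which the full model handles with a learned encoder.
import numpy as np
from scipy.optimize import minimize

gaps = np.random.default_rng(2).exponential(scale=0.5, size=1000)  # toy data

def unpack(params, k=3):
    w = np.exp(params[:k]); w /= w.sum()       # mixture weights (softmax)
    r = np.exp(params[k:])                     # positive rates
    return w, r

def neg_log_lik(params):
    w, r = unpack(params)
    # density f(t) = sum_k w_k * r_k * exp(-r_k * t)  (derivative of the CDF)
    f = (w * r * np.exp(-np.outer(gaps, r))).sum(axis=1)
    return -np.log(f + 1e-12).sum()

res = minimize(neg_log_lik, x0=np.zeros(6), method="L-BFGS-B")
w, r = unpack(res.x)
print("weights:", w.round(3), "rates:", r.round(3))
# CDF of the next event time: F(t) = sum_k w_k * (1 - exp(-r_k * t))
```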
arXiv Detail & Related papers (2024-02-01T07:21:30Z)
- Directed Cyclic Graph for Causal Discovery from Multivariate Functional Data [15.26007975367927]
We introduce a functional linear structural equation model for causal structure learning.
To enhance interpretability, our model involves a low-dimensional causal embedded space.
We prove that the proposed model is causally identifiable under standard assumptions.
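For orientation, a generic functional linear structural equation model takes the following form (the paper's exact operator and embedded-space formulation may differ):

```latex
% Generic functional linear SEM: each function is a linear (integral-kernel)
% transform of the others plus functional noise.
X_j(t) = \sum_{k \neq j} \int_{\mathcal{T}} \beta_{jk}(s, t)\, X_k(s)\, \mathrm{d}s + \varepsilon_j(t),
\qquad j = 1, \dots, p
```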
arXiv Detail & Related papers (2023-10-31T15:19:24Z)
- Enhancing Interpretability and Generalizability in Extended Isolation Forests [5.139809663513828]
Extended Isolation Forest Feature Importance (ExIFFI) is a method that explains predictions made by Extended Isolation Forest (EIF) models.
EIF+ is designed to enhance the model's ability to detect unseen anomalies through a revised splitting strategy.
ExIFFI outperforms other unsupervised interpretability methods on 8 of 11 real-world datasets.
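ExIFFI's depth-based attribution is not available in scikit-learn; the sketch below conveys the goal with a hypothetical substitute: replace one feature of known outliers with inlier-like values and watch the IsolationForest anomaly score shift (plain, not Extended, isolation forest).

```python
# Sketch of feature importance for an isolation forest: make one feature of
# the outliers inlier-like and measure how much more "normal" they score.
# A large shift means that feature drives the anomaly.  scikit-learn's
# axis-aligned IsolationForest stands in for EIF/EIF+.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(3)
inliers = rng.standard_normal((500, 4))
outliers = rng.standard_normal((25, 4)) + np.array([6.0, 0.0, 0.0, 0.0])
X = np.vstack([inliers, outliers])

iforest = IsolationForest(random_state=3).fit(X)
base = iforest.score_samples(outliers).mean()      # higher = more normal

for j in range(X.shape[1]):
    Xp = outliers.copy()
    Xp[:, j] = inliers[rng.integers(0, len(inliers), len(outliers)), j]
    shift = iforest.score_samples(Xp).mean() - base
    print(f"feature {j}: normality gain after substitution {shift:+.3f}")
```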
arXiv Detail & Related papers (2023-10-09T07:24:04Z)
- BCDAG: An R package for Bayesian structure and Causal learning of Gaussian DAGs [77.34726150561087]
We introduce the R package BCDAG for causal discovery and causal effect estimation from observational data.
Our implementation scales efficiently with the number of observations and, whenever the DAGs are sufficiently sparse, with the number of variables in the dataset.
We then illustrate the main functions and algorithms on both real and simulated datasets.
arXiv Detail & Related papers (2022-01-28T09:30:32Z)
- Transforming Feature Space to Interpret Machine Learning Models [91.62936410696409]
This contribution proposes a novel approach that interprets machine-learning models through the lens of feature space transformations.
It can be used to enhance unconditional as well as conditional post-hoc diagnostic tools.
A case study on remote-sensing landcover classification with 46 features is used to demonstrate the potential of the proposed approach.
arXiv Detail & Related papers (2021-04-09T10:48:11Z)
- Explaining Neural Network Predictions for Functional Data Using Principal Component Analysis and Feature Importance [0.0]
We propose a procedure for explaining machine learning models fit using functional data.
We demonstrate the technique by explaining neural networks fit to explosion optical spectral-temporal signatures.
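The described recipe maps directly onto standard tooling; a minimal sketch, assuming curves on a common grid and synthetic data in place of the explosion signatures:

```python
# Sketch of the described procedure: represent curves by principal-component
# scores, fit a neural network on the scores, then rank components with
# permutation importance.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.inspection import permutation_importance
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(4)
t = np.linspace(0, 1, 80)
amp = rng.uniform(0.5, 2.0, size=300)
curves = amp[:, None] * np.exp(-5 * t) + 0.05 * rng.standard_normal((300, t.size))
y = (amp > 1.25).astype(int)                       # toy labels

scores = PCA(n_components=6).fit_transform(curves)  # PC scores as features
net = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000,
                    random_state=4).fit(scores, y)

imp = permutation_importance(net, scores, y, n_repeats=20, random_state=4)
for k, (m, s) in enumerate(zip(imp.importances_mean, imp.importances_std)):
    print(f"PC{k + 1}: importance {m:.3f} +/- {s:.3f}")
```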
arXiv Detail & Related papers (2020-10-15T22:33:21Z)
- Estimating Structural Target Functions using Machine Learning and Influence Functions [103.47897241856603]
We propose a new framework for statistical machine learning of target functions arising as identifiable functionals from statistical models.
This framework is problem- and model-agnostic and can be used to estimate a broad variety of target parameters of interest in applied statistics.
We put particular focus on so-called coarsening at random/doubly robust problems with partially unobserved information.
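As one concrete instance of the doubly robust target functionals the framework covers, the canonical AIPW estimator of the average treatment effect (the paper treats a much broader class):

```latex
% AIPW / doubly robust ATE estimate: the outcome regressions mu_a and the
% propensity score e are nuisance models that may be fit by machine learning.
\hat\psi = \frac{1}{n}\sum_{i=1}^{n}\Big[
  \hat\mu_1(X_i) - \hat\mu_0(X_i)
  + \frac{A_i\,(Y_i - \hat\mu_1(X_i))}{\hat e(X_i)}
  - \frac{(1 - A_i)\,(Y_i - \hat\mu_0(X_i))}{1 - \hat e(X_i)}
\Big]
```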
arXiv Detail & Related papers (2020-08-14T16:48:29Z)