Exact Functional ANOVA Decomposition for Categorical Inputs Models
- URL: http://arxiv.org/abs/2603.02673v1
- Date: Tue, 03 Mar 2026 06:59:56 GMT
- Title: Exact Functional ANOVA Decomposition for Categorical Inputs Models
- Authors: Baptiste Ferrere, Nicolas Bousquet, Fabrice Gamboa, Jean-Michel Loubes, Joseph Muré
- Abstract summary: ANOVA offers a principled framework for interpretability by decomposing a model's prediction into main effects and higher-order interactions. For independent features, this decomposition is well-defined, strongly linked with SHAP values, and serves as a cornerstone of additive explainability. We completely resolve this limitation for categorical inputs.
- Score: 2.762021507766656
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Functional ANOVA offers a principled framework for interpretability by decomposing a model's prediction into main effects and higher-order interactions. For independent features, this decomposition is well-defined, strongly linked with SHAP values, and serves as a cornerstone of additive explainability. However, the lack of an explicit closed-form expression for general dependent distributions has forced practitioners to rely on costly sampling-based approximations. We completely resolve this limitation for categorical inputs. By bridging functional analysis with the extension of discrete Fourier analysis, we derive a closed-form decomposition without any assumption. Our formulation is computationally very efficient. It seamlessly recovers the classical independent case and extends to arbitrary dependence structures, including distributions with non-rectangular support. Furthermore, leveraging the intrinsic link between SHAP and ANOVA under independence, our framework yields a natural generalization of SHAP values for the general categorical setting.
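As a concrete illustration of the decomposition the abstract describes, the sketch below computes the functional ANOVA components of a toy model with two independent binary inputs. The model `f` and the marginal probabilities are hypothetical, chosen only for the example; in the independent case shown here, the main effects are centered conditional expectations and the interaction is the residual, which is the classical closed form the paper's framework recovers as a special case.

```python
import itertools

# Hypothetical marginals: P(X_i = 0), P(X_i = 1) for two independent binary inputs.
p = {0: (0.7, 0.3), 1: (0.4, 0.6)}

def f(x1, x2):
    # Toy black-box model, for illustration only.
    return 1.0 + 2.0 * x1 - 3.0 * x2 + 4.0 * x1 * x2

def prob(x):
    # Joint probability under independence.
    return p[0][x[0]] * p[1][x[1]]

points = list(itertools.product((0, 1), repeat=2))

# Grand mean E[f(X)].
f0 = sum(f(*x) * prob(x) for x in points)

def main(i, v):
    # Main effect f_i(v) = E[f(X) | X_i = v] - f0.
    return sum(f(*x) * prob(x) for x in points if x[i] == v) / p[i][v] - f0

def inter(x):
    # Second-order interaction as the residual of the lower-order terms.
    return f(*x) - f0 - main(0, x[0]) - main(1, x[1])

# The components reconstruct f exactly at every point of the support,
# and each main effect is centered under its marginal.
for x in points:
    assert abs(f0 + main(0, x[0]) + main(1, x[1]) + inter(x) - f(*x)) < 1e-12
assert abs(sum(main(0, v) * p[0][v] for v in (0, 1))) < 1e-12
```

Under dependence the conditional-expectation formula above no longer yields orthogonal components, which is exactly the gap the paper's closed-form construction addresses.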
Related papers
- Partition Function Estimation under Bounded f-Divergence [18.170877027040362]
We study the statistical complexity of estimating partition functions given sample access to a proposal distribution. Our results unify and generalize prior analyses of importance sampling, rejection sampling, and heavy-tailed mean estimation.
arXiv Detail & Related papers (2026-02-26T22:34:36Z)
- The Procrustean Bed of Time Series: The Optimization Bias of Point-wise Loss [53.542743390809356]
This paper aims to provide a first-principles analysis of the Expectation of Optimization Bias (EOB). Our analysis reveals a fundamental paradox: the more deterministic and structured the time series, the more severe the bias induced by point-wise loss functions. We present a concrete solution that simultaneously achieves both principles via the DFT or DWT.
arXiv Detail & Related papers (2025-12-21T06:08:22Z)
- Multivariate Bernoulli Hoeffding Decomposition: From Theory to Sensitivity Analysis [2.762021507766656]
This work focuses on the case of Bernoulli inputs and provides a complete analytical characterization of the decomposition. We show that, in this discrete setting, the associated subspaces are one-dimensional and that the decomposition admits a closed-form representation. The paper concludes with perspectives on extending the methodology to high-dimensional settings and to models involving inputs with finite, non-binary support.
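For independent Bernoulli inputs, the one-dimensional subspaces mentioned in the summary are spanned by products of centered coordinates, so the closed-form decomposition reduces to projections onto that basis. The sketch below illustrates this on a hypothetical two-input toy model (the function `f` and parameters `p` are illustrative, not taken from the paper):

```python
import itertools

# Illustrative Bernoulli success probabilities (hypothetical).
p = (0.3, 0.6)

def f(x):
    # Toy model on {0,1}^2, for illustration only.
    return 1.0 + 5.0 * x[0] - 2.0 * x[1] + 7.0 * x[0] * x[1]

def prob(x):
    # Joint probability under independence.
    return ((p[0] if x[0] else 1 - p[0]) * (p[1] if x[1] else 1 - p[1]))

pts = list(itertools.product((0, 1), repeat=2))

def phi(S, x):
    # Basis function for subset S: product of centered coordinates.
    out = 1.0
    for i in S:
        out *= x[i] - p[i]
    return out

# Each subspace is one-dimensional, so each component is a single projection.
subsets = [(), (0,), (1,), (0, 1)]
coef = {}
for S in subsets:
    num = sum(f(x) * phi(S, x) * prob(x) for x in pts)
    den = sum(phi(S, x) ** 2 * prob(x) for x in pts)
    coef[S] = num / den

# The closed-form decomposition reconstructs f exactly at every point.
for x in pts:
    assert abs(sum(coef[S] * phi(S, x) for S in subsets) - f(x)) < 1e-12
```

The coefficient `coef[()]` is the grand mean and the others are the (rescaled) main and interaction effects; orthogonality of the `phi` basis under independence is what makes each projection a one-dimensional computation.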
arXiv Detail & Related papers (2025-10-08T14:46:20Z)
- Loss-Complexity Landscape and Model Structure Functions [53.92822954974537]
We develop a framework for dualizing the Kolmogorov structure function $h_x(\alpha)$. We establish a mathematical analogy between information-theoretic constructs and statistical mechanics. We explicitly prove the Legendre-Fenchel duality between the structure function and free energy.
arXiv Detail & Related papers (2025-07-17T21:31:45Z)
- Effect Decomposition of Functional-Output Computer Experiments via Orthogonal Additive Gaussian Processes [8.723426955657347]
Functional ANOVA (FANOVA) is a widely used variance-based sensitivity analysis tool. This study proposes a functional-output orthogonal additive Gaussian process (FOAGP) to efficiently perform the data-driven orthogonal effect decomposition. The FOAGP framework also provides analytical formulations for local Sobol' indices and expected conditional variance sensitivity indices.
arXiv Detail & Related papers (2025-06-15T03:24:55Z)
- Hoeffding decomposition of black-box models with dependent inputs [30.076357972854723]
We generalize Hoeffding's decomposition for dependent inputs under mild conditions.
We show that any square-integrable, real-valued function of random elements satisfying two assumptions can be uniquely additively decomposed, and we offer a characterization of this decomposition.
arXiv Detail & Related papers (2023-10-10T12:28:53Z)
- Break The Spell Of Total Correlation In betaTCVAE [4.38301148531795]
This paper proposes a new iterative decomposition path of total correlation and explains the disentangled representation ability of VAE.
The novel model enables VAE to adjust the parameter capacity to divide dependent and independent data features flexibly.
arXiv Detail & Related papers (2022-10-17T07:16:53Z)
- Data-Driven Influence Functions for Optimization-Based Causal Inference [105.5385525290466]
We study a constructive algorithm that approximates Gateaux derivatives for statistical functionals by finite differencing.
We study the case where probability distributions are not known a priori but need to be estimated from data.
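The finite-differencing idea in this summary can be made concrete on the simplest statistical functional, the mean: the Gateaux derivative in the direction of a point mass is the influence function. The distribution, support, and step size below are hypothetical, chosen only to illustrate the construction.

```python
def T(support, probs):
    # The statistical functional: the mean of a discrete distribution P.
    return sum(s * q for s, q in zip(support, probs))

def gateaux_fd(support, probs, x, eps=1e-6):
    # Finite-difference Gateaux derivative of T at P toward a point mass at x:
    # perturb P to P_eps = (1 - eps) * P + eps * delta_x, then difference.
    pert = [(1 - eps) * q + (eps if s == x else 0.0)
            for s, q in zip(support, probs)]
    return (T(support, pert) - T(support, probs)) / eps

# Hypothetical discrete distribution on three points.
support = [0.0, 1.0, 2.0]
probs = [0.2, 0.5, 0.3]
mean = T(support, probs)

# For the mean, the influence function is x - mean, and finite
# differencing recovers it (exactly here, since T is linear in P).
approx = gateaux_fd(support, probs, 2.0)
assert abs(approx - (2.0 - mean)) < 1e-6
```

For nonlinear functionals the same recipe applies, with the step size `eps` trading off discretization bias against numerical cancellation; choosing it is part of what the cited analysis studies.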
arXiv Detail & Related papers (2022-08-29T16:16:22Z)
- GroupifyVAE: from Group-based Definition to VAE-based Unsupervised Representation Disentanglement [91.9003001845855]
VAE-based unsupervised disentanglement cannot be achieved without introducing other inductive bias.
We address VAE-based unsupervised disentanglement by leveraging the constraints derived from the Group Theory based definition as the non-probabilistic inductive bias.
We train 1800 models covering the most prominent VAE-based models on five datasets to verify the effectiveness of our method.
arXiv Detail & Related papers (2021-02-20T09:49:51Z)
- Accounting for Unobserved Confounding in Domain Generalization [107.0464488046289]
This paper investigates the problem of learning robust, generalizable prediction models from a combination of datasets.
Part of the challenge of learning robust models lies in the influence of unobserved confounders.
We demonstrate the empirical performance of our approach on healthcare data from different modalities.
arXiv Detail & Related papers (2020-07-21T08:18:06Z)
- Distributional Robustness and Regularization in Reinforcement Learning [62.23012916708608]
We introduce a new regularizer for empirical value functions and show that it lower bounds the Wasserstein distributionally robust value function.
It suggests using regularization as a practical tool for dealing with external uncertainty in reinforcement learning.
arXiv Detail & Related papers (2020-03-05T19:56:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.