Related papers: Efficient & Correct Predictive Equivalence for Decision Trees

Efficient & Correct Predictive Equivalence for Decision Trees

URL: http://arxiv.org/abs/2509.17774v4
Date: Thu, 16 Oct 2025 16:22:56 GMT
Title: Efficient & Correct Predictive Equivalence for Decision Trees
Authors: Joao Marques-Silva, Alexey Ignatiev,
Abstract summary: The Rashomon set of decision trees (DTs) finds importance uses.<n>DTs computing the same classification function, i.e. predictive equivalent DTs, can represent a significant fraction of the Rashomon set.<n>This paper shows that there exist decision trees that trigger the worst-case exponential running time and space of the Quine-McCluskey (QM) method.
Score: 7.615032824973038
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The Rashomon set of decision trees (DTs) finds importance uses. Recent work showed that DTs computing the same classification function, i.e. predictive equivalent DTs, can represent a significant fraction of the Rashomon set. Such redundancy is undesirable. For example, feature importance based on the Rashomon set becomes inaccurate due the existence of predictive equivalent DTs, i.e. DTs with the same prediction for every possible input. In recent work, McTavish et al. proposed solutions for several computational problems related with DTs, including that of deciding predictive equivalent DTs. The approach of McTavish et al. consists of applying the well-known method of Quine-McCluskey (QM) for obtaining minimum-size DNF (disjunctive normal form) representations of DTs, which are then used for comparing DTs for predictive equivalence. Furthermore, the minimum-size DNF representation was also applied to computing explanations for the predictions made by DTs, and to finding predictions in the presence of missing data. However, the problem of formula minimization is hard for the second level of the polynomial hierarchy, and the QM method may exhibit worst-case exponential running time and space. This paper first demonstrates that there exist decision trees that trigger the worst-case exponential running time and space of the QM method. Second, the paper shows that the QM method may incorrectly decide predictive equivalence, if two key constraints are not respected, and one may be difficult to formally guarantee. Third, the paper shows that any of the problems to which the smallest DNF representation has been applied to can be solved in polynomial time, in the size of the DT. The experiments confirm that, for DTs for which the worst-case of the QM method is triggered, the algorithms proposed in this paper are orders of magnitude faster than the ones proposed by McTavish et al.

Related papers

Latent Schrodinger Bridge: Prompting Latent Diffusion for Fast Unpaired Image-to-Image Translation [58.19676004192321]
Diffusion models (DMs), which enable both image generation from noise and inversion from data, have inspired powerful unpaired image-to-image (I2I) translation algorithms. We tackle this problem with Schrodinger Bridges (SBs), which are differential equations (SDEs) between distributions with minimal transport cost. Inspired by this observation, we propose Latent Schrodinger Bridges (LSBs) that approximate the SB ODE via pre-trained Stable Diffusion. We demonstrate that our algorithm successfully conduct competitive I2I translation in unsupervised setting with only a fraction of cost required by previous DM-
arXiv Detail & Related papers (2024-11-22T11:24:14Z)
Unveiling the Statistical Foundations of Chain-of-Thought Prompting Methods [59.779795063072655]
Chain-of-Thought (CoT) prompting and its variants have gained popularity as effective methods for solving multi-step reasoning problems. We analyze CoT prompting from a statistical estimation perspective, providing a comprehensive characterization of its sample complexity.
arXiv Detail & Related papers (2024-08-25T04:07:18Z)
Multi-Source and Test-Time Domain Adaptation on Multivariate Signals using Spatio-Temporal Monge Alignment [59.75420353684495]
Machine learning applications on signals such as computer vision or biomedical data often face challenges due to the variability that exists across hardware devices or session recordings. In this work, we propose Spatio-Temporal Monge Alignment (STMA) to mitigate these variabilities. We show that STMA leads to significant and consistent performance gains between datasets acquired with very different settings.
arXiv Detail & Related papers (2024-07-19T13:33:38Z)
An Eager Satisfiability Modulo Theories Solver for Algebraic Datatypes [4.393684402895834]
Algebraic data types (ADTs) are a construct classically found in functional programming languages. We present an SMT solver that takes a fundamentally different approach, an empheager approach.
arXiv Detail & Related papers (2023-10-18T18:14:18Z)
Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval [139.21955930418815]
Cross-modal Retrieval methods build similarity relations between vision and language modalities by jointly learning a common representation space. However, the predictions are often unreliable due to the Aleatoric uncertainty, which is induced by low-quality data, e.g., corrupt images, fast-paced videos, and non-detailed texts. We propose a novel Prototype-based Aleatoric Uncertainty Quantification (PAU) framework to provide trustworthy predictions by quantifying the uncertainty arisen from the inherent data ambiguity.
arXiv Detail & Related papers (2023-09-29T09:41:19Z)
AdjointDPM: Adjoint Sensitivity Method for Gradient Backpropagation of Diffusion Probabilistic Models [103.41269503488546]
Existing customization methods require access to multiple reference examples to align pre-trained diffusion probabilistic models with user-provided concepts. This paper aims to address the challenge of DPM customization when the only available supervision is a differentiable metric defined on the generated contents. We propose a novel method AdjointDPM, which first generates new samples from diffusion models by solving the corresponding probability-flow ODEs. It then uses the adjoint sensitivity method to backpropagate the gradients of the loss to the models' parameters.
arXiv Detail & Related papers (2023-07-20T09:06:21Z)
DisDiff: Unsupervised Disentanglement of Diffusion Probabilistic Models [42.58375679841317]
We propose a new task, disentanglement of Diffusion Probabilistic Models (DPMs) The task is to automatically discover the inherent factors behind the observations and disentangle the gradient fields of DPM into sub-gradient fields. We devise an unsupervised approach named DisDiff, achieving disentangled representation learning in the framework of DPMs.
arXiv Detail & Related papers (2023-01-31T15:58:32Z)
Provably Precise, Succinct and Efficient Explanations for Decision Trees [32.062312674333775]
Decision trees (DTs) embody interpretable classifiers. Work has demonstrated that predictions in DTs ought to be explained with rigorous explanations. delta-relevant sets denote that are succinct and provably precise.
arXiv Detail & Related papers (2022-05-19T13:54:52Z)
A Novel Splitting Criterion Inspired by Geometric Mean Metric Learning for Decision Tree [27.253204680627952]
Decision tree (DT) attracts persistent research attention due to its impressive empirical performance and interpretability in numerous applications. In this paper, we newly design a splitting criterion to speed up the growth. The criterion is induced from Geometric Mean Metric Learning (GMML) and then optimized under its diagonalized metric matrix constraint.
arXiv Detail & Related papers (2022-04-23T07:33:57Z)
Nonparametric estimation of continuous DPPs with kernel methods [0.0]
Parametric and nonparametric inference methods have been proposed in the finite case, i.e. when the point patterns live in a finite ground set. We show that a restricted version of this maximum likelihood (MLE) problem falls within the scope of a recent representer theorem for nonnegative functions in an RKHS. We propose, analyze, and demonstrate a fixed point algorithm to solve this finite-dimensional problem.
arXiv Detail & Related papers (2021-06-27T11:57:14Z)
LSDAT: Low-Rank and Sparse Decomposition for Decision-based Adversarial Attack [74.5144793386864]
LSDAT crafts perturbations in the low-dimensional subspace formed by the sparse component of the input sample and that of an adversarial sample. LSD works directly in the image pixel domain to guarantee that non-$ell$ constraints, such as sparsity, are satisfied.
arXiv Detail & Related papers (2021-03-19T13:10:47Z)
Risk Aware Optimization of Water Sensor Placement [0.0]
We propose a data structure (sort oftemporal heatmap) collecting simulation for every sensor. We identify indicators for detecting problem-specific converge issues. Results on a benchmark and a real-world network are presented.
arXiv Detail & Related papers (2021-03-08T16:12:02Z)
A Unified Joint Maximum Mean Discrepancy for Domain Adaptation [73.44809425486767]
This paper theoretically derives a unified form of JMMD that is easy to optimize. From the revealed unified JMMD, we illustrate that JMMD degrades the feature-label dependence that benefits to classification. We propose a novel MMD matrix to promote the dependence, and devise a novel label kernel that is robust to label distribution shift.
arXiv Detail & Related papers (2021-01-25T09:46:14Z)
On Explaining Decision Trees [17.646704122091087]
Decision trees (DTs) epitomize what have become to be known as interpretable machine learning (ML) models. This paper shows that in some settings DTs can hardly be deemed interpretable, with paths in a DT being arbitrarily larger than a PI-explanation.
arXiv Detail & Related papers (2020-10-21T14:33:53Z)

This list is automatically generated from the titles and abstracts of the papers in this site.