Optimal Decision Diagrams for Classification
- URL: http://arxiv.org/abs/2205.14500v1
- Date: Sat, 28 May 2022 18:31:23 GMT
- Title: Optimal Decision Diagrams for Classification
- Authors: Alexandre M. Florio, Pedro Martins, Maximilian Schiffer, Thiago Serra, Thibaut Vidal
- Abstract summary: We study the training of optimal decision diagrams from a mathematical programming perspective.
We introduce a novel mixed-integer linear programming model for training.
We show how this model can be easily extended for fairness, parsimony, and stability notions.
- Score: 68.72078059880018
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Decision diagrams for classification have some notable advantages over
decision trees, as their internal connections can be determined at training
time and their width is not bound to grow exponentially with their depth.
Accordingly, decision diagrams are usually less prone to data fragmentation in
internal nodes. However, the inherent complexity of training these classifiers
acted as a long-standing barrier to their widespread adoption. In this context,
we study the training of optimal decision diagrams (ODDs) from a mathematical
programming perspective. We introduce a novel mixed-integer linear programming
model for training and demonstrate its applicability for many datasets of
practical importance. Further, we show how this model can be easily extended
for fairness, parsimony, and stability notions. We present numerical analyses
showing that our model allows training ODDs in short computational times, and
that ODDs achieve better accuracy than optimal decision trees, while allowing
for improved stability without significant accuracy losses.
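The abstract only sketches the MILP at a high level, so the snippet below is a minimal, hypothetical illustration of the general idea in Python (using PuLP) rather than the paper's actual formulation: on a toy binarized dataset it jointly chooses which feature each internal node of a small, fixed-width diagram skeleton tests, which next-layer node each branch connects to (the connections determined at training time), and which class each terminal node predicts, minimizing training misclassifications. The two-layer skeleton, the flow-based linearization, and all variable names are assumptions made for this sketch.

```python
import pulp

# Toy binarized dataset (illustrative only): 4 binary features, 2 classes.
X = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 0, 0],
]
y = [0, 0, 1, 1, 0, 1]
n, p = len(X), len(X[0])
classes = sorted(set(y))

# Assumed fixed skeleton: one root, a width-2 internal layer, two terminal nodes.
layers = [["root"], ["n1", "n2"]]               # internal layers
terminals = ["t1", "t2"]
internal = [v for layer in layers for v in layer]
succ = {0: layers[1], 1: terminals}             # successor layer of each internal layer

m = pulp.LpProblem("toy_decision_diagram", pulp.LpMinimize)

# f[v][j] = 1 if internal node v tests binary feature j.
f = pulp.LpVariable.dicts("f", (internal, range(p)), cat="Binary")
# a[v][b][w] = 1 if branch b of node v (feature value b) points to successor node w.
a = {v: pulp.LpVariable.dicts(f"a_{v}", (range(2), succ[l]), cat="Binary")
     for l, layer in enumerate(layers) for v in layer}
# c[w][k] = 1 if terminal node w predicts class k.
c = pulp.LpVariable.dicts("c", (terminals, classes), cat="Binary")
# z[i][v] = 1 if sample i reaches node v; u[(i,v,b,w)] = flow of sample i on arc (v, b, w).
z = pulp.LpVariable.dicts("z", (range(n), internal + terminals), cat="Binary")
u = {(i, v, b, w): pulp.LpVariable(f"u_{i}_{v}_{b}_{w}", cat="Binary")
     for i in range(n) for l, layer in enumerate(layers)
     for v in layer for b in range(2) for w in succ[l]}
# e[i] = 1 if sample i is misclassified.
e = pulp.LpVariable.dicts("e", range(n), cat="Binary")

# Each internal node tests exactly one feature; each branch points to exactly one successor.
for l, layer in enumerate(layers):
    for v in layer:
        m += pulp.lpSum(f[v][j] for j in range(p)) == 1
        for b in range(2):
            m += pulp.lpSum(a[v][b][w] for w in succ[l]) == 1
# Each terminal node predicts exactly one class.
for w in terminals:
    m += pulp.lpSum(c[w][k] for k in classes) == 1

for i in range(n):
    m += z[i]["root"] == 1                      # every sample enters at the root
    for l, layer in enumerate(layers):
        for v in layer:
            d_iv = pulp.lpSum(X[i][j] * f[v][j] for j in range(p))  # branch sample i takes at v
            m += pulp.lpSum(u[i, v, 1, w] for w in succ[l]) <= d_iv
            m += pulp.lpSum(u[i, v, 0, w] for w in succ[l]) <= 1 - d_iv
            # Flow leaving v equals the indicator that sample i reached v.
            m += pulp.lpSum(u[i, v, b, w] for b in range(2) for w in succ[l]) == z[i][v]
            for b in range(2):
                for w in succ[l]:
                    m += u[i, v, b, w] <= a[v][b][w]    # flow only on arcs that exist
    # A sample reaches a node iff incoming flow delivers it there.
    for l, layer in enumerate(layers[1:] + [terminals]):
        for w in layer:
            m += z[i][w] == pulp.lpSum(u[i, v, b, w] for v in layers[l] for b in range(2))
    # Misclassified if the sample reaches a terminal whose class differs from y[i].
    for w in terminals:
        m += e[i] >= z[i][w] - c[w][y[i]]

m += pulp.lpSum(e[i] for i in range(n))         # objective: minimize training errors
m.solve(pulp.PULP_CBC_CMD(msg=False))
print("misclassified training samples:", int(pulp.value(m.objective)))
```

Extensions for fairness, parsimony, or stability, as mentioned in the abstract, would enter such a model as additional constraints or objective penalties.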
Related papers
- Computation-Aware Gaussian Processes: Model Selection And Linear-Time Inference [55.150117654242706]
We show that model selection for computation-aware GPs trained on 1.8 million data points can be done within a few hours on a single GPU.
As a result of this work, Gaussian processes can be trained on large-scale datasets without significantly compromising their ability to quantify uncertainty.
arXiv Detail & Related papers (2024-11-01T21:11:48Z)
- Interpretable Modeling of Deep Reinforcement Learning Driven Scheduling [3.890533943135602]
We present a framework called IRL (Interpretable Reinforcement Learning) to address the issue of interpretability of DRL scheduling.
IRL is capable of converting a black-box DNN policy into an interpretable rule-based decision tree while maintaining comparable scheduling performance.
arXiv Detail & Related papers (2024-03-24T20:56:16Z)
- Robust Learning with Progressive Data Expansion Against Spurious Correlation [65.83104529677234]
We study the learning process of a two-layer nonlinear convolutional neural network in the presence of spurious features.
Our analysis suggests that imbalanced data groups and easily learnable spurious features can lead to the dominance of spurious features during the learning process.
We propose a new training algorithm called PDE that efficiently enhances the model's robustness for a better worst-group performance.
arXiv Detail & Related papers (2023-06-08T05:44:06Z)
- Optimal Interpretability-Performance Trade-off of Classification Trees with Black-Box Reinforcement Learning [0.0]
Interpretability of AI models allows for user safety checks to build trust in these models.
Decision trees (DTs) provide a global view of the learned model and clearly outline the role of the features that are critical to classifying given data.
To learn compact trees, a Reinforcement Learning framework has been recently proposed to explore the space of DTs.
arXiv Detail & Related papers (2023-04-11T09:43:23Z)
- Analyzing the Performance of Deep Encoder-Decoder Networks as Surrogates for a Diffusion Equation [0.0]
We study the use of encoder-decoder convolutional neural networks (CNNs) as surrogates for steady-state diffusion solvers.
Our results indicate that increasing the size of the training set has a substantial effect on reducing performance fluctuations and overall error.
arXiv Detail & Related papers (2023-02-07T22:53:19Z)
- Data-Driven Offline Decision-Making via Invariant Representation Learning [97.49309949598505]
Offline data-driven decision-making involves synthesizing optimized decisions with no active interaction.
A key challenge is distributional shift: when we optimize with respect to the input into a model trained from offline data, it is easy to produce an out-of-distribution (OOD) input that appears erroneously good.
In this paper, we formulate offline data-driven decision-making as domain adaptation, where the goal is to make accurate predictions for the value of optimized decisions.
arXiv Detail & Related papers (2022-11-21T11:01:37Z)
- Scaling Laws Beyond Backpropagation [64.0476282000118]
We study the ability of Direct Feedback Alignment (DFA) to train causal decoder-only Transformers efficiently.
We find that DFA fails to offer more efficient scaling than backpropagation.
arXiv Detail & Related papers (2022-10-26T10:09:14Z)
- Holistic Deep Learning [3.718942345103135]
This paper presents a novel holistic deep learning framework that addresses the challenges of vulnerability to input perturbations, overparametrization, and performance instability.
The proposed framework holistically improves accuracy, robustness, sparsity, and stability over standard deep learning models.
arXiv Detail & Related papers (2021-10-29T14:46:32Z)
- Distributionally Robust Semi-Supervised Learning Over Graphs [68.29280230284712]
Semi-supervised learning (SSL) over graph-structured data emerges in many network science applications.
To efficiently manage learning over graphs, variants of graph neural networks (GNNs) have been developed recently.
Despite their success in practice, most existing methods are unable to handle graphs with uncertain nodal attributes.
Challenges also arise due to distributional uncertainties associated with data acquired by noisy measurements.
A distributionally robust learning framework is developed, where the objective is to train models that exhibit quantifiable robustness against perturbations.
arXiv Detail & Related papers (2021-10-20T14:23:54Z)
- SONG: Self-Organizing Neural Graphs [10.253870280561609]
Decision trees are easy to interpret because they are based on binary decisions; they can also make decisions quickly, and they provide a hierarchy of classes.
One of the well-known drawbacks of decision trees, as compared to decision graphs, is that decision trees cannot reuse the decision nodes.
In this paper, we provide a general paradigm based on Markov processes, which allows for efficient training of a special type of decision graph, which we call Self-Organizing Neural Graphs (SONG).
arXiv Detail & Related papers (2021-07-28T07:53:53Z)