Model family selection for classification using Neural Decision Trees
- URL: http://arxiv.org/abs/2006.11458v1
- Date: Sat, 20 Jun 2020 01:27:01 GMT
- Title: Model family selection for classification using Neural Decision Trees
- Authors: Anthea Mérida Montes de Oca, Argyris Kalogeratos, Mathilde Mougeot
- Abstract summary: In this paper we propose a method to reduce the scope of exploration needed for the task.
The idea is to quantify how much it would be necessary to depart from trained instances of a given family, reference models (RMs) carrying 'rigid' decision boundaries.
- Score: 4.286327408435937
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Model selection consists of comparing several candidate models according to a metric to be optimized. The process often involves a grid search, or similar, together with cross-validation, which can be time-consuming and provides little insight into the dataset itself. In this paper we propose a method to reduce the scope of exploration needed for the task. The idea is to quantify how much it would be necessary to depart from trained instances of a given family, reference models (RMs) carrying 'rigid' decision boundaries (e.g. decision trees, DTs), in order to obtain an equivalent or better model. In our approach, this is realized by progressively relaxing the decision boundaries of the initial decision trees (the RMs) as long as this is beneficial in terms of performance measured on the analyzed dataset. More specifically, this relaxation is performed using a neural decision tree, which is a neural network built from DTs. The final model produced by our method carries non-linear decision boundaries. Measuring the performance of the final model, and its agreement with its seeding RM, can help the user figure out which family of models to focus on.
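To make the construction concrete, here is a minimal, hypothetical sketch of the general DT-to-network mapping (one hidden unit per split, one per leaf, output weights from leaf class proportions) with a temperature-controlled tanh relaxation. It is not the authors' implementation: the class name, the `temperature` knob, and the fine-tuning suggested in the comments are assumptions for illustration.

```python
# A minimal sketch, NOT the authors' code: encode a fitted sklearn decision tree
# as a small neural network whose boundaries can then be relaxed by gradient
# descent. The class name and the `temperature` knob are illustrative choices.
import numpy as np
import torch
import torch.nn as nn


class NeuralDecisionTree(nn.Module):
    """A fitted sklearn DecisionTreeClassifier re-expressed as a two-hidden-layer net."""

    def __init__(self, tree, n_features, temperature=0.1):
        super().__init__()
        t = tree.tree_
        inner = [i for i in range(t.node_count) if t.children_left[i] != -1]
        leaves = [i for i in range(t.node_count) if t.children_left[i] == -1]
        self.temperature = temperature

        # Hidden layer 1: one unit per split node, computing x[feature] - threshold.
        w1 = np.zeros((len(inner), n_features))
        b1 = np.zeros(len(inner))
        for row, node in enumerate(inner):
            w1[row, t.feature[node]] = 1.0
            b1[row] = -t.threshold[node]

        # Hidden layer 2: one unit per leaf; it saturates to 1 only when every
        # split on the root-to-leaf path agrees (-1 = went left, +1 = went right).
        w2 = np.zeros((len(leaves), len(inner)))
        b2 = np.zeros(len(leaves))
        paths = self._paths(t)
        for row, leaf in enumerate(leaves):
            for node, sign in paths[leaf]:
                w2[row, inner.index(node)] = sign
            b2[row] = -(len(paths[leaf]) - 1)

        # Output layer: each leaf votes with its training-class proportions.
        w3 = np.stack([t.value[l].ravel() / t.value[l].sum() for l in leaves], axis=1)

        self.h1 = nn.Linear(n_features, len(inner))
        self.h2 = nn.Linear(len(inner), len(leaves))
        self.out = nn.Linear(len(leaves), t.value.shape[-1], bias=False)
        for layer, w, b in ((self.h1, w1, b1), (self.h2, w2, b2)):
            layer.weight.data = torch.tensor(w, dtype=torch.float32)
            layer.bias.data = torch.tensor(b, dtype=torch.float32)
        self.out.weight.data = torch.tensor(w3, dtype=torch.float32)

    @staticmethod
    def _paths(t):
        """Map each leaf to the (split node, required side) pairs on its path."""
        out = {}

        def walk(node, path):
            if t.children_left[node] == -1:
                out[node] = path
            else:
                walk(t.children_left[node], path + [(node, -1.0)])   # x[f] <= thr
                walk(t.children_right[node], path + [(node, +1.0)])  # x[f] >  thr

        walk(0, [])
        return out

    def forward(self, x):
        # As temperature -> 0 both tanh's become step functions and the network
        # reproduces the seeding tree exactly; a larger temperature, or simply
        # fine-tuning the weights, smooths (relaxes) the decision boundaries.
        s1 = torch.tanh(self.h1(x) / self.temperature)
        s2 = (torch.tanh(self.h2(s1) / self.temperature) + 1) / 2  # ~leaf indicators
        return self.out(s2)


def agreement(pred_a, pred_b):
    """Fraction of points on which two models predict the same class."""
    return float(np.mean(np.asarray(pred_a) == np.asarray(pred_b)))
```

From this initialization one would fine-tune the weights by gradient descent for as long as validation performance improves, tracking both accuracy and the agreement with the seeding RM.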
Related papers
- Computation-Aware Gaussian Processes: Model Selection And Linear-Time Inference [55.150117654242706]
We show that model selection for computation-aware GPs trained on 1.8 million data points can be done within a few hours on a single GPU.
As a result of this work, Gaussian processes can be trained on large-scale datasets without significantly compromising their ability to quantify uncertainty.
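As background for this claim, classical GP model selection compares candidate kernels by their (log) marginal likelihood; the paper's contribution is making that tractable at the 1.8-million-point scale. Below is a small-scale sketch of the baseline criterion using scikit-learn's exact (cubic-cost) GP, which is deliberately not the paper's computation-aware method; the kernels and data are illustrative.

```python
# Illustrative only: classical GP model selection by log marginal likelihood
# with scikit-learn. The paper makes this kind of selection tractable for
# millions of points; sklearn's exact GP shown here is small-scale.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, Matern, WhiteKernel

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(2 * X).ravel() + 0.1 * rng.standard_normal(200)

candidates = {
    "RBF": RBF() + WhiteKernel(),
    "Matern-3/2": Matern(nu=1.5) + WhiteKernel(),
}
for name, kernel in candidates.items():
    gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)
    # Hyperparameters are tuned internally; models are compared by the evidence.
    print(f"{name}: log marginal likelihood = {gp.log_marginal_likelihood_value_:.1f}")
```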
arXiv Detail & Related papers (2024-11-01T21:11:48Z)
- Revisiting SMoE Language Models by Evaluating Inefficiencies with Task Specific Expert Pruning [78.72226641279863]
Sparse Mixture of Experts (SMoE) models have emerged as a scalable alternative to dense models in language modeling.
Our research explores task-specific model pruning to inform decisions about designing SMoE architectures.
We introduce UNCURL, an adaptive task-aware pruning technique that reduces the number of experts per MoE layer offline, after training.
arXiv Detail & Related papers (2024-09-02T22:35:03Z) - Learning accurate and interpretable decision trees [27.203303726977616]
We develop approaches to design decision tree learning algorithms given repeated access to data from the same domain.
We study the sample complexity of tuning prior parameters in Bayesian decision tree learning, and extend our results to decision tree regression.
We also study the interpretability of the learned decision trees and introduce a data-driven approach for optimizing the explainability versus accuracy trade-off using decision trees.
arXiv Detail & Related papers (2024-05-24T20:10:10Z) - Modeling Boundedly Rational Agents with Latent Inference Budgets [56.24971011281947]
We introduce a latent inference budget model (L-IBM) that models agents' computational constraints explicitly.
L-IBMs make it possible to learn agent models using data from diverse populations of suboptimal actors.
We show that L-IBMs match or outperform Boltzmann models of decision-making under uncertainty.
arXiv Detail & Related papers (2023-12-07T03:55:51Z) - Reinforcement Learning for Node Selection in Branch-and-Bound [52.2648997215667]
Current state-of-the-art selectors utilize either hand-crafted ensembles that automatically switch between naive sub-node selectors, or learned node selectors that rely on individual node data.
We propose a novel simulation technique that uses reinforcement learning (RL) while considering the entire tree state, rather than just isolated nodes.
arXiv Detail & Related papers (2023-09-29T19:55:56Z) - Improving Group Lasso for high-dimensional categorical data [0.90238471756546]
Group Lasso is a well known efficient algorithm for selection continuous or categorical variables.
We propose a two-step procedure to obtain a sparse solution of the Group Lasso.
We show that our method performs better than state-of-the-art algorithms in terms of prediction accuracy or model dimension.
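For readers unfamiliar with the base method (the paper's two-step refinement is not reproduced here), a minimal sketch of Group Lasso via proximal gradient descent, assuming a least-squares loss and user-supplied groups; the function name and parameters are illustrative:

```python
# Minimal Group Lasso by proximal gradient descent (illustrative; the paper's
# two-step procedure builds on top of a solution like this one).
import numpy as np

def group_lasso(X, y, groups, lam=0.1, n_iter=500):
    """Minimize 0.5*||y - X b||^2 + lam * sum_g sqrt(|g|) * ||b_g||_2.

    `groups` is a list of index arrays; whole groups are zeroed out together,
    which is what makes the penalty suitable for dummy-coded categorical data.
    """
    n, p = X.shape
    b = np.zeros(p)
    step = 1.0 / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        z = b - step * (X.T @ (X @ b - y))  # gradient step on the smooth part
        for g in groups:
            thr = step * lam * np.sqrt(len(g))
            norm_g = np.linalg.norm(z[g])
            # Block soft-thresholding: shrink the group, or kill it entirely.
            z[g] = 0.0 if norm_g <= thr else (1 - thr / norm_g) * z[g]
        b = z
    return b

# Example: three dummy-coded categorical predictors, four levels each.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 12))
y = X[:, :4] @ np.array([1.0, -1.0, 0.5, 0.0]) + 0.1 * rng.standard_normal(100)
b = group_lasso(X, y, groups=[np.arange(i, i + 4) for i in (0, 4, 8)], lam=2.0)
```

The selected model consists of the groups with nonzero coefficient blocks, so model dimension can be read off as the number of active groups.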
arXiv Detail & Related papers (2022-10-25T13:43:57Z) - To tree or not to tree? Assessing the impact of smoothing the decision
boundaries [4.286327408435937]
We quantify how much should the 'rigid' decision boundaries, produced by an algorithm that naturally finds such solutions, be relaxed to obtain a performance improvement.
We show how these two measures can help the user in figuring out how expressive his model should be, before exploring it further via model selection.
arXiv Detail & Related papers (2022-10-07T16:27:13Z) - An Approximation Method for Fitted Random Forests [0.0]
We study methods that approximate each fitted tree in the Random Forests model using the multinomial allocation of the data points to the leafs.
Specifically, we begin by studying whether fitting a multinomial logistic regression helps reduce the size while preserving the prediction quality.
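A loose, hypothetical illustration of that second step with scikit-learn: encode each point by the leaves it reaches in the fitted forest and fit one multinomial logistic regression on top. The paper's actual multinomial-allocation approximation may differ substantially; everything below is an assumption for illustration only.

```python
# Hedged sketch (not the paper's method): represent each sample by the leaves
# it falls into across the forest, then fit one multinomial logistic
# regression on that leaf-membership encoding as a compact surrogate.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)

# rf.apply returns, per sample, the index of the leaf reached in each tree.
enc = OneHotEncoder(handle_unknown="ignore").fit(rf.apply(X_tr))
surrogate = LogisticRegression(max_iter=1000).fit(enc.transform(rf.apply(X_tr)), y_tr)

print("forest accuracy:   ", rf.score(X_te, y_te))
print("surrogate accuracy:", surrogate.score(enc.transform(rf.apply(X_te)), y_te))
```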
arXiv Detail & Related papers (2022-07-05T17:28:52Z)
- Optimal Decision Diagrams for Classification [68.72078059880018]
We study the training of optimal decision diagrams from a mathematical programming perspective.
We introduce a novel mixed-integer linear programming model for training.
We show how this model can be easily extended for fairness, parsimony, and stability notions.
arXiv Detail & Related papers (2022-05-28T18:31:23Z)
- An exact counterfactual-example-based approach to tree-ensemble models interpretability [0.0]
High-performance models do not exhibit the necessary transparency to make their decisions fully understandable.
We can derive an exact geometrical characterisation of their decision regions in the form of a collection of multidimensional intervals.
An adaptation to reasoning on regression problems is also envisaged.
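The single-tree case makes this geometry concrete: each leaf of an axis-aligned decision tree is exactly a multidimensional interval (a hyperrectangle). Below is a minimal sketch for one sklearn tree; the paper's characterisation covers whole tree ensembles, where the regions are intersections of such intervals across trees, and the helper name here is illustrative.

```python
# Minimal sketch: extract the exact decision regions of a single fitted sklearn
# decision tree as axis-aligned hyperrectangles (one per leaf). The paper does
# this exactly for tree ensembles; a single tree shows the principle.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

def leaf_boxes(t, n_features):
    """Return (lower bounds, upper bounds, predicted class) for every leaf region."""
    boxes = []

    def walk(node, lo, hi):
        if t.children_left[node] == -1:  # leaf: record its hyperrectangle
            boxes.append((lo.copy(), hi.copy(), int(np.argmax(t.value[node]))))
            return
        f, thr = t.feature[node], t.threshold[node]
        left_hi = hi.copy(); left_hi[f] = min(hi[f], thr)    # x[f] <= thr
        right_lo = lo.copy(); right_lo[f] = max(lo[f], thr)  # x[f] >  thr
        walk(t.children_left[node], lo, left_hi)
        walk(t.children_right[node], right_lo, hi)

    walk(0, np.full(n_features, -np.inf), np.full(n_features, np.inf))
    return boxes

for lo, hi, c in leaf_boxes(clf.tree_, X.shape[1]):
    print(f"class {c}: " + " x ".join(f"({l:.2f}, {h:.2f}]" for l, h in zip(lo, hi)))
```

A counterfactual example for a given point can then be read off as the nearest region whose class differs from the model's prediction.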
arXiv Detail & Related papers (2021-05-31T09:32:46Z)
- A Twin Neural Model for Uplift [59.38563723706796]
Uplift is a particular case of conditional treatment effect modeling.
We propose a new loss function defined by leveraging a connection with the Bayesian interpretation of the relative risk.
We show that our proposed method is competitive with the state of the art in simulation settings and on real data from large-scale randomized experiments.
arXiv Detail & Related papers (2021-05-11T16:02:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.