GRANDE: Gradient-Based Decision Tree Ensembles for Tabular Data
- URL: http://arxiv.org/abs/2309.17130v3
- Date: Tue, 12 Mar 2024 08:30:39 GMT
- Title: GRANDE: Gradient-Based Decision Tree Ensembles for Tabular Data
- Authors: Sascha Marton, Stefan Lüdtke, Christian Bartelt, Heiner Stuckenschmidt
- Abstract summary: We propose a novel approach for learning hard, axis-aligned decision tree ensembles using end-to-end gradient descent.
GRANDE is based on a dense representation of tree ensembles, which makes it possible to use backpropagation with a straight-through operator.
We demonstrate that our method outperforms existing gradient-boosting and deep learning frameworks on most datasets.
- Score: 9.107782510356989
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite the success of deep learning for text and image data, tree-based
ensemble models are still state-of-the-art for machine learning with
heterogeneous tabular data. However, there is a significant need for
tabular-specific gradient-based methods due to their high flexibility. In this
paper, we propose GRANDE (GRAdieNt-Based Decision Tree Ensembles), a novel approach for learning hard,
axis-aligned decision tree ensembles using end-to-end gradient descent. GRANDE
is based on a dense representation of tree ensembles, which makes it possible to use
backpropagation with a straight-through operator to jointly optimize all model
parameters. Our method combines axis-aligned splits, which are a useful
inductive bias for tabular data, with the flexibility of gradient-based
optimization. Furthermore, we introduce an advanced instance-wise weighting
that facilitates learning representations for both simple and complex
relations within a single model. We conducted an extensive evaluation on a
predefined benchmark with 19 classification datasets and demonstrate that our
method outperforms existing gradient-boosting and deep learning frameworks on
most datasets. The method is available at:
https://github.com/s-marton/GRANDE
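To make the two core ingredients above concrete, here is a minimal PyTorch sketch of (a) a hard, axis-aligned split trained with a straight-through operator and (b) instance-wise softmax weighting over estimators. This is an illustration under assumed tensor names and shapes, not the paper's implementation (see the repository above for that).

```python
import torch

def straight_through_split(x, feature_logits, threshold):
    """One hard, axis-aligned split trained with a straight-through operator.

    x:              (batch, n_features) inputs
    feature_logits: (n_features,) learnable feature-selection scores
    threshold:      scalar learnable split threshold
    Names and shapes are assumptions for this sketch.
    """
    # Soft, differentiable feature choice used for the backward pass.
    soft_choice = torch.softmax(feature_logits, dim=-1)
    # Hard one-hot choice used for the forward pass.
    hard_choice = torch.zeros_like(soft_choice)
    hard_choice[soft_choice.argmax()] = 1.0
    # Straight-through: forward value is hard, gradient flows through soft.
    choice = hard_choice + soft_choice - soft_choice.detach()

    feature_value = x @ choice                            # (batch,)
    soft_gate = torch.sigmoid(feature_value - threshold)  # soft routing in (0, 1)
    hard_gate = (soft_gate > 0.5).float()                 # hard routing in {0, 1}
    return hard_gate + soft_gate - soft_gate.detach()

def instance_wise_prediction(estimator_preds, weight_logits):
    """Instance-wise weighting: every sample gets its own softmax weights
    over the estimators (in GRANDE the logits depend on the leaf each
    sample reaches; here they are simply given per sample).

    estimator_preds: (batch, n_estimators) per-tree outputs
    weight_logits:   (batch, n_estimators) per-sample weight logits
    """
    weights = torch.softmax(weight_logits, dim=-1)
    return (weights * estimator_preds).sum(dim=-1)        # (batch,)
```

In both functions the forward pass uses the discrete quantity (one-hot feature choice, 0/1 routing) while gradients flow through the soft surrogate, which is what allows all parameters of a hard tree ensemble to be optimized jointly by backpropagation.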
Related papers
- A Closer Look at Deep Learning on Tabular Data [52.50778536274327]
Tabular data is prevalent across various domains in machine learning.
Deep Neural Network (DNN)-based methods have shown promising performance comparable to tree-based ones.
arXiv Detail & Related papers (2024-07-01T04:24:07Z)
- Why do tree-based models still outperform deep learning on tabular data? [0.0]
We show that tree-based models remain state-of-the-art on medium-sized data.
We conduct an empirical investigation into the differing inductive biases of tree-based models and Neural Networks (NNs).
arXiv Detail & Related papers (2022-07-18T08:36:08Z)
- A Robust Stacking Framework for Training Deep Graph Models with Multifaceted Node Features [61.92791503017341]
Graph Neural Networks (GNNs) with numerical node features and graph structure as inputs have demonstrated superior performance on various supervised learning tasks with graph data.
The best models for such data types in most standard supervised learning settings with IID (non-graph) data are not easily incorporated into a GNN.
Here we propose a robust stacking framework that fuses graph-aware propagation with arbitrary models intended for IID data.
arXiv Detail & Related papers (2022-06-16T22:46:33Z)
- Condensing Graphs via One-Step Gradient Matching [50.07587238142548]
We propose a one-step gradient matching scheme, which performs gradient matching for only a single step without training the network weights.
Our theoretical analysis shows this strategy can generate synthetic graphs that lead to lower classification loss on real graphs.
In particular, we are able to reduce the dataset size by 90% while approximating up to 98% of the original performance.
arXiv Detail & Related papers (2022-06-15T18:20:01Z)
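A rough sketch of the one-step scheme above, assuming plain feature tensors in place of graphs and hypothetical argument names (not the paper's graph-aware code):

```python
import torch
import torch.nn as nn

def one_step_matching_loss(model, x_real, y_real, x_syn, y_syn):
    """Gradient matching at a single random initialization: compare the
    gradients a freshly initialized model produces on real vs. synthetic
    data, with no inner training loop over the network weights."""
    loss_fn = nn.CrossEntropyLoss()  # y_syn must be fixed class indices
    params = [p for p in model.parameters() if p.requires_grad]
    g_real = torch.autograd.grad(loss_fn(model(x_real), y_real), params)
    # create_graph=True lets the matching loss backpropagate into x_syn.
    g_syn = torch.autograd.grad(loss_fn(model(x_syn), y_syn), params,
                                create_graph=True)
    return sum(((gr.detach() - gs) ** 2).sum()
               for gr, gs in zip(g_real, g_syn))
```

Minimizing this loss with respect to the synthetic features (e.g., an nn.Parameter), and averaging over many random initializations, condenses the dataset without ever training the network weights.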
- Convergent Boosted Smoothing for Modeling Graph Data with Tabular Node Features [46.052312251801]
We propose a framework for iterating boosting with graph propagation steps.
Our approach is anchored in a principled meta loss function.
Across a variety of non-IID graph datasets, our method achieves comparable or superior performance.
arXiv Detail & Related papers (2021-10-26T04:53:12Z)
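As a hedged illustration of the boosting-interleaved-with-propagation pattern described above, the toy sketch below alternates residual fits with a personalized-PageRank-style smoothing step; the function and parameter names are assumptions, and the paper's principled meta loss is not reproduced here.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def boosted_smoothing(X, y, A_norm, n_rounds=50, lr=0.1, alpha=0.5, hops=2):
    """Interleave gradient-boosting residual fits with graph propagation.

    X:      (n, d) tabular node features
    y:      (n,) regression targets
    A_norm: (n, n) normalized adjacency matrix
    Toy sketch of the pattern, not the paper's algorithm.
    """
    f = np.zeros_like(y, dtype=float)
    for _ in range(n_rounds):
        # Boosting step: fit a weak learner to the current residuals.
        tree = DecisionTreeRegressor(max_depth=3).fit(X, y - f)
        h = tree.predict(X)
        # Propagation step: smooth the update over the graph.
        for _ in range(hops):
            h = alpha * (A_norm @ h) + (1 - alpha) * h
        f += lr * h
    return f
```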
- Learning to Synthesize Data for Semantic Parsing [57.190817162674875]
We propose a generative model that models the composition of programs and maps each program to an utterance.
Due to the simplicity of PCFG and pre-trained BART, our generative model can be efficiently learned from existing data at hand.
We evaluate our method in both in-domain and out-of-domain settings of text-to-SQL parsing on the standard benchmarks of GeoQuery and Spider.
arXiv Detail & Related papers (2021-04-12T21:24:02Z)
- GRAD-MATCH: A Gradient Matching Based Data Subset Selection for Efficient Learning [23.75284126177203]
We propose a general framework, GRAD-MATCH, which finds subsets that closely match the gradient of the training or validation set.
We show that GRAD-MATCH significantly and consistently outperforms several recent data-selection algorithms.
arXiv Detail & Related papers (2021-02-27T04:09:32Z)
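To illustrate the gradient-matching objective above, here is a toy greedy selector; the actual GRAD-MATCH algorithm uses an orthogonal-matching-pursuit-style solver with per-element weights, so treat this purely as a sketch of the idea, with assumed variable names.

```python
import numpy as np

def greedy_gradient_match(per_sample_grads, full_grad, k):
    """Greedily pick k samples whose averaged gradient matches full_grad.

    per_sample_grads: (n, d) array of per-sample gradients
    full_grad:        (d,) target gradient (mean over training/validation set)
    Toy greedy sketch, not the paper's OMP-based algorithm.
    """
    selected = []
    residual = full_grad.copy()
    for _ in range(k):
        # Pick the sample whose gradient best aligns with the residual.
        scores = per_sample_grads @ residual
        scores[selected] = -np.inf  # never pick a sample twice
        selected.append(int(np.argmax(scores)))
        # Update the residual: what the current subset still fails to match.
        residual = full_grad - per_sample_grads[selected].mean(axis=0)
    return selected
```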
- Deep Reinforcement Learning of Graph Matching [63.469961545293756]
Graph matching (GM) under node and pairwise constraints has been a building block in areas from optimization to computer vision.
We present a reinforcement learning solver for GM, named RGM, that seeks the node correspondence between pairwise graphs.
Our method differs from previous deep graph matching models, which focus on front-end feature extraction and affinity-function learning.
arXiv Detail & Related papers (2020-12-16T13:48:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.