Attention augmented differentiable forest for tabular data
- URL: http://arxiv.org/abs/2010.02921v1
- Date: Fri, 2 Oct 2020 11:42:33 GMT
- Title: Attention augmented differentiable forest for tabular data
- Authors: Yingshi Chen
- Abstract summary: A differentiable forest is an ensemble of decision trees with full differentiability.
We propose a tree attention block (TAB) in the framework of differentiable forests.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A differentiable forest is an ensemble of decision trees with full
differentiability. Its simple tree structure is easy to use and explain. With
full differentiability, it can be trained in an end-to-end learning
framework with gradient-based optimization methods. In this paper, we propose
a tree attention block (TAB) in the framework of differentiable forests. The TAB
has two operations, squeeze and regulate. The squeeze operation extracts
the characteristics of each tree. The regulate operation learns nonlinear
relations between these trees. The TAB thus learns the importance of each
tree and adjusts its weight to improve accuracy. Our experiments on large tabular
datasets show that an attention augmented differentiable forest achieves
accuracy comparable to gradient boosted decision trees (GBDT), the
state-of-the-art algorithm for tabular datasets. On some datasets, our
model has higher accuracy than the best GBDT libraries (LightGBM, CatBoost, and
XGBoost). The differentiable forest model supports batch training, and the batch
size is much smaller than the size of the training set, so on larger datasets its
memory usage is much lower than that of GBDT models. The source code is available at
https://github.com/closest-git/QuantumForest.
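The squeeze and regulate operations described in the abstract can be illustrated with a minimal sketch. It is modeled on the general squeeze-and-excitation pattern and is not taken from the QuantumForest code: the function names (`squeeze`, `regulate`, `tab_forward`), the use of a batch mean as the per-tree summary, the two-layer ReLU/sigmoid gate, and the random untrained weights are all assumptions made for illustration, and the paper's exact formulation may differ.

```python
import math
import random


def squeeze(tree_outputs):
    """Summarize each tree by its mean output over the batch (assumed summary)."""
    n_samples = len(tree_outputs)
    n_trees = len(tree_outputs[0])
    return [sum(row[t] for row in tree_outputs) / n_samples for t in range(n_trees)]


def regulate(summary, w1, w2):
    """Two-layer bottleneck: ReLU hidden layer, then one sigmoid gate per tree."""
    hidden = [max(0.0, sum(w * s for w, s in zip(row, summary))) for row in w1]
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in w2]
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


def tab_forward(tree_outputs, w1, w2):
    """Scale each tree's output by its gate, then average the trees per sample."""
    gates = regulate(squeeze(tree_outputs), w1, w2)
    n_trees = len(gates)
    preds = [sum(g * y for g, y in zip(gates, row)) / n_trees for row in tree_outputs]
    return preds, gates


# Hypothetical toy data: 4 samples, 6 trees, a hidden bottleneck of size 3.
random.seed(0)
n_samples, n_trees, n_hidden = 4, 6, 3
tree_outputs = [[random.uniform(-1, 1) for _ in range(n_trees)] for _ in range(n_samples)]
w1 = [[random.gauss(0, 0.5) for _ in range(n_trees)] for _ in range(n_hidden)]
w2 = [[random.gauss(0, 0.5) for _ in range(n_hidden)] for _ in range(n_trees)]

preds, gates = tab_forward(tree_outputs, w1, w2)
print(len(preds), len(gates))
```

In a trained model the weights `w1` and `w2` would be learned by gradient descent together with the trees; here they are random, so the gates only demonstrate the data flow, not a meaningful ranking of trees.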
Related papers
- TreeGrad-Ranker: Feature Ranking via $O(L)$-Time Gradients for Decision Trees [73.0940890296463]
Probabilistic values are used to rank features for explaining local predicted values of decision trees. TreeGrad computes the gradients of the multilinear extension of the joint objective in $O(L)$ time for decision trees with $L$ leaves. TreeGrad-Ranker aggregates the gradients while optimizing the joint objective to produce feature rankings. TreeGrad-Shap is a numerically stable algorithm for computing Beta Shapley values with integral parameters.
arXiv Detail & Related papers (2026-02-12T06:17:12Z) - Transformers Boost the Performance of Decision Trees on Tabular Data across Sample Sizes [135.68092471784516]
We propose a simple and lightweight approach for fusing large language models and gradient-boosted decision trees.
We name our fusion methods LLM-Boost and PFN-Boost, respectively.
We demonstrate state-of-the-art performance against numerous baselines and ensembling algorithms.
arXiv Detail & Related papers (2025-02-04T19:30:41Z) - Soft Hoeffding Tree: A Transparent and Differentiable Model on Data Streams [2.6524539020042663]
Stream mining algorithms such as Hoeffding trees grow based on the incoming data stream.
We propose soft Hoeffding trees (SoHoT) as a new differentiable and transparent model for possibly infinite and changing data streams.
arXiv Detail & Related papers (2024-11-07T15:49:53Z) - GRANDE: Gradient-Based Decision Tree Ensembles for Tabular Data [9.107782510356989]
We propose a novel approach for learning hard, axis-aligned decision tree ensembles using end-to-end gradient descent.
GRANDE is based on a dense representation of tree ensembles, which allows the use of backpropagation with a straight-through operator.
We demonstrate that our method outperforms existing gradient-boosting and deep learning frameworks on most datasets.
arXiv Detail & Related papers (2023-09-29T10:49:14Z) - TreeLearn: A Comprehensive Deep Learning Method for Segmenting Individual Trees from Ground-Based LiDAR Forest Point Clouds [42.87502453001109]
We propose TreeLearn, a deep learning-based approach for tree instance segmentation of forest point clouds.
TreeLearn is trained on already segmented point clouds in a data-driven manner, making it less reliant on predefined features and algorithms.
We trained TreeLearn on forest point clouds of 6665 trees, labeled using the Lidar360 software.
arXiv Detail & Related papers (2023-09-15T15:20:16Z) - Policy Gradient with Tree Expansion [72.10002936187388]
Policy gradient methods are notorious for having a large variance and high sample complexity. We introduce SoftTreeMax, a generalization of softmax that employs planning. We show that SoftTreeMax reduces the gradient variance by three orders of magnitude.
arXiv Detail & Related papers (2023-01-30T19:03:14Z) - SoftTreeMax: Policy Gradient with Tree Search [72.9513807133171]
We introduce SoftTreeMax, the first approach that integrates tree-search into policy gradient.
On Atari, SoftTreeMax demonstrates up to 5x better performance in faster run-time compared with distributed PPO.
arXiv Detail & Related papers (2022-09-28T09:55:47Z) - Understanding Deep Learning via Decision Boundary [81.49114762506287]
We show that the neural network with lower decision boundary (DB) variability has better generalizability.
Two new notions, algorithm DB variability and $(\epsilon, \eta)$-data DB variability, are proposed to measure the decision boundary variability.
arXiv Detail & Related papers (2022-06-03T11:34:12Z) - Flexible Modeling and Multitask Learning using Differentiable Tree Ensembles [6.037383467521294]
We propose a flexible framework for learning tree ensembles to support arbitrary loss functions, missing responses, and multi-task learning.
Our framework builds on differentiable tree ensembles, which can be trained using first-order methods.
We show that our framework can lead to 100x more compact and 23% more expressive tree ensembles than those by popular toolkits.
arXiv Detail & Related papers (2022-05-19T17:30:49Z) - Shrub Ensembles for Online Classification [7.057937612386993]
Decision Tree (DT) ensembles provide excellent performance while adapting to changes in the data, but they are not resource efficient.
We propose a novel memory-efficient online classification ensemble called shrub ensembles for resource-constrained systems.
Our algorithm trains small to medium-sized decision trees on small windows and uses gradient descent to learn the ensemble weights of these shrubs.
arXiv Detail & Related papers (2021-12-07T14:22:43Z) - Active-LATHE: An Active Learning Algorithm for Boosting the Error Exponent for Learning Homogeneous Ising Trees [75.93186954061943]
We design and analyze an algorithm that boosts the error exponent by at least 40% when $\rho$ is at least $0.8$.
Our analysis hinges on judiciously exploiting the minute but detectable statistical variation of the samples to allocate more data to parts of the graph.
arXiv Detail & Related papers (2021-10-27T10:45:21Z) - Growing Deep Forests Efficiently with Soft Routing and Learned Connectivity [79.83903179393164]
This paper further extends the deep forest idea in several important aspects.
We employ a probabilistic tree whose nodes make probabilistic routing decisions, a.k.a., soft routing, rather than hard binary decisions.
Experiments on the MNIST dataset demonstrate that our empowered deep forests can achieve performance better than or comparable to [1],[3].
arXiv Detail & Related papers (2020-12-29T18:05:05Z) - An Efficient Adversarial Attack for Tree Ensembles [91.05779257472675]
We study adversarial attacks on tree-based ensembles such as gradient boosting decision trees (GBDTs) and random forests (RFs).
We show that our method can be thousands of times faster than the previous mixed-integer linear programming (MILP) based approach.
Our code is available at https://chong-z/tree-ensemble-attack.
arXiv Detail & Related papers (2020-10-22T10:59:49Z) - Deep differentiable forest with sparse attention for the tabular data [0.0]
The differentiable forest has the advantages of both trees and neural networks.
It has full differentiability and all variables are learnable parameters.
We find and analyze the attention mechanism in the differentiable forest.
arXiv Detail & Related papers (2020-02-29T09:47:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.