AGBoost: Attention-based Modification of Gradient Boosting Machine
- URL: http://arxiv.org/abs/2207.05724v1
- Date: Tue, 12 Jul 2022 17:42:20 GMT
- Title: AGBoost: Attention-based Modification of Gradient Boosting Machine
- Authors: Andrei Konstantinov and Lev Utkin and Stanislav Kirpichenko
- Abstract summary: A new attention-based model for the gradient boosting machine (GBM) called AGBoost is proposed for solving regression problems.
The main idea behind the proposed AGBoost model is to assign attention weights with trainable parameters to the iterations of the GBM.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A new attention-based model for the gradient boosting machine
(GBM), called AGBoost (attention-based gradient boosting), is proposed for
solving regression problems. The main idea behind AGBoost is to assign
attention weights with trainable parameters to the iterations of the GBM,
under the condition that decision trees are the base learners. The attention
weights are determined by exploiting properties of decision trees and by
using Huber's contamination model, which yields a linear dependence between
the trainable attention parameters and the attention weights. This
peculiarity allows the attention weights to be trained by solving a standard
quadratic optimization problem with linear constraints. The attention weights
also depend on a discount factor, a tuning parameter that determines how
strongly a weight's impact decays with the number of iterations. Numerical
experiments with two types of base learners, original decision trees and
extremely randomized trees, on various regression datasets illustrate the
proposed model.
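For intuition, the sketch below illustrates the scheme described in the abstract. It is a hypothetical re-implementation, not the authors' code: a plain squared-error GBM is trained first, and its per-iteration increments are then re-aggregated with contamination-style weights w = (1 - eps) * p + eps * g, where p is a fixed prior shrunk by the discount factor gamma and g is trained on the probability simplex. Because the prediction is linear in g, fitting g under squared error is exactly a quadratic program with linear constraints; the values of eps and gamma and the rescaling by the number of iterations are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.tree import DecisionTreeRegressor

def fit_agboost_sketch(X, y, n_iter=30, lr=0.1, eps=0.5, gamma=0.9):
    """Train a plain squared-error GBM, then learn attention weights over its
    iterations via a quadratic program on the probability simplex."""
    f0 = float(y.mean())
    residual = y - f0
    trees = []
    for _ in range(n_iter):
        tree = DecisionTreeRegressor(max_depth=3).fit(X, residual)
        residual = residual - lr * tree.predict(X)
        trees.append(tree)
    # Column t holds iteration t's increment; a plain GBM just sums the columns.
    H = np.column_stack([lr * t.predict(X) for t in trees])
    p = gamma ** np.arange(n_iter)
    p /= p.sum()  # fixed prior: the discount factor shrinks later iterations

    def loss(g):  # quadratic in g, since w is linear in g
        w = (1.0 - eps) * p + eps * g
        return np.sum((y - f0 - n_iter * (H @ w)) ** 2)

    res = minimize(loss, x0=np.full(n_iter, 1.0 / n_iter), method="SLSQP",
                   bounds=[(0.0, 1.0)] * n_iter,
                   constraints=({"type": "eq", "fun": lambda g: g.sum() - 1.0},))
    return f0, trees, (1.0 - eps) * p + eps * res.x

def predict_agboost_sketch(X, f0, trees, w, lr=0.1):
    """With uniform weights w_t = 1/T this reduces to the plain GBM prediction,
    which is why the weighted sum is rescaled by the number of iterations."""
    H = np.column_stack([lr * t.predict(X) for t in trees])
    return f0 + len(trees) * (H @ w)
```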
Related papers
- Forecasting with Hyper-Trees [50.72190208487953]
Hyper-Trees are designed to learn the parameters of time series models.
By relating the parameters of a target time series model to features, Hyper-Trees also address the issue of parameter non-stationarity.
In this novel approach, the trees first generate informative representations from the input features, which a shallow network then maps to the target model parameters.
arXiv Detail & Related papers (2024-05-13T15:22:15Z)
- Model-Based Reparameterization Policy Gradient Methods: Theory and Practical Algorithms [88.74308282658133]
Reparameterization (RP) Policy Gradient Methods (PGMs) have been widely adopted for continuous control tasks in robotics and computer graphics.
Recent studies have revealed that, when applied to long-term reinforcement learning problems, model-based RP PGMs may experience chaotic and non-smooth optimization landscapes.
We propose a spectral normalization method to mitigate the exploding variance issue caused by long model unrolls.
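Spectral normalization itself is a standard operation; the paper applies it to long model unrolls, while the minimal numpy sketch below only shows the generic mechanism (power iteration to estimate the largest singular value, then rescaling). The iteration count and the at-most-1 target norm are assumptions, not choices from the paper.

```python
import numpy as np

def spectral_normalize(W, n_power_iter=20):
    """Rescale matrix W so its spectral norm (largest singular value) is <= 1."""
    rng = np.random.default_rng(0)
    u = rng.standard_normal(W.shape[0])
    for _ in range(n_power_iter):  # power iteration on W W^T
        v = W.T @ u
        v /= np.linalg.norm(v)
        u = W @ v
        u /= np.linalg.norm(u)
    sigma = u @ W @ v              # estimate of the top singular value
    return W / max(sigma, 1.0)     # leave W unchanged if already contractive
```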
arXiv Detail & Related papers (2023-10-30T18:43:21Z)
- Improved Anomaly Detection by Using the Attention-Based Isolation Forest [4.640835690336653]
An Attention-Based Isolation Forest (ABIForest) is proposed for solving the anomaly detection problem.
The main idea is to assign attention weights to each path of trees with learnable parameters depending on instances and trees themselves.
ABIForest can be viewed as the first modification of Isolation Forest, which incorporates the attention mechanism in a simple way without applying gradient-based algorithms.
arXiv Detail & Related papers (2022-10-05T20:58:57Z)
- Attention and Self-Attention in Random Forests [5.482532589225552]
New models of random forests jointly using the attention and self-attention mechanisms are proposed.
The self-attention aims to capture dependencies of the tree predictions and to remove noise or anomalous predictions in the random forest.
arXiv Detail & Related papers (2022-07-09T16:15:53Z)
- Training Discrete Deep Generative Models via Gapped Straight-Through Estimator [72.71398034617607]
We propose a Gapped Straight-Through (GST) estimator to reduce the variance without incurring resampling overhead.
This estimator is inspired by the essential properties of Straight-Through Gumbel-Softmax.
Experiments demonstrate that the proposed GST estimator enjoys better performance compared to strong baselines on two discrete deep generative modeling tasks.
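The GST construction itself is detailed in the paper; for context, here is the standard Straight-Through Gumbel-Softmax estimator it is inspired by, in a minimal PyTorch sketch (the temperature tau and the epsilon for numerical stability follow common convention, not values from the paper).

```python
import torch
import torch.nn.functional as F

def st_gumbel_softmax(logits, tau=1.0, eps=1e-20):
    """Straight-Through Gumbel-Softmax: discrete one-hot sample in the forward
    pass, gradient of the continuous relaxation in the backward pass."""
    gumbel = -torch.log(-torch.log(torch.rand_like(logits) + eps) + eps)
    y_soft = F.softmax((logits + gumbel) / tau, dim=-1)
    index = y_soft.argmax(dim=-1, keepdim=True)
    y_hard = torch.zeros_like(y_soft).scatter_(-1, index, 1.0)
    return y_hard + (y_soft - y_soft.detach())  # forward: y_hard; backward: grad of y_soft
```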
arXiv Detail & Related papers (2022-06-15T01:46:05Z)
- Attention-based Random Forest and Contamination Model [5.482532589225552]
The main idea behind the proposed ABRF models is to assign attention weights with trainable parameters to the decision trees in a specific way.
The weights depend on the distance between an instance that falls into a given leaf of a tree and the training instances that fall into the same leaf.
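As a rough illustration of that leaf-distance idea, here is a hypothetical simplification without the trainable contamination parameters; the softmax temperature tau and the use of leaf means as distance anchors are assumptions.

```python
import numpy as np

def attention_forest_predict(forest, X_train, y_train, x, tau=1.0):
    """Weight each tree's leaf prediction by how close x is to the mean of the
    training instances that fall into the same leaf as x."""
    scores, preds = [], []
    for tree in forest.estimators_:            # e.g. a fitted sklearn RandomForestRegressor
        leaf = tree.apply(x.reshape(1, -1))[0]
        in_leaf = tree.apply(X_train) == leaf  # training points sharing x's leaf
        center = X_train[in_leaf].mean(axis=0)
        scores.append(-np.sum((x - center) ** 2) / tau)
        preds.append(y_train[in_leaf].mean())
    w = np.exp(np.array(scores) - np.max(scores))
    w /= w.sum()                               # softmax attention over the trees
    return float(w @ np.array(preds))
```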
arXiv Detail & Related papers (2022-01-08T19:35:57Z)
- A Variational Inference Approach to Inverse Problems with Gamma Hyperpriors [60.489902135153415]
This paper introduces a variational iterative alternating scheme for hierarchical inverse problems with gamma hyperpriors.
The proposed variational inference approach yields accurate reconstruction, provides meaningful uncertainty quantification, and is easy to implement.
arXiv Detail & Related papers (2021-11-26T06:33:29Z)
- A cautionary tale on fitting decision trees to data from additive models: generalization lower bounds [9.546094657606178]
We study the generalization performance of decision trees with respect to different generative regression models.
This allows us to elicit their inductive bias, that is, the assumptions the algorithms make (or do not make) to generalize to new data.
We prove a sharp squared error generalization lower bound for a large class of decision tree algorithms fitted to sparse additive models.
arXiv Detail & Related papers (2021-10-18T21:22:40Z)
- Weight Vector Tuning and Asymptotic Analysis of Binary Linear Classifiers [82.5915112474988]
This paper proposes weight vector tuning of a generic binary linear classifier through the parameterization of a decomposition of the discriminant by a scalar.
It is also found that weight vector tuning significantly improves the performance of Linear Discriminant Analysis (LDA) under high estimation noise.
arXiv Detail & Related papers (2021-10-01T17:50:46Z)
- Variational Causal Networks: Approximate Bayesian Inference over Causal Structures [132.74509389517203]
We introduce a parametric variational family modelled by an autoregressive distribution over the space of discrete DAGs.
In experiments, we demonstrate that the proposed variational posterior is able to provide a good approximation of the true posterior.
arXiv Detail & Related papers (2021-06-14T17:52:49Z)
- Selective Cascade of Residual ExtraTrees [3.6575928994425735]
We propose a novel tree-based ensemble method named Selective Cascade of Residual ExtraTrees (SCORE).
SCORE draws inspiration from representation learning, incorporates regularized regression with variable selection features, and utilizes boosting to improve prediction and reduce generalization errors.
Our computer experiments show that SCORE provides comparable or superior performance in prediction against ExtraTrees, random forest, gradient boosting machine, and neural networks.
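A minimal sketch of how such a cascade could look, read directly from the summary above (the LassoCV selection step, the learning rate, and the stage count are all illustrative assumptions, not the authors' design):

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.linear_model import LassoCV

def fit_score_sketch(X, y, n_stages=5, lr=0.5):
    """Boosting-style cascade: each stage selects variables with a regularized
    linear model, then fits ExtraTrees to the current residuals."""
    residual, stages = y.astype(float).copy(), []
    for _ in range(n_stages):
        lasso = LassoCV(cv=3).fit(X, residual)
        keep = np.flatnonzero(lasso.coef_)  # variable selection step
        if keep.size == 0:
            break
        trees = ExtraTreesRegressor(n_estimators=100).fit(X[:, keep], residual)
        residual = residual - lr * trees.predict(X[:, keep])
        stages.append((keep, trees))
    return stages

def predict_score_sketch(stages, X, lr=0.5):
    """Sum the stage-wise contributions, mirroring the boosting update."""
    out = np.zeros(X.shape[0])
    for keep, trees in stages:
        out += lr * trees.predict(X[:, keep])
    return out
```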
arXiv Detail & Related papers (2020-09-29T16:31:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.