Attention and Self-Attention in Random Forests
- URL: http://arxiv.org/abs/2207.04293v1
- Date: Sat, 9 Jul 2022 16:15:53 GMT
- Title: Attention and Self-Attention in Random Forests
- Authors: Lev V. Utkin and Andrei V. Konstantinov
- Abstract summary: New models of random forests jointly using the attention and self-attention mechanisms are proposed.
The self-attention aims to capture dependencies of the tree predictions and to remove noise or anomalous predictions in the random forest.
- Score: 5.482532589225552
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: New models of random forests jointly using the attention and self-attention
mechanisms are proposed for solving the regression problem. The models can be
regarded as extensions of the attention-based random forest whose idea stems
from applying a combination of Nadaraya-Watson kernel regression and
Huber's contamination model to random forests. The self-attention aims to
capture dependencies of the tree predictions and to remove noise or anomalous
predictions in the random forest. The self-attention module is trained jointly
with the attention module for computing weights. It is shown that the training
process of attention weights is reduced to solving a single quadratic or linear
optimization problem. Three modifications of the general approach are proposed
and compared. A specific multi-head self-attention for the random forest is
also considered. Heads of the self-attention are obtained by changing its
tuning parameters including the kernel parameters and the contamination
parameter of models. Numerical experiments with various datasets illustrate the
proposed models and show that supplementing the attention mechanism with
self-attention improves the model performance on many datasets.
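The abstract describes two mechanisms: Nadaraya-Watson attention over the trees, mixed with trainable weights through Huber's contamination model, and self-attention over the individual tree predictions. The following numpy sketch illustrates that structure under explicit assumptions (leaf means as tree keys, a Gaussian kernel, a fixed contamination parameter eps, and untrained mixing weights w); the paper instead learns the weights by solving a single quadratic or linear optimization problem, so this should be read as an illustration, not the authors' implementation.

```python
# A minimal sketch, assuming leaf means as tree keys, a Gaussian kernel, a fixed
# contamination parameter eps and untrained mixing weights w. It is NOT the
# authors' implementation: the paper learns the weights via quadratic or linear
# optimization and uses a specific self-attention parameterization.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)
forest = RandomForestRegressor(n_estimators=20, random_state=0).fit(X, y)
trees = forest.estimators_
train_leaves = forest.apply(X)                 # leaf id of every training point per tree

def tree_keys_for(x):
    """Key of tree t for instance x: mean training vector of the leaf x falls into."""
    leaf_ids = forest.apply(x.reshape(1, -1))[0]
    return np.stack([X[train_leaves[:, t] == leaf_ids[t]].mean(axis=0)
                     for t in range(len(trees))])

def predict(x, eps=0.2, tau=1.0, w=None):
    x = x.reshape(1, -1)
    keys = tree_keys_for(x[0])                               # (n_trees, d)
    preds = np.array([t.predict(x)[0] for t in trees])       # per-tree predictions
    # Self-attention across the tree predictions: each tree attends to similar trees,
    # which can smooth out noisy or anomalous individual predictions.
    s = keys @ keys.T / np.sqrt(keys.shape[1])
    A = np.exp(s - s.max(axis=1, keepdims=True))
    A /= A.sum(axis=1, keepdims=True)
    smoothed = A @ preds
    # Attention over the trees: Nadaraya-Watson weights contaminated with trainable w.
    if w is None:
        w = np.full(len(trees), 1.0 / len(trees))            # trainable in the paper
    dists = ((keys - x) ** 2).sum(axis=1)
    k = np.exp(-(dists - dists.min()) / tau)                 # Gaussian kernel scores
    alpha = (1.0 - eps) * k / k.sum() + eps * w              # Huber-style mixture
    return float(alpha @ smoothed)

print(predict(X[0]), forest.predict(X[:1])[0])
```

Both modules in the sketch act only on per-tree quantities (keys and predictions), so the fitted forest itself is left untouched; only the aggregation on top of it changes, which is consistent with the abstract's claim that training reduces to a single optimization problem over the weights.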
Related papers
- LoRA-Ensemble: Efficient Uncertainty Modelling for Self-attention Networks [52.46420522934253]
We introduce LoRA-Ensemble, a parameter-efficient deep ensemble method for self-attention networks.
By employing a single pre-trained self-attention network with weights shared across all members, we train member-specific low-rank matrices for the attention projections.
Our method exhibits superior calibration compared to explicit ensembles and achieves similar or better accuracy across various prediction tasks and datasets.
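A small numpy sketch of that idea, with the dimensions, the rank r, and the zero initialization of the B factors chosen purely for illustration: one shared matrix stands in for the frozen pre-trained attention projection, and each ensemble member owns only its low-rank update.

```python
# A toy sketch, not the LoRA-Ensemble code: shared frozen projection weights plus
# member-specific low-rank updates A_m @ B_m; all sizes here are assumptions.
import numpy as np

rng = np.random.default_rng(0)
d, r, n_members = 64, 4, 5

W_shared = rng.normal(scale=0.02, size=(d, d))            # frozen, shared by all members
lora_A = [rng.normal(scale=0.02, size=(d, r)) for _ in range(n_members)]
lora_B = [np.zeros((r, d)) for _ in range(n_members)]     # trained per member (zero init)

def member_projection(x, m):
    """Attention projection of ensemble member m on token matrix x."""
    return x @ (W_shared + lora_A[m] @ lora_B[m])

x = rng.normal(size=(10, d))                               # 10 tokens of width d
members = np.stack([member_projection(x, m) for m in range(n_members)])
print(members.mean(axis=0).shape)                          # ensemble average of the members
```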
arXiv Detail & Related papers (2024-05-23T11:10:32Z) - Ensemble Modeling for Multimodal Visual Action Recognition [50.38638300332429]
We propose an ensemble modeling approach for multimodal action recognition.
We independently train individual modality models using a variant of focal loss tailored to handle the long-tailed distribution of the MECCANO dataset.
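For context, a generic binary focal loss of the kind mentioned above; the value of gamma and the binary formulation are illustrative assumptions rather than the exact variant the paper tailors to the MECCANO dataset.

```python
# A generic binary focal loss sketch (gamma and the binary setting are assumptions).
import numpy as np

def focal_loss(p, y, gamma=2.0, eps=1e-7):
    """Cross-entropy down-weighted for well-classified (easy) examples."""
    p = np.clip(p, eps, 1.0 - eps)
    pt = np.where(y == 1, p, 1.0 - p)        # predicted probability of the true class
    return float(np.mean(-((1.0 - pt) ** gamma) * np.log(pt)))

p = np.array([0.9, 0.2, 0.7])   # predicted probabilities of the positive class
y = np.array([1, 1, 0])         # labels
print(focal_loss(p, y))
```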
arXiv Detail & Related papers (2023-08-10T08:43:20Z) - Neural Attention Forests: Transformer-Based Forest Improvement [4.129225533930966]
The main idea behind the proposed NAF model is to introduce the attention mechanism into the random forest.
In contrast to the available models like the attention-based random forest, the attention weights and the Nadaraya-Watson regression are represented in the form of neural networks.
The combination of the random forest and neural networks implementing the attention mechanism forms a transformer for enhancing the forest predictions.
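A minimal sketch of the stated idea of expressing the attention weights as a neural network: a tiny MLP scores each (instance, tree-key) pair and a softmax turns the scores into weights over the tree predictions. The key vectors, network size, and random untrained parameters are assumptions, not the NAF architecture itself.

```python
# A sketch with assumed shapes and untrained parameters; not the NAF model itself.
import numpy as np

rng = np.random.default_rng(0)
d, n_trees, h = 5, 10, 16

W1 = rng.normal(scale=0.1, size=(2 * d, h))   # scoring MLP, hidden layer
W2 = rng.normal(scale=0.1, size=(h, 1))       # scoring MLP, output layer

def neural_attention(x, tree_keys, tree_preds):
    """Weight tree predictions with scores from a learnable network instead of a fixed kernel."""
    pairs = np.concatenate([np.tile(x, (n_trees, 1)), tree_keys], axis=1)
    scores = np.tanh(pairs @ W1) @ W2                      # one score per tree
    weights = np.exp(scores - scores.max())
    weights = (weights / weights.sum()).ravel()
    return float(weights @ tree_preds)

x = rng.normal(size=d)                        # query instance
tree_keys = rng.normal(size=(n_trees, d))     # e.g. leaf-mean vectors of each tree
tree_preds = rng.normal(size=n_trees)         # individual tree predictions
print(neural_attention(x, tree_keys, tree_preds))
```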
arXiv Detail & Related papers (2023-04-12T17:01:38Z) - Improved Anomaly Detection by Using the Attention-Based Isolation Forest [4.640835690336653]
Attention-Based Isolation Forest (ABIForest) is proposed for solving the anomaly detection problem.
The main idea is to assign attention weights to each path of trees with learnable parameters depending on instances and trees themselves.
ABIForest can be viewed as the first modification of Isolation Forest that incorporates the attention mechanism in a simple way, without applying gradient-based algorithms.
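A sketch, under stated assumptions, of what attention-weighted path aggregation in an isolation forest could look like: per-tree path lengths are combined with contaminated kernel weights instead of a plain average. The keys, kernel, and normalizing constant are placeholders, and the weights are left untrained, so this illustrates the idea rather than ABIForest itself.

```python
# Illustrative only: keys, kernel, eps and the untrained weights w are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n_trees, d = 25, 4

x = rng.normal(size=d)                               # instance to score
path_lengths = rng.uniform(2, 10, size=n_trees)      # h_t(x): path length of x in tree t
tree_keys = rng.normal(size=(n_trees, d))            # per-tree representation of x's path

def attention_score(x, tree_keys, path_lengths, tau=1.0, eps=0.3, w=None):
    """Anomaly score from an attention-weighted mean path length."""
    if w is None:
        w = np.full(n_trees, 1.0 / n_trees)          # learnable weights (untrained here)
    dists = ((tree_keys - x) ** 2).sum(axis=1)
    kernel = np.exp(-(dists - dists.min()) / tau)
    alpha = (1 - eps) * kernel / kernel.sum() + eps * w   # contaminated attention weights
    mean_path = alpha @ path_lengths
    c = 2.0 * (np.log(255) + 0.5772) - 2.0           # approx. normalizer for 256-sample trees
    return 2.0 ** (-mean_path / c)                   # standard isolation-forest score

print(attention_score(x, tree_keys, path_lengths))
```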
arXiv Detail & Related papers (2022-10-05T20:58:57Z) - AGBoost: Attention-based Modification of Gradient Boosting Machine [0.0]
A new attention-based model for the gradient boosting machine (GBM) called AGBoost is proposed for solving regression problems.
The main idea behind the proposed AGBoost model is to assign attention weights with trainable parameters to iterations of GBM.
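A rough sketch of weighting the iterations of a gradient boosting machine, assuming squared-error loss so that the prediction decomposes into per-iteration tree contributions. The softmax over the magnitude of each iteration's correction is purely illustrative; AGBoost attaches trainable attention weights to the iterations instead.

```python
# Illustration only: the softmax weighting and the rescaling by the number of
# iterations are assumptions, not the trainable weights AGBoost actually learns.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=300, n_features=5, noise=5.0, random_state=0)
gbm = GradientBoostingRegressor(n_estimators=50, learning_rate=0.1,
                                random_state=0).fit(X, y)

def attention_gbm_predict(x, gbm, tau=1.0):
    """Reweight per-iteration tree contributions with instance-dependent weights."""
    x = x.reshape(1, -1)
    base = gbm.init_.predict(x)[0]                            # initial constant prediction
    contribs = np.array([gbm.learning_rate * est[0].predict(x)[0]
                         for est in gbm.estimators_])         # one tree per iteration
    scores = np.abs(contribs) / tau                           # illustrative "attention" scores
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()
    # Rescale by the number of iterations so the weighted sum stays comparable
    # to the ordinary (unweighted) boosting sum.
    return base + len(contribs) * float(alpha @ contribs)

print(attention_gbm_predict(X[0], gbm), gbm.predict(X[:1])[0])
```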
arXiv Detail & Related papers (2022-07-12T17:42:20Z) - An Approximation Method for Fitted Random Forests [0.0]
We study methods that approximate each fitted tree in the Random Forests model using the multinomial allocation of the data points to the leaves.
Specifically, we begin by studying whether fitting a multinomial logistic regression helps reduce the size while preserving the prediction quality.
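One way to read the summary above, sketched under assumptions: replace a fitted tree's routing logic with a multinomial logistic regression that predicts which leaf an instance lands in, then answer with that leaf's mean target. Whether this matches the paper's exact construction cannot be verified from the abstract alone.

```python
# A sketch of one possible reading of the abstract, not the paper's exact method.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, n_features=5, noise=5.0, random_state=0)
tree = DecisionTreeRegressor(max_leaf_nodes=8, random_state=0).fit(X, y)

leaf_ids = tree.apply(X)                                      # leaf index of each point
leaf_values = {leaf: y[leaf_ids == leaf].mean() for leaf in np.unique(leaf_ids)}

# Multinomial logistic regression approximating the tree's allocation of points to leaves
allocator = LogisticRegression(max_iter=1000).fit(X, leaf_ids)

def approx_predict(Xq):
    """Predict with the leaf mean of the leaf the logistic model assigns."""
    return np.array([leaf_values[leaf] for leaf in allocator.predict(Xq)])

print(approx_predict(X[:3]), tree.predict(X[:3]))
```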
arXiv Detail & Related papers (2022-07-05T17:28:52Z) - Attention-based Random Forest and Contamination Model [5.482532589225552]
The main idea behind the proposed ABRF models is to assign attention weights with trainable parameters to decision trees in a specific way.
The weights depend on the distance between an instance that falls into a corresponding leaf of a tree and the instances that fall into the same leaf.
arXiv Detail & Related papers (2022-01-08T19:35:57Z) - SparseBERT: Rethinking the Importance Analysis in Self-attention [107.68072039537311]
Transformer-based models are popular for natural language processing (NLP) tasks due to their powerful capacity.
Attention map visualization of a pre-trained model is one direct method for understanding the self-attention mechanism.
We propose a Differentiable Attention Mask (DAM) algorithm, which can also be applied to guide the design of SparseBERT.
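A generic sketch of a differentiable attention mask in this spirit; the sigmoid relaxation and the sparsity term are assumptions rather than the DAM algorithm itself: learnable logits gate each attention position before the softmax.

```python
# Generic sketch: the sigmoid gate and mean-gate penalty are assumptions, not DAM.
import numpy as np

rng = np.random.default_rng(0)
n = 6                                         # sequence length
scores = rng.normal(size=(n, n))              # raw attention scores, e.g. QK^T / sqrt(d)
mask_logits = rng.normal(size=(n, n))         # learnable logits, one per attention position

def masked_attention(scores, mask_logits):
    """Apply a soft, differentiable mask to the scores before the softmax."""
    gate = 1.0 / (1.0 + np.exp(-mask_logits))             # sigmoid relaxation of a 0/1 mask
    masked = scores + np.log(gate + 1e-9)                 # gate near 0 suppresses a position
    w = np.exp(masked - masked.max(axis=-1, keepdims=True))
    return w / w.sum(axis=-1, keepdims=True)

gate = 1.0 / (1.0 + np.exp(-mask_logits))
sparsity_penalty = gate.mean()                # an L1-style term that pushes gates toward zero
print(masked_attention(scores, mask_logits).round(2), round(float(sparsity_penalty), 3))
```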
arXiv Detail & Related papers (2021-02-25T14:13:44Z) - Anomaly Detection of Time Series with Smoothness-Inducing Sequential Variational Auto-Encoder [59.69303945834122]
We present a Smoothness-Inducing Sequential Variational Auto-Encoder (SISVAE) model for robust estimation and anomaly detection of time series.
Our model parameterizes mean and variance for each time-stamp with flexible neural networks.
We show the effectiveness of our model on both synthetic datasets and public real-world benchmarks.
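A toy sketch of the two ingredients named above, a per-time-stamp Gaussian whose mean and variance come from a (here trivially linear) network, plus a smoothness penalty on consecutive means; it omits the sequential VAE machinery entirely, so it only illustrates the shape of the loss, not the SISVAE model.

```python
# Toy illustration: the linear "network" and the penalty weight lam are assumptions.
import numpy as np

rng = np.random.default_rng(0)
T, d = 50, 1
x = np.sin(np.linspace(0, 6, T)).reshape(T, d) + 0.1 * rng.normal(size=(T, d))

W_mu = rng.normal(scale=0.1, size=(d, d))        # stand-in for the mean network
W_logvar = rng.normal(scale=0.1, size=(d, d))    # stand-in for the variance network

mu = x @ W_mu                                    # per-time-stamp mean
log_var = x @ W_logvar                           # per-time-stamp log-variance

def loss(x, mu, log_var, lam=1.0):
    """Gaussian negative log-likelihood plus a smoothness-inducing penalty."""
    nll = 0.5 * (log_var + (x - mu) ** 2 / np.exp(log_var)).sum()
    smooth = ((mu[1:] - mu[:-1]) ** 2).sum()     # penalize jumps between consecutive means
    return float(nll + lam * smooth)

print(loss(x, mu, log_var))
```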
arXiv Detail & Related papers (2021-02-02T06:15:15Z) - Bayesian Attention Modules [65.52970388117923]
We propose a scalable version of attention that is easy to implement and optimize.
Our experiments show the proposed method brings consistent improvements over the corresponding baselines.
arXiv Detail & Related papers (2020-10-20T20:30:55Z) - Censored Quantile Regression Forest [81.9098291337097]
We develop a new estimating equation that adapts to censoring and leads to the quantile score whenever the data do not exhibit censoring.
The proposed procedure, named censored quantile regression forest, allows us to estimate quantiles of the time-to-event without any parametric modeling assumption.
arXiv Detail & Related papers (2020-01-08T23:20:23Z)