Neural Attention Forests: Transformer-Based Forest Improvement
- URL: http://arxiv.org/abs/2304.05980v1
- Date: Wed, 12 Apr 2023 17:01:38 GMT
- Title: Neural Attention Forests: Transformer-Based Forest Improvement
- Authors: Andrei V. Konstantinov, Lev V. Utkin, Alexey A. Lukashin, Vladimir A.
Muliukha
- Abstract summary: The main idea behind the proposed NAF model is to introduce the attention mechanism into the random forest.
In contrast to the available models like the attention-based random forest, the attention weights and the Nadaraya-Watson regression are represented in the form of neural networks.
The combination of the random forest and neural networks implementing the attention mechanism forms a transformer for enhancing the forest predictions.
- Score: 4.129225533930966
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A new approach called NAF (the Neural Attention Forest) for solving
regression and classification tasks on tabular training data is proposed.
The main idea behind the proposed NAF model is to introduce the attention
mechanism into the random forest by assigning attention weights calculated by
neural networks of a specific form to data in leaves of decision trees and to
the random forest itself in the framework of the Nadaraya-Watson kernel
regression. In contrast to the available models like the attention-based random
forest, the attention weights and the Nadaraya-Watson regression are
represented in the form of neural networks whose weights can be regarded as
trainable parameters. The first part of neural networks with shared weights is
trained for all trees and computes attention weights of data in leaves. The
second part aggregates outputs of the tree networks and aims to minimize the
difference between the random forest prediction and the true target value from
a training set. The neural network is trained in an end-to-end manner. The
combination of the random forest and neural networks implementing the attention
mechanism forms a transformer for enhancing the forest predictions. Numerical
experiments with real datasets illustrate the proposed method. The code
implementing the approach is publicly available.
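The abstract describes the architecture concretely enough that a small sketch may help. The following is a minimal, hypothetical PyTorch rendering of the idea, not the authors' published code: a fitted scikit-learn random forest supplies the leaves, a shared network computes Nadaraya-Watson attention weights over the training points in each leaf, and the per-tree estimates are combined with trainable weights. The paper's second part is a full aggregation network; a softmax over trainable per-tree logits stands in for it here, and names such as LeafAttention and NAFSketch are illustrative.
```python
import torch
import torch.nn as nn
from sklearn.ensemble import RandomForestRegressor

class LeafAttention(nn.Module):
    """First part (shared across trees): scores a query against the training points in a leaf."""
    def __init__(self, n_features, hidden=32):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(2 * n_features, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, x, leaf_x, leaf_y):
        # x: (n_features,); leaf_x: (m, n_features); leaf_y: (m,)
        pairs = torch.cat([x.expand(leaf_x.shape[0], -1), leaf_x], dim=1)
        w = torch.softmax(self.score(pairs).squeeze(-1), dim=0)  # attention weights over leaf points
        return (w * leaf_y).sum()                                # Nadaraya-Watson estimate for one tree

class NAFSketch(nn.Module):
    """Forest plus attention networks; the whole module can be trained end-to-end with an MSE loss."""
    def __init__(self, forest, X_train, y_train):
        super().__init__()
        self.forest, self.X, self.y = forest, X_train, y_train
        self.leaf_attn = LeafAttention(X_train.shape[1])
        # Simplified second part: trainable softmax weights over trees instead of a full network.
        self.tree_logits = nn.Parameter(torch.zeros(len(forest.estimators_)))

    def forward(self, x):
        x_np = x.detach().numpy().reshape(1, -1)
        estimates = []
        for tree in self.forest.estimators_:
            leaf = tree.apply(x_np)[0]                 # leaf the instance falls into
            mask = tree.apply(self.X) == leaf          # training points in the same leaf
            estimates.append(self.leaf_attn(
                x,
                torch.tensor(self.X[mask], dtype=torch.float32),
                torch.tensor(self.y[mask], dtype=torch.float32)))
        estimates = torch.stack(estimates)
        return (torch.softmax(self.tree_logits, dim=0) * estimates).sum()

# Usage sketch:
# forest = RandomForestRegressor(n_estimators=50).fit(X_train, y_train)
# model = NAFSketch(forest, X_train, y_train)  # then minimize (model(x) - y)^2 over the training set
```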
Related papers
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z) - SurvBeNIM: The Beran-Based Neural Importance Model for Explaining the
Survival Models [2.4861619769660637]
A new method called the Survival Beran-based Neural Importance Model (SurvBeNIM) is proposed.
It aims to explain predictions of machine learning survival models, which are in the form of survival or cumulative hazard functions.
arXiv Detail & Related papers (2023-12-11T18:54:26Z) - An Initialization Schema for Neuronal Networks on Tabular Data [0.9155684383461983]
We show that a binomial neural network can be used effectively on tabular data.
The proposed schema is a simple but effective way of initializing the first hidden layer in neural networks.
We evaluate our approach on multiple public datasets and showcase the improved performance compared to other neural network-based approaches.
arXiv Detail & Related papers (2023-11-07T13:52:35Z) - Deep Neural Networks Tend To Extrapolate Predictably [51.303814412294514]
Neural network predictions are commonly assumed to be unpredictable and overconfident when faced with out-of-distribution (OOD) inputs.
We observe that neural network predictions often tend towards a constant value as input data becomes increasingly OOD.
We show how one can leverage our insights in practice to enable risk-sensitive decision-making in the presence of OOD inputs.
arXiv Detail & Related papers (2023-10-02T03:25:32Z) - Attention and Self-Attention in Random Forests [5.482532589225552]
New models of random forests jointly using the attention and self-attention mechanisms are proposed.
The self-attention aims to capture dependencies of the tree predictions and to remove noise or anomalous predictions in the random forest.
arXiv Detail & Related papers (2022-07-09T16:15:53Z) - Neural Maximum A Posteriori Estimation on Unpaired Data for Motion
Deblurring [87.97330195531029]
We propose a Neural Maximum A Posteriori (NeurMAP) estimation framework for training neural networks to recover blind motion information and sharp content from unpaired data.
The proposed NeurMAP can be applied to existing deblurring neural networks, and is the first framework that enables training image deblurring networks on unpaired datasets.
arXiv Detail & Related papers (2022-04-26T08:09:47Z) - Attention-based Random Forest and Contamination Model [5.482532589225552]
The main idea behind the proposed ABRF models is to assign attention weights with trainable parameters to decision trees in a specific way.
The weights depend on the distance between an instance that falls into a corresponding leaf of a tree and the instances that fall into the same leaf.
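As a rough illustration of that weighting scheme (a simplified stand-in, not the authors' contamination-model formulation; the temperature `tau` mimics the trainable parameters), one could weight each tree's leaf estimate by a softmax over negative distances:
```python
import numpy as np

def abrf_style_predict(forest, X_train, y_train, x, tau=1.0):
    """Weight each tree's leaf estimate by its distance to the query (tau plays the role of a trainable parameter)."""
    weights, estimates = [], []
    x2 = x.reshape(1, -1)
    for tree in forest.estimators_:
        leaf = tree.apply(x2)[0]
        in_leaf = tree.apply(X_train) == leaf                 # training instances in the same leaf
        estimates.append(y_train[in_leaf].mean())             # the tree's estimate for x
        dist = np.linalg.norm(x - X_train[in_leaf].mean(axis=0))
        weights.append(np.exp(-dist / tau))
    w = np.array(weights) / np.sum(weights)                   # attention weights over trees
    return float(w @ np.array(estimates))
```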
arXiv Detail & Related papers (2022-01-08T19:35:57Z) - Probing Predictions on OOD Images via Nearest Categories [97.055916832257]
We study out-of-distribution (OOD) prediction behavior of neural networks when they classify images from unseen classes or corrupted images.
We introduce a new measure, nearest category generalization (NCG), where we compute the fraction of OOD inputs that are classified with the same label as their nearest neighbor in the training set.
We find that robust networks have consistently higher NCG accuracy than natural training, even when the OOD data is much farther away than the robustness radius.
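The NCG measure as described above can be computed directly; the following short sketch uses illustrative names and assumes a scikit-learn-style classifier:
```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def ncg_accuracy(model, X_train, y_train, X_ood):
    """Fraction of OOD inputs predicted with the same label as their nearest training neighbor."""
    nn_index = NearestNeighbors(n_neighbors=1).fit(X_train)
    _, idx = nn_index.kneighbors(X_ood)            # nearest training point for each OOD input
    nearest_labels = y_train[idx[:, 0]]
    predictions = model.predict(X_ood)             # model's labels on the OOD inputs
    return float(np.mean(predictions == nearest_labels))
```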
arXiv Detail & Related papers (2020-11-17T07:42:27Z) - Neural Networks with Recurrent Generative Feedback [61.90658210112138]
The recurrent generative feedback design is instantiated on convolutional neural networks (CNNs), yielding the CNN-F model.
In the experiments, CNN-F shows considerably improved adversarial robustness over conventional feedforward CNNs on standard benchmarks.
arXiv Detail & Related papers (2020-07-17T19:32:48Z) - Pre-Trained Models for Heterogeneous Information Networks [57.78194356302626]
We propose a self-supervised pre-training and fine-tuning framework, PF-HIN, to capture the features of a heterogeneous information network.
PF-HIN consistently and significantly outperforms state-of-the-art alternatives on the benchmark downstream tasks across four datasets.
arXiv Detail & Related papers (2020-07-07T03:36:28Z) - Neural Random Forest Imitation [24.02961053662835]
We introduce an imitation learning approach by generating training data from a random forest and learning a neural network that imitates its behavior.
This implicit transformation creates very efficient neural networks that learn the decision boundaries of a random forest.
Experiments on several real-world benchmark datasets demonstrate superior performance.
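A minimal sketch of that imitation idea, assuming a simple noise-based data-generation scheme in place of the paper's own procedure (names and hyperparameters are illustrative):
```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier

def imitate_forest(forest: RandomForestClassifier, X_train, n_synthetic=10000, noise=0.05, seed=0):
    """Fit a neural network on forest-labelled synthetic data so it mimics the forest's decision boundaries."""
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(X_train), n_synthetic)
    X_syn = X_train[idx] + noise * rng.standard_normal((n_synthetic, X_train.shape[1]))
    y_syn = forest.predict(X_syn)                  # labels come from the forest, not the original data
    student = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=300)
    student.fit(X_syn, y_syn)
    return student
```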
arXiv Detail & Related papers (2019-11-25T11:04:30Z)