Team up GBDTs and DNNs: Advancing Efficient and Effective Tabular Prediction with Tree-hybrid MLPs
- URL: http://arxiv.org/abs/2407.09790v1
- Date: Sat, 13 Jul 2024 07:13:32 GMT
- Title: Team up GBDTs and DNNs: Advancing Efficient and Effective Tabular Prediction with Tree-hybrid MLPs
- Authors: Jiahuan Yan, Jintai Chen, Qianxing Wang, Danny Z. Chen, Jian Wu
- Abstract summary: Tabular datasets play a crucial role in various applications.
Two prominent model types, Gradient Boosted Decision Trees (GBDTs) and Deep Neural Networks (DNNs), have demonstrated performance advantages on distinct prediction tasks.
This paper proposes a new framework that amalgamates the advantages of both GBDTs and DNNs, resulting in a DNN algorithm that is as efficient as GBDTs and is competitively effective regardless of dataset preferences.
- Score: 20.67800392863432
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tabular datasets play a crucial role in various applications. Thus, developing efficient, effective, and widely compatible prediction algorithms for tabular data is important. Currently, two prominent model types, Gradient Boosted Decision Trees (GBDTs) and Deep Neural Networks (DNNs), have demonstrated performance advantages on distinct tabular prediction tasks. However, selecting an effective model for a specific tabular dataset is challenging, often demanding time-consuming hyperparameter tuning. To address this model selection dilemma, this paper proposes a new framework that amalgamates the advantages of both GBDTs and DNNs, resulting in a DNN algorithm that is as efficient as GBDTs and is competitively effective regardless of dataset preferences for GBDTs or DNNs. Our idea is rooted in the observation that deep learning (DL) offers a parameter space large enough to represent a well-performing GBDT model, yet the current back-propagation optimizer struggles to efficiently discover such optimal functionality. On the other hand, during GBDT development, hard tree pruning, entropy-driven feature gating, and model ensembling have proved more adaptable to tabular data. By combining these key components, we present the Tree-hybrid simple MLP (T-MLP). In our framework, a tensorized, rapidly trained GBDT feature gate, a DNN architecture pruning approach, and a vanilla back-propagation optimizer collaboratively train a randomly initialized MLP model. Comprehensive experiments show that T-MLP is competitive with extensively tuned DNNs and GBDTs on their respective dominant tabular benchmarks (88 datasets in total), all achieved with compact model storage and significantly reduced training time.
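The recipe is concrete enough to caricature in a few lines. The sketch below is a hedged simplification of the described framework, not the authors' released code: a rapidly fitted GBDT supplies a feature gate, and a plain MLP is then trained with vanilla back-propagation on the gated inputs. The real T-MLP additionally prunes the MLP architecture and ensembles models, which is omitted here.
```python
# Hypothetical sketch of the T-MLP recipe: a quickly trained GBDT
# provides a feature gate, then an ordinary MLP is trained with
# vanilla back-propagation on the gated features.
import torch
import torch.nn as nn
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# 1) Rapidly train a small GBDT; approximate the paper's tensorized,
#    entropy-driven feature gate with normalized feature importances.
gbdt = GradientBoostingClassifier(n_estimators=50, max_depth=3).fit(X, y)
gate = torch.tensor(gbdt.feature_importances_, dtype=torch.float32)
gate = gate / gate.max()  # scale gate values into [0, 1]

# 2) A randomly initialized, deliberately simple MLP.
mlp = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))

# 3) Vanilla back-propagation on gated inputs.
opt = torch.optim.AdamW(mlp.parameters(), lr=1e-3)
X_t = torch.tensor(X, dtype=torch.float32) * gate  # apply the feature gate
y_t = torch.tensor(y)
for _ in range(200):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(mlp(X_t), y_t)
    loss.backward()
    opt.step()
```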
Related papers
- In-Context Data Distillation with TabPFN [11.553950697974825]
In-context data distillation (ICD) is a novel methodology that effectively lifts TabPFN's dataset-size constraints by optimizing its context.
ICD efficiently enables TabPFN to handle significantly larger datasets with a fixed memory budget, sidestepping TabPFN's quadratic memory complexity at the cost of a linear number of tuning steps.
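As a rough sketch of the idea, the context examples can be treated as learnable tensors and tuned so that a frozen in-context predictor fits the full training set. The `InContextModel` below is a toy stand-in, not TabPFN's actual architecture or API.
```python
# Illustrative sketch of in-context data distillation: the fixed-size
# context is itself optimized by gradient descent against the full data.
import torch
import torch.nn as nn

class InContextModel(nn.Module):
    """Toy stand-in for TabPFN: predicts from (context_X, context_y, query_X)."""
    def __init__(self, d):
        super().__init__()
        self.key = nn.Linear(d, 32)
        self.query = nn.Linear(d, 32)
    def forward(self, ctx_X, ctx_y, q_X):
        attn = torch.softmax(self.query(q_X) @ self.key(ctx_X).T, dim=-1)
        return attn @ ctx_y  # attention-weighted label vote

d, n_ctx = 10, 16                                # small, fixed context budget
model = InContextModel(d).requires_grad_(False)  # frozen predictor
ctx_X = torch.randn(n_ctx, d, requires_grad=True)
ctx_y = torch.randn(n_ctx, 2, requires_grad=True)

X_full, y_full = torch.randn(5000, d), torch.randint(0, 2, (5000,))
opt = torch.optim.Adam([ctx_X, ctx_y], lr=1e-2)
for _ in range(100):  # the "linear number of tuning steps"
    opt.zero_grad()
    loss = nn.functional.cross_entropy(model(ctx_X, ctx_y, X_full), y_full)
    loss.backward()
    opt.step()
```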
arXiv Detail & Related papers (2024-02-10T15:23:45Z)
- Coevolutionary Algorithm for Building Robust Decision Trees under Minimax Regret [12.72963164625931]
This paper proposes a novel coevolutionary algorithm (CoEvoRDT) designed to create robust decision trees (DTs).
Motivated by the limitations of traditional DT algorithms, we leverage adaptive coevolution to allow DTs to evolve and learn from interactions with perturbed input data.
CoEvoRDT is tested on 20 popular datasets and shows superior performance compared to 4 state-of-the-art algorithms.
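A hedged sketch of the coevolutionary loop follows; the operators, tree representation, and minimax-regret computation are simplified placeholders, not CoEvoRDT's actual design.
```python
# Simplified coevolution: a population of decision trees evolves against
# a population of input perturbations; trees are selected by worst-case
# (minimax-style) accuracy over the current adversary population.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=500, n_features=8, random_state=0)

trees = [DecisionTreeClassifier(max_depth=d, random_state=i).fit(X, y)
         for i, d in enumerate([2, 3, 4, 5, 6, 8])]
perturbs = [rng.normal(0, 0.3, size=X.shape) for _ in range(10)]

for generation in range(5):
    # Evaluate each tree under its worst perturbation.
    worst = [min(t.score(X + p, y) for p in perturbs) for t in trees]
    survivors = [t for _, t in sorted(zip(worst, trees),
                                      key=lambda z: -z[0])][:3]
    # "Mutate" survivors by refitting on perturbed data; refresh half of
    # the adversary population with new random perturbations.
    trees = survivors + [DecisionTreeClassifier(
        max_depth=t.get_depth() + 1).fit(X + perturbs[0], y)
        for t in survivors]
    perturbs = perturbs[:5] + [rng.normal(0, 0.3, size=X.shape)
                               for _ in range(5)]

best = max(trees, key=lambda t: min(t.score(X + p, y) for p in perturbs))
```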
arXiv Detail & Related papers (2023-12-14T16:12:22Z)
- REST: Enhancing Group Robustness in DNNs through Reweighted Sparse Training [49.581884130880944]
Deep neural networks (DNNs) have proven effective in various domains.
However, they often struggle to perform well on certain minority groups during inference.
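A minimal sketch of the general recipe the title suggests, assuming a static sparsity mask and loss-based group reweighting; this is an illustration, not the REST implementation.
```python
# Sparse training plus group reweighting: only unmasked weights are
# trained, and the loss is upweighted for the worst-performing groups.
import torch
import torch.nn as nn

model = nn.Linear(10, 2)
mask = (torch.rand_like(model.weight) < 0.2).float()  # keep ~20% of weights
model.weight.data *= mask                             # start from a sparse net

X = torch.randn(256, 10)
y = torch.randint(0, 2, (256,))
group = torch.randint(0, 3, (256,))  # hypothetical group labels
group_w = torch.ones(3)              # per-group loss weights

opt = torch.optim.SGD(model.parameters(), lr=0.1)
for _ in range(100):
    opt.zero_grad()
    per_sample = nn.functional.cross_entropy(model(X), y, reduction="none")
    loss = (group_w[group] * per_sample).mean()
    loss.backward()
    model.weight.grad *= mask        # train only the unmasked weights
    opt.step()
    with torch.no_grad():            # upweight the worst-off groups
        for g in range(3):
            group_w[g] = 1.0 + per_sample[group == g].mean()
```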
arXiv Detail & Related papers (2023-12-05T16:27:54Z)
- SPARE: A Single-Pass Neural Model for Relational Databases [36.55513135391452]
We propose SPARE, a new class of neural models that can be trained efficiently on RDBs while providing accuracies similar to those of GNNs.
To enable efficient training, SPARE, unlike GNNs, makes use of the fact that data in RDBs has a regular, predictive structure, which allows these models to be trained in a single pass while simultaneously exploiting symmetries.
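A rough sketch of how a single-pass model can exploit this regular structure, under the assumption that each row's joined tuples can be zero-padded to a fixed-size block; this is a simplification, not SPARE's architecture.
```python
# Because RDB rows follow a regular schema, each row plus a fixed-size
# block of joined tuples can be flattened into one vector and scored by
# a plain feed-forward model in a single pass, instead of message
# passing over a join graph as in GNNs.
import torch
import torch.nn as nn

n_customers, d_cust, d_order, max_orders = 1000, 6, 4, 5
customers = torch.randn(n_customers, d_cust)
# Zero-padded block of joined order tuples per customer.
orders = torch.randn(n_customers, max_orders, d_order)

flat = torch.cat([customers, orders.flatten(1)], dim=1)  # one vector per row
model = nn.Sequential(
    nn.Linear(d_cust + max_orders * d_order, 64), nn.ReLU(),
    nn.Linear(64, 1),
)
scores = model(flat)  # a single forward pass over all rows
```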
arXiv Detail & Related papers (2023-10-20T15:23:17Z)
- When Do Neural Nets Outperform Boosted Trees on Tabular Data? [65.30290020731825]
We take a step back and question the importance of the 'NN vs. GBDT' debate.
For a surprisingly high number of datasets, the performance difference between GBDTs and NNs is negligible.
We analyze dozens of metafeatures to determine what properties of a dataset make NNs or GBDTs better-suited to perform well.
Our insights act as a guide for practitioners to determine which techniques may work best on their dataset.
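For illustration, metafeatures of this kind are cheap to compute; the specific features below are example choices, not the paper's selection or its findings.
```python
# A few simple dataset metafeatures of the sort such studies examine.
import numpy as np
from scipy.stats import skew
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=2000, n_features=15, random_state=0)

metafeatures = {
    "n_samples": X.shape[0],
    "n_features": X.shape[1],
    "samples_per_feature": X.shape[0] / X.shape[1],
    "mean_abs_skew": float(np.mean(np.abs(skew(X, axis=0)))),
    "majority_class_fraction": float(np.bincount(y).max() / len(y)),
}
print(metafeatures)
```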
arXiv Detail & Related papers (2023-05-04T17:04:41Z)
- Towards Robust k-Nearest-Neighbor Machine Translation [72.9252395037097]
k-Nearest-Neighbor Machine Translation (kNN-MT) has become an important research direction in NMT in recent years.
Its main idea is to retrieve useful key-value pairs from an additional datastore to modify translations without updating the NMT model.
However, noisy retrieved pairs can dramatically degrade model performance.
We propose a confidence-enhanced kNN-MT model with robust training to alleviate the impact of noise.
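The core kNN-MT decoding rule this work builds on can be sketched as follows; the datastore here is synthetic, and the confidence-enhanced robust training itself is not shown.
```python
# kNN-MT decoding step: interpolate the NMT model's softmax with a
# distribution built from neighbors retrieved from a key-value datastore.
import torch

vocab, d, n_store, k, lam = 100, 32, 5000, 8, 0.5
keys = torch.randn(n_store, d)                # datastore: hidden states ...
values = torch.randint(0, vocab, (n_store,))  # ... and their target tokens

def knn_mt_step(hidden, nmt_logits):
    dists = torch.cdist(hidden[None], keys)[0]    # L2 distances to all keys
    knn_d, knn_i = dists.topk(k, largest=False)   # k nearest neighbors
    # Turn neighbor distances into a distribution over the vocabulary.
    w = torch.softmax(-knn_d, dim=0)
    p_knn = torch.zeros(vocab).index_add_(0, values[knn_i], w)
    p_nmt = torch.softmax(nmt_logits, dim=0)
    return lam * p_knn + (1 - lam) * p_nmt        # interpolated distribution

probs = knn_mt_step(torch.randn(d), torch.randn(vocab))
```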
arXiv Detail & Related papers (2022-10-17T07:43:39Z)
- Recurrent Bilinear Optimization for Binary Neural Networks [58.972212365275595]
Existing binary neural networks (BNNs) neglect the intrinsic bilinear relationship between real-valued weights and scale factors.
Our work is the first attempt to optimize BNNs from the bilinear perspective.
We obtain robust RBONNs, which show impressive performance over state-of-the-art BNNs on various models and datasets.
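The bilinear coupling in question is already visible in the standard straight-through-estimator (STE) baseline, sketched below; RBONN's recurrent bilinear optimization itself is not reproduced here.
```python
# The effective weight of a BNN layer is a product of a real-valued
# scale factor and a sign-binarized weight, so the two interact
# bilinearly during optimization.
import torch

w_real = torch.randn(64, 32, requires_grad=True)  # latent real-valued weights
alpha = torch.ones(64, 1, requires_grad=True)     # per-channel scale factors

def binary_weight():
    w_bin = torch.sign(w_real)
    # Straight-through estimator: forward uses sign, backward passes
    # gradients through as if sign were the identity.
    w_ste = w_real + (w_bin - w_real).detach()
    return alpha * w_ste  # bilinear: scale factor times binary weight

x = torch.randn(8, 32)
out = x @ binary_weight().T
out.sum().backward()      # gradients reach both alpha and w_real
```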
arXiv Detail & Related papers (2022-09-04T06:45:33Z)
- DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models [152.29364079385635]
As pre-trained models grow bigger, the fine-tuning process can be time-consuming and computationally expensive.
We propose a framework for resource- and parameter-efficient fine-tuning by leveraging the sparsity prior in both weight updates and the final model weights.
Our proposed framework, dubbed Dually Sparsity-Embedded Efficient Tuning (DSEE), aims to achieve two key objectives: (i) parameter efficient fine-tuning and (ii) resource-efficient inference.
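A simplified sketch of the dual-sparsity idea, assuming a frozen pretrained weight updated by a low-rank plus masked-sparse delta; this is one reading of the summary, not the DSEE codebase.
```python
# Sparsity-embedded fine-tuning: the pretrained weight W0 stays frozen,
# and only a low-rank factor pair plus a sparsely masked matrix are
# trained, keeping both the update and the deployed weights cheap.
import torch
import torch.nn as nn

d_in, d_out, rank = 128, 128, 4
W0 = torch.randn(d_out, d_in)               # frozen pretrained weight
U = nn.Parameter(torch.zeros(d_out, rank))  # low-rank update factors
V = nn.Parameter(torch.randn(rank, d_in) * 0.01)
S = nn.Parameter(torch.zeros(d_out, d_in))  # sparse update component
mask = (torch.rand(d_out, d_in) < 0.01).float()  # ~1% trainable entries

def adapted_linear(x):
    delta = U @ V + S * mask  # low-rank plus sparse weight update
    return x @ (W0 + delta).T

opt = torch.optim.Adam([U, V, S], lr=1e-3)
x, target = torch.randn(16, d_in), torch.randn(16, d_out)
for _ in range(50):
    opt.zero_grad()
    loss = nn.functional.mse_loss(adapted_linear(x), target)
    loss.backward()
    opt.step()
```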
arXiv Detail & Related papers (2021-10-30T03:29:47Z)
- Enhancing Transformers with Gradient Boosted Decision Trees for NLI Fine-Tuning [7.906608953906889]
We introduce FreeGBDT, a method of fitting a GBDT head on the features computed during fine-tuning to increase performance without additional computation by the neural network.
We demonstrate the effectiveness of our method on several NLI datasets using a strong baseline model.
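The recipe is simple enough to sketch, assuming the pooled features were already logged during fine-tuning; the feature extractor below is synthetic, and the GBDT implementation is one plausible choice rather than the paper's.
```python
# FreeGBDT-style head: fit a GBDT on features the network already
# produced during fine-tuning (e.g. pooled [CLS] representations),
# so no extra neural-network computation is needed.
import numpy as np
from sklearn.ensemble import HistGradientBoostingClassifier

rng = np.random.default_rng(0)
# Stand-ins for features logged during fine-tuning: one pooled vector
# per training example, plus its NLI label.
features = rng.normal(size=(2000, 768))  # e.g. [CLS] embeddings
labels = rng.integers(0, 3, size=2000)   # entailment / neutral / contradiction

gbdt_head = HistGradientBoostingClassifier(max_iter=200)
gbdt_head.fit(features, labels)          # "free": reuses logged features
preds = gbdt_head.predict(features[:5])
```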
arXiv Detail & Related papers (2021-05-08T22:31:51Z)
- Deep Time Delay Neural Network for Speech Enhancement with Full Data Learning [60.20150317299749]
This paper proposes a deep time delay neural network (TDNN) for speech enhancement.
To make full use of the training data, we propose a full data learning method for speech enhancement.
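A TDNN layer is, in modern terms, a dilated 1-D convolution over frame-level features; a minimal sketch follows, with illustrative layer sizes and dilations rather than the paper's configuration.
```python
# A TDNN block as a dilated 1-D convolution over speech frames.
import torch
import torch.nn as nn

class TDNNBlock(nn.Module):
    def __init__(self, in_ch, out_ch, kernel, dilation):
        super().__init__()
        self.conv = nn.Conv1d(in_ch, out_ch, kernel, dilation=dilation,
                              padding=dilation * (kernel - 1) // 2)
        self.act = nn.ReLU()
    def forward(self, x):  # x: (batch, features, frames)
        return self.act(self.conv(x))

net = nn.Sequential(
    TDNNBlock(40, 256, kernel=5, dilation=1),   # 40-dim filterbank input
    TDNNBlock(256, 256, kernel=3, dilation=2),  # wider temporal context
    nn.Conv1d(256, 40, 1),                      # per-frame output
)
enhanced = net(torch.randn(8, 40, 200))         # 8 utterances, 200 frames
```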
arXiv Detail & Related papers (2020-11-11T06:32:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.