HyperTab: Hypernetwork Approach for Deep Learning on Small Tabular
Datasets
- URL: http://arxiv.org/abs/2304.03543v2
- Date: Thu, 24 Aug 2023 08:57:54 GMT
- Title: HyperTab: Hypernetwork Approach for Deep Learning on Small Tabular
Datasets
- Authors: Witold Wydmański, Oleksii Bulenok, Marek Śmieja
- Abstract summary: We introduce HyperTab, a hypernetwork-based approach to solving small-sample problems on tabular datasets.
By combining the advantages of Random Forests and neural networks, HyperTab generates an ensemble of neural networks.
We show that HyperTab consistently outperforms other methods on small data and performs comparably to them on larger datasets.
- Score: 3.9870413777302027
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning has achieved impressive performance in many domains, such as
computer vision and natural language processing, but its advantage over
classical shallow methods on tabular datasets remains questionable. It is
especially challenging to surpass the performance of tree-like ensembles, such
as XGBoost or Random Forests, on small-sized datasets (less than 1k samples).
To tackle this challenge, we introduce HyperTab, a hypernetwork-based approach
to solving small sample problems on tabular datasets. By combining the
advantages of Random Forests and neural networks, HyperTab generates an
ensemble of neural networks, where each target model is specialized to process
a specific lower-dimensional view of the data. Since each view plays the role
of data augmentation, we virtually increase the number of training samples
while keeping the number of trainable parameters unchanged, which prevents
model overfitting. We evaluated HyperTab on more than 40 tabular datasets with
varying numbers of samples and domains of origin, and compared its performance
with shallow and deep learning models representing the current
state-of-the-art. We show that HyperTab consistently outperforms other methods
on small data (with a statistically significant difference) and performs
comparably to them on larger datasets.
We provide a Python package with the code at
https://pypi.org/project/hypertab/
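The mechanism described in the abstract lends itself to a short sketch. The snippet below is an illustrative PyTorch rendition of that idea, not the hypertab package's API; the class name, layer sizes, and binary view encoding are assumptions made for the example. A hypernetwork maps the membership mask of a random feature subset (a view) to the full weight vector of a small target network, and the ensemble prediction is the average over views, so only the hypernetwork's parameters are trained no matter how many views (augmented samples) are used.

```python
import torch
import torch.nn as nn

class HyperTabSketch(nn.Module):
    """Minimal sketch of a hypernetwork-generated ensemble over feature views."""

    def __init__(self, n_features, n_classes, view_size, n_views, hidden=64):
        super().__init__()
        # Each view is a random subset of feature indices (a lower-dimensional view of the data).
        self.views = [torch.randperm(n_features)[:view_size] for _ in range(n_views)]
        # Target network layout: view_size -> hidden -> n_classes. The hypernetwork
        # emits every weight and bias of this network from a binary view mask.
        self.target_shapes = [(hidden, view_size), (hidden,), (n_classes, hidden), (n_classes,)]
        n_target = sum(int(torch.tensor(s).prod()) for s in self.target_shapes)
        self.hypernet = nn.Sequential(
            nn.Linear(n_features, 128), nn.ReLU(), nn.Linear(128, n_target)
        )
        self.n_features = n_features

    def _target_forward(self, x_view, flat):
        # Unpack the flat parameter vector produced by the hypernetwork.
        params, i = [], 0
        for shape in self.target_shapes:
            n = int(torch.tensor(shape).prod())
            params.append(flat[i:i + n].reshape(shape))
            i += n
        w1, b1, w2, b2 = params
        h = torch.relu(x_view @ w1.T + b1)
        return h @ w2.T + b2

    def forward(self, x):
        # Ensemble: one hypernetwork-generated target network per view, predictions averaged.
        logits = []
        for idx in self.views:
            mask = torch.zeros(self.n_features)
            mask[idx] = 1.0                   # binary encoding of the feature subset
            flat = self.hypernet(mask)        # weights of the view-specific target network
            logits.append(self._target_forward(x[:, idx], flat))
        return torch.stack(logits).mean(dim=0)
```

Only the hypernetwork's weights are optimized during training, which matches the property the abstract attributes to HyperTab: adding more views enlarges the effective training set without adding trainable parameters.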
Related papers
- TabDiff: a Multi-Modal Diffusion Model for Tabular Data Generation [91.50296404732902]
We introduce TabDiff, a joint diffusion framework that models all multi-modal distributions of tabular data in one model.
Our key innovation is the development of a joint continuous-time diffusion process for numerical and categorical data.
TabDiff achieves superior average performance over existing competitive baselines, with up to 22.5% improvement over the state-of-the-art model on pairwise column correlation estimations.
arXiv Detail & Related papers (2024-10-27T22:58:47Z)
- In-Context Data Distillation with TabPFN [11.553950697974825]
In-context data distillation (ICD) is a novel methodology that effectively eliminates these constraints by optimizing TabPFN's context.
ICD efficiently enables TabPFN to handle significantly larger datasets with a fixed memory budget, mitigating TabPFN's quadratic memory complexity at the cost of a linear number of tuning steps.
arXiv Detail & Related papers (2024-02-10T15:23:45Z)
- MambaTab: A Plug-and-Play Model for Learning Tabular Data [13.110156202816112]
This work introduces an innovative approach based on a structured state-space model (SSM), MambaTab, for tabular data.
MambaTab delivers superior performance while requiring significantly fewer parameters, as empirically validated on diverse benchmark datasets.
arXiv Detail & Related papers (2024-01-16T22:44:12Z)
- MotherNet: A Foundational Hypernetwork for Tabular Classification [1.9643748953805937]
We propose a hypernetwork architecture that we call MotherNet, trained on millions of classification tasks.
MotherNet replaces training on specific datasets with in-context learning through a single forward pass.
The child network generated by MotherNet using in-context learning outperforms neural networks trained using gradient descent on small datasets.
arXiv Detail & Related papers (2023-12-14T01:48:58Z)
- Training-Free Generalization on Heterogeneous Tabular Data via Meta-Representation [67.30538142519067]
We propose Tabular data Pre-Training via Meta-representation (TabPTM).
A deep neural network is then trained to associate these meta-representations with dataset-specific classification confidences.
Experiments validate that TabPTM achieves promising performance in new datasets, even under few-shot scenarios.
arXiv Detail & Related papers (2023-10-31T18:03:54Z)
- When Do Neural Nets Outperform Boosted Trees on Tabular Data? [65.30290020731825]
We take a step back and question the importance of the 'NN vs. GBDT' debate.
For a surprisingly high number of datasets, the performance difference between GBDTs and NNs is negligible.
We analyze dozens of metafeatures to determine what properties of a dataset make NNs or GBDTs better-suited to perform well.
Our insights act as a guide for practitioners to determine which techniques may work best on their dataset.
arXiv Detail & Related papers (2023-05-04T17:04:41Z)
- STUNT: Few-shot Tabular Learning with Self-generated Tasks from Unlabeled Tables [64.0903766169603]
We propose a framework for few-shot semi-supervised learning, coined Self-generated Tasks from UNlabeled Tables (STUNT).
Our key idea is to self-generate diverse few-shot tasks by treating randomly chosen columns as a target label.
We then employ a meta-learning scheme to learn generalizable knowledge with the constructed tasks (a minimal sketch of this task-generation step appears after this list).
arXiv Detail & Related papers (2023-03-02T02:37:54Z)
- Why do tree-based models still outperform deep learning on tabular data? [0.0]
We show that tree-based models remain state-of-the-art on medium-sized data.
We conduct an empirical investigation into the differing inductive biases of tree-based models and Neural Networks (NNs).
arXiv Detail & Related papers (2022-07-18T08:36:08Z)
- TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second [48.87527918630822]
We present TabPFN, a trained Transformer that can do supervised classification for small datasets in less than a second.
TabPFN performs in-context learning (ICL): it learns to make predictions using sequences of labeled examples.
We show that our method clearly outperforms boosted trees and performs on par with complex state-of-the-art AutoML systems with up to a 230× speedup.
arXiv Detail & Related papers (2022-07-05T07:17:43Z)
- Transfer Learning with Deep Tabular Models [66.67017691983182]
We show that upstream data gives tabular neural networks a decisive advantage over GBDT models.
We propose a realistic medical diagnosis benchmark for tabular transfer learning.
We propose a pseudo-feature method for cases where the upstream and downstream feature sets differ.
arXiv Detail & Related papers (2022-06-30T14:24:32Z)
- Hopular: Modern Hopfield Networks for Tabular Data [5.470026407471584]
We suggest "Hopular", a novel Deep Learning architecture for medium- and small-sized datasets.
Hopular uses stored data to identify feature-feature, feature-target, and sample-sample dependencies.
In experiments on small-sized datasets with less than 1,000 samples, Hopular surpasses Gradient Boosting, Random Forests, SVMs, and in particular several Deep Learning methods.
arXiv Detail & Related papers (2022-06-01T17:57:44Z)
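The STUNT entry above describes a concrete task-generation step. As a rough illustration of that idea (a NumPy sketch with assumed function and parameter names, not the authors' code), a pseudo few-shot task can be built from an unlabeled table by picking a random column, quantile-binning it into pseudo-classes, and sampling a support/query split; a meta-learner is then trained across many such tasks.

```python
import numpy as np

def self_generated_task(X, n_way=2, k_shot=5, n_query=5, seed=None):
    """Build one pseudo few-shot task from an unlabeled table X (rows x columns).

    Sketch assumption: every pseudo-class ends up with at least k_shot + n_query rows.
    """
    rng = np.random.default_rng(seed)
    col = rng.integers(X.shape[1])              # randomly chosen column acts as the pseudo-label
    # Discretize the chosen column into n_way pseudo-classes via quantile binning.
    edges = np.quantile(X[:, col], np.linspace(0, 1, n_way + 1)[1:-1])
    pseudo_y = np.digitize(X[:, col], edges)
    feats = np.delete(X, col, axis=1)           # the remaining columns are the inputs
    support_x, support_y, query_x, query_y = [], [], [], []
    for c in range(n_way):
        idx = rng.permutation(np.where(pseudo_y == c)[0])
        support_x.append(feats[idx[:k_shot]])
        support_y += [c] * k_shot
        query_x.append(feats[idx[k_shot:k_shot + n_query]])
        query_y += [c] * n_query
    return (np.vstack(support_x), np.array(support_y),
            np.vstack(query_x), np.array(query_y))
```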