TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second
- URL: http://arxiv.org/abs/2207.01848v6
- Date: Sat, 16 Sep 2023 09:33:32 GMT
- Title: TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second
- Authors: Noah Hollmann, Samuel Müller, Katharina Eggensperger, Frank Hutter
- Abstract summary: We present TabPFN, a trained Transformer that can do supervised classification for small datasets in less than a second.
TabPFN performs in-context learning (ICL): it learns to make predictions using sequences of labeled examples.
We show that our method clearly outperforms boosted trees and performs on par with complex state-of-the-art AutoML systems with up to 230$\times$ speedup.
- Score: 48.87527918630822
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present TabPFN, a trained Transformer that can do supervised
classification for small tabular datasets in less than a second, needs no
hyperparameter tuning and is competitive with state-of-the-art classification
methods. TabPFN performs in-context learning (ICL): it learns to make
predictions using sequences of labeled examples (x, f(x)) given in the input,
without requiring further parameter updates. TabPFN is entirely contained in the
weights of our network, which accepts training and test samples as a set-valued
input and yields predictions for the entire test set in a single forward pass.
TabPFN is a Prior-Data Fitted Network (PFN) and is trained offline once, to
approximate Bayesian inference on synthetic datasets drawn from our prior. This
prior incorporates ideas from causal reasoning: It entails a large space of
structural causal models with a preference for simple structures. On the 18
datasets in the OpenML-CC18 suite that contain up to 1 000 training data
points, up to 100 purely numerical features without missing values, and up to
10 classes, we show that our method clearly outperforms boosted trees and
performs on par with complex state-of-the-art AutoML systems with up to
230$\times$ speedup. This increases to a 5 700$\times$ speedup when using a
GPU. We also validate these results on an additional 67 small numerical
datasets from OpenML. We provide all our code, the trained TabPFN, an
interactive browser demo and a Colab notebook at
https://github.com/automl/TabPFN.
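
The repository exposes TabPFN through a scikit-learn-style classifier. Below is a minimal usage sketch following the interface described in the repository's README; argument names such as N_ensemble_configurations reflect the released package and may differ between versions.

```python
# Minimal usage sketch of the released TabPFN package (scikit-learn-style API).
# Constructor arguments follow the repository's README; exact names may vary
# between package versions.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from tabpfn import TabPFNClassifier

# A small, purely numerical dataset, well within TabPFN's intended regime.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# No hyperparameter tuning: the pretrained Transformer is used as-is.
clf = TabPFNClassifier(device="cpu", N_ensemble_configurations=32)
clf.fit(X_train, y_train)            # stores the training set as the ICL context
proba = clf.predict_proba(X_test)    # one forward pass over the whole test set
print("test accuracy:", (proba.argmax(axis=1) == y_test).mean())
```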
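
For intuition about how such a network is obtained, the following is a toy sketch of the PFN-style offline pretraining the abstract describes: repeatedly draw a synthetic dataset from a prior, split it into a labeled context and held-out queries, and train a Transformer to predict the query labels in a single forward pass. Everything here is illustrative rather than the authors' code: sample_prior_dataset and ToyPFN are hypothetical stand-ins, the random linear mechanism replaces the structural-causal-model prior, and the plain encoder omits the attention masking and architectural details of the real model.

```python
# Toy sketch of PFN-style pretraining (illustrative only, not the authors' code).
import torch
import torch.nn as nn

def sample_prior_dataset(n_samples=64, n_features=10, n_classes=3):
    """Stand-in prior: a random linear mechanism. The real TabPFN prior samples
    structural causal models with a preference for simple structures."""
    X = torch.randn(n_samples, n_features)
    w = torch.randn(n_features, n_classes)
    y = (X @ w).argmax(dim=-1)
    return X, y

class ToyPFN(nn.Module):
    def __init__(self, n_features=10, n_classes=3, d_model=128):
        super().__init__()
        self.embed_x = nn.Linear(n_features, d_model)
        self.embed_y = nn.Embedding(n_classes, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, X_ctx, y_ctx, X_qry):
        ctx = self.embed_x(X_ctx) + self.embed_y(y_ctx)  # labeled context tokens
        qry = self.embed_x(X_qry)                        # unlabeled query tokens
        h = self.encoder(torch.cat([ctx, qry], dim=1))
        return self.head(h[:, ctx.shape[1]:])            # logits for queries only

model = ToyPFN()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Offline pretraining loop over synthetic tasks drawn from the prior.
for step in range(100):
    X, y = sample_prior_dataset()
    cut = X.shape[0] // 2  # first half: labeled context; second half: queries
    logits = model(X[:cut].unsqueeze(0), y[:cut].unsqueeze(0), X[cut:].unsqueeze(0))
    loss = loss_fn(logits.squeeze(0), y[cut:])
    opt.zero_grad()
    loss.backward()
    opt.step()
```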
Related papers
- Tokenize features, enhancing tables: the FT-TABPFN model for tabular classification [13.481699494376809]
FT-TabPFN is an enhanced version of TabPFN that includes a novel Feature Tokenization layer to better handle classification features.
Our full source code is available for community use and development.
arXiv Detail & Related papers (2024-06-11T02:13:46Z)
- Why In-Context Learning Transformers are Tabular Data Classifiers [22.33649426762373]
We show that ICL-transformers acquire the ability to create complex decision boundaries during pretraining.
We create TabForestPFN, the ICL-transformer pretrained on both the original TabPFN synthetic dataset generator and our forest dataset generator.
arXiv Detail & Related papers (2024-05-22T07:13:55Z)
- Interpretable Machine Learning for TabPFN [5.012821694203072]
The TabPFN model is able to achieve state-of-the-art performance on a variety of classification tasks.
By taking advantage of the unique properties of the model, our adaptations allow for more efficient computations.
arXiv Detail & Related papers (2024-03-16T13:35:15Z)
- TuneTables: Context Optimization for Scalable Prior-Data Fitted Networks [90.00817095558094]
Prior-data fitted networks (PFNs) make use of pretraining and in-context learning to achieve strong performance on new tasks in a single forward pass.
We introduce TuneTables, a parameter-efficient fine-tuning strategy for PFNs that compresses large datasets into a smaller learned context.
We show that TuneTables can serve as an interpretability tool and can even mitigate biases by optimizing a fairness objective.
arXiv Detail & Related papers (2024-02-17T00:02:23Z)
- In-Context Data Distillation with TabPFN [11.553950697974825]
In-context data distillation (ICD) is a methodology that removes TabPFN's data-size constraints by optimizing its context.
ICD enables TabPFN to handle significantly larger datasets within a fixed memory budget, mitigating TabPFN's quadratic memory complexity at the cost of a linear number of tuning steps.
arXiv Detail & Related papers (2024-02-10T15:23:45Z)
- MotherNet: A Foundational Hypernetwork for Tabular Classification [1.9643748953805937]
We propose a hypernetwork architecture that we call MotherNet, trained on millions of classification tasks.
MotherNet replaces training on specific datasets with in-context learning through a single forward pass.
The child network generated by MotherNet using in-context learning outperforms neural networks trained using gradient descent on small datasets.
arXiv Detail & Related papers (2023-12-14T01:48:58Z)
- Training-Free Generalization on Heterogeneous Tabular Data via Meta-Representation [67.30538142519067]
We propose Tabular data Pre-Training via Meta-representation (TabPTM).
A deep neural network is then trained to associate these meta-representations with dataset-specific classification confidences.
Experiments validate that TabPTM achieves promising performance in new datasets, even under few-shot scenarios.
arXiv Detail & Related papers (2023-10-31T18:03:54Z)
- Transformers Can Do Bayesian Inference [56.99390658880008]
We present Prior-Data Fitted Networks (PFNs).
PFNs leverage in-context learning in large-scale machine learning models to approximate a large set of posteriors.
We demonstrate that PFNs can near-perfectly mimic Gaussian processes and also enable efficient Bayesian inference for intractable problems.
arXiv Detail & Related papers (2021-12-20T13:07:39Z)
- Benchmarking Multimodal AutoML for Tabular Data with Text Fields [83.43249184357053]
We assemble 18 multimodal data tables that each contain some text fields.
Our benchmark enables researchers to evaluate their own methods for supervised learning with numeric, categorical, and text features.
arXiv Detail & Related papers (2021-11-04T09:29:16Z)
- Fast, Accurate, and Simple Models for Tabular Data via Augmented Distillation [97.42894942391575]
We propose FAST-DAD to distill arbitrarily complex ensemble predictors into individual models like boosted trees, random forests, and deep networks.
Our individual distilled models are over 10x faster and more accurate than ensemble predictors produced by AutoML tools like H2O/AutoSklearn.
arXiv Detail & Related papers (2020-06-25T09:57:47Z)