Related papers: TuneTables: Context Optimization for Scalable Prior-Data Fitted Networks

URL: http://arxiv.org/abs/2402.11137v3
Date: Mon, 21 Oct 2024 16:48:06 GMT
Title: TuneTables: Context Optimization for Scalable Prior-Data Fitted Networks
Authors: Benjamin Feuer, Robin Tibor Schirrmeister, Valeriia Cherepanova, Chinmay Hegde, Frank Hutter, Micah Goldblum, Niv Cohen, Colin White
Abstract summary: Prior-data fitted networks (PFNs) make use of pretraining and in-context learning to achieve strong performance on new tasks in a single forward pass. We introduce TuneTables, a parameter-efficient fine-tuning strategy for PFNs that compresses large datasets into a smaller learned context. We show that TuneTables can be used as an interpretability tool and can even be used to mitigate biases by optimizing a fairness objective.
Score: 90.00817095558094
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: While tabular classification has traditionally relied on from-scratch training, a recent breakthrough called prior-data fitted networks (PFNs) challenges this approach. Similar to large language models, PFNs make use of pretraining and in-context learning to achieve strong performance on new tasks in a single forward pass. However, current PFNs have limitations that prohibit their widespread adoption. Notably, TabPFN achieves very strong performance on small tabular datasets but is not designed to make predictions for datasets of size larger than 1000. In this work, we overcome these limitations and substantially improve the performance of PFNs via context optimization. We introduce TuneTables, a parameter-efficient fine-tuning strategy for PFNs that compresses large datasets into a smaller learned context. We conduct extensive experiments on 19 algorithms over 98 datasets and find that TuneTables achieves the best performance on average, outperforming boosted trees such as CatBoost, while optimizing fewer than 5% of TabPFN's parameters. Furthermore, we show that TuneTables can be used as an interpretability tool and can even be used to mitigate biases by optimizing a fairness objective. We open-source our code and raw results at https://github.com/penfever/TuneTables.

Related papers

Real-TabPFN: Improving Tabular Foundation Models via Continued Pre-training With Real-World Data [38.08600450054975]
We show that this performance can be significantly boosted by a targeted continued pre-training phase. We demonstrate that leveraging a small, curated collection of large, real-world datasets for continued pre-training yields superior predictive downstream accuracy. Our resulting model, Real-TabPFN, achieves substantial performance gains on 29 datasets from the OpenML AutoML Benchmark.
arXiv Detail & Related papers (2025-07-05T09:39:07Z)
Prior-Fitted Networks Scale to Larger Datasets When Treated as Weak Learners [82.72552644267724]
BoostPFN can outperform standard PFNs given the same number of training samples on large datasets. High performance is maintained for up to 50x the pre-training size of PFNs.
arXiv Detail & Related papers (2025-03-03T07:31:40Z)
Transformers Boost the Performance of Decision Trees on Tabular Data across Sample Sizes [135.68092471784516]
We propose a simple and lightweight approach for fusing large language models and gradient-boosted decision trees. We name our fusion methods LLM-Boost and PFN-Boost, respectively. We demonstrate state-of-the-art performance against numerous baselines and ensembling algorithms.
arXiv Detail & Related papers (2025-02-04T19:30:41Z)
TabPFN Unleashed: A Scalable and Effective Solution to Tabular Classification Problems [30.597696775364447]
TabPFN has emerged as a promising in-context learning model. It is capable of directly predicting the labels of test samples given labeled training examples. It has demonstrated competitive performance, particularly on small-scale classification tasks.
arXiv Detail & Related papers (2025-02-04T17:49:44Z)
Tokenize features, enhancing tables: the FT-TABPFN model for tabular classification [13.481699494376809]
FT-TabPFN is an enhanced version of TabPFN that includes a novel Feature Tokenization layer to better handle classification features. Our full source code is available for community use and development.
arXiv Detail & Related papers (2024-06-11T02:13:46Z)
Retrieval & Fine-Tuning for In-Context Tabular Models [16.668695961462827]
Recent advancements using transformer-based in-context learning have shown promise on smaller and less complex datasets, but have struggled to scale to larger and more complex ones. We propose a combination of retrieval and fine-tuning: we can adapt the transformer to a local subset of the data by collecting nearest neighbours, and then perform task-specific fine-tuning with this retrieved set of neighbours in context. We show a significant boost in performance compared to the base in-context model.
arXiv Detail & Related papers (2024-06-07T18:43:33Z)
Interpretable Machine Learning for TabPFN [5.012821694203072]
The TabPFN model is able to achieve state-of-the-art performance on a variety of classification tasks. By taking advantage of the unique properties of the model, our adaptations allow for more efficient computations.
arXiv Detail & Related papers (2024-03-16T13:35:15Z)
In-Context Data Distillation with TabPFN [11.553950697974825]
In-context data distillation (ICD) is a novel methodology that effectively eliminates these constraints by optimizing TabPFN's context. ICD efficiently enables TabPFN to handle significantly larger datasets with a fixed memory budget, improving TabPFN's quadratic memory complexity but at the cost of a linear number of tuning steps.
arXiv Detail & Related papers (2024-02-10T15:23:45Z)
When Do Neural Nets Outperform Boosted Trees on Tabular Data? [65.30290020731825]
We take a step back and question the importance of the 'NN vs. GBDT' debate. For a surprisingly high number of datasets, the performance difference between GBDTs and NNs is negligible. We analyze dozens of metafeatures to determine what properties of a dataset make NNs or GBDTs better-suited to perform well. Our insights act as a guide for practitioners to determine which techniques may work best on their dataset.
arXiv Detail & Related papers (2023-05-04T17:04:41Z)
Prior-mean-assisted Bayesian optimization application on FRIB Front-End tunning [61.78406085010957]
We exploit a neural network model trained over historical data as a prior mean of BO for FRIB Front-End tuning.
arXiv Detail & Related papers (2022-11-11T18:34:15Z)
TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second [48.87527918630822]
We present TabPFN, a trained Transformer that can do supervised classification for small datasets in less than a second. TabPFN performs in-context learning (ICL): it learns to make predictions using sequences of labeled examples. We show that our method clearly outperforms boosted trees and performs on par with complex state-of-the-art AutoML systems with up to 230$\times$ speedup.
arXiv Detail & Related papers (2022-07-05T07:17:43Z)
Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning [81.3514358542452]
Few-shot in-context learning (ICL) incurs substantial computational, memory, and storage costs because it involves processing all of the training examples every time a prediction is made. Parameter-efficient fine-tuning offers an alternative paradigm where a small set of parameters are trained to enable a model to perform the new task. In this paper, we rigorously compare few-shot ICL and parameter-efficient fine-tuning and demonstrate that the latter offers better accuracy as well as dramatically lower computational costs.
arXiv Detail & Related papers (2022-05-11T17:10:41Z)
Adversarial Filters of Dataset Biases [96.090959788952]
Large neural models have demonstrated human-level performance on language and vision benchmarks. Their performance degrades considerably on adversarial or out-of-distribution samples. We propose AFLite, which adversarially filters such dataset biases.
arXiv Detail & Related papers (2020-02-10T21:59:21Z)

This list is automatically generated from the titles and abstracts of the papers on this site.