Retrieval & Fine-Tuning for In-Context Tabular Models
- URL: http://arxiv.org/abs/2406.05207v1
- Date: Fri, 7 Jun 2024 18:43:33 GMT
- Title: Retrieval & Fine-Tuning for In-Context Tabular Models
- Authors: Valentin Thomas, Junwei Ma, Rasa Hosseinzadeh, Keyvan Golestan, Guangwei Yu, Maksims Volkovs, Anthony Caterini
- Abstract summary: Recent advancements using transformer-based in-context learning have shown promise on smaller and less complex datasets, but have struggled to scale to larger and more complex ones.
We propose a combination of retrieval and fine-tuning: we can adapt the transformer to a local subset of the data by collecting nearest neighbours, and then perform task-specific fine-tuning with this retrieved set of neighbours in context.
We show a significant boost in performance compared to the base in-context model.
- Score: 16.668695961462827
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Tabular data is a pervasive modality spanning a wide range of domains, and the inherent diversity poses a considerable challenge for deep learning. Recent advancements using transformer-based in-context learning have shown promise on smaller and less complex datasets, but have struggled to scale to larger and more complex ones. To address this limitation, we propose a combination of retrieval and fine-tuning: we can adapt the transformer to a local subset of the data by collecting nearest neighbours, and then perform task-specific fine-tuning with this retrieved set of neighbours in context. Using TabPFN as the base model -- currently the best tabular in-context learner -- and applying our retrieval and fine-tuning scheme on top results in what we call a locally-calibrated PFN, or LoCalPFN. We conduct extensive evaluation on 95 datasets curated by TabZilla from OpenML, upon which we establish a new state-of-the-art with LoCalPFN -- even with respect to tuned tree-based models. Notably, we show a significant boost in performance compared to the base in-context model, demonstrating the efficacy of our approach and advancing the frontier of deep learning in tabular data.
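The retrieval step is easy to sketch. Below is a minimal, hypothetical illustration using scikit-learn's kNN index, with `icl_predict` as a stand-in for TabPFN (the real model's API and the paper's task-specific fine-tuning loop are not reproduced here):

```python
# Sketch of kNN retrieval for in-context prediction. The `icl_predict`
# stand-in is a placeholder for TabPFN, not the authors' implementation.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X_train = rng.normal(size=(1000, 10))        # toy training table
y_train = (X_train[:, 0] > 0).astype(int)    # toy binary labels
X_query = rng.normal(size=(5, 10))           # points to classify

# 1) Retrieval: for each query point, collect its k nearest training rows.
k = 64
index = NearestNeighbors(n_neighbors=k).fit(X_train)
_, nbr_idx = index.kneighbors(X_query)

def icl_predict(X_ctx, y_ctx, x_q):
    """Placeholder for an in-context learner such as TabPFN; here it is
    just a majority vote over the retrieved context labels."""
    return np.bincount(y_ctx).argmax()

# 2) Each query is predicted from its own local context set.
preds = [icl_predict(X_train[idx], y_train[idx], x)
         for idx, x in zip(nbr_idx, X_query)]
print(preds)
```

Fine-tuning, per the abstract, would then backpropagate a standard task loss through the transformer with these retrieved neighbour sets placed in context.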
Related papers
- A Survey on Deep Tabular Learning [0.0]
Tabular data presents unique challenges for deep learning due to its heterogeneous nature and lack of spatial structure.
This survey reviews the evolution of deep learning models for tabular data, from early fully connected networks (FCNs) to advanced architectures like TabNet, SAINT, TabTranSELU, and MambaNet.
arXiv Detail & Related papers (2024-10-15T20:08:08Z)
- A Framework for Fine-Tuning LLMs using Heterogeneous Feedback [69.51729152929413]
We present a framework for fine-tuning large language models (LLMs) using heterogeneous feedback.
First, we combine the heterogeneous feedback data into a single supervision format, compatible with methods like SFT and RLHF.
Next, given this unified feedback dataset, we extract a high-quality and diverse subset to obtain further performance gains.
arXiv Detail & Related papers (2024-08-05T23:20:32Z)
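A rough sketch of the framework's first two steps, with invented record schemas and an arbitrary quality threshold (the paper's actual formats and selection criteria are not given here):

```python
# Hypothetical feedback records: the field names and the 0.5 threshold are
# illustrative assumptions, not the paper's schema.
ratings = [
    {"prompt": "Explain DNS.", "response": "DNS maps names to IPs.", "score": 0.9},
    {"prompt": "Explain DNS.", "response": "idk", "score": 0.1},
]
comparisons = [
    {"prompt": "Sort a list in Python.",
     "chosen": "Use sorted(xs).", "rejected": "Loop forever."},
]

def unify(ratings, comparisons):
    """Map both feedback types into one (prompt, response, quality) format."""
    unified = [(r["prompt"], r["response"], r["score"]) for r in ratings]
    for c in comparisons:
        unified.append((c["prompt"], c["chosen"], 1.0))    # preferred answer
        unified.append((c["prompt"], c["rejected"], 0.0))  # dispreferred answer
    return unified

# Keep only the high-quality slice as a supervised fine-tuning (SFT) set.
sft_subset = [(p, r) for p, r, q in unify(ratings, comparisons) if q >= 0.5]
print(sft_subset)
```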
- Modern Neighborhood Components Analysis: A Deep Tabular Baseline Two Decades Later [59.88557193062348]
We revisit the classic Neighborhood Components Analysis (NCA), designed to learn a linear projection that captures semantic similarities between instances.
We find that minor modifications, such as adjustments to the learning objectives and the integration of deep learning architectures, significantly enhance NCA's performance.
We also introduce a neighbor sampling strategy that improves both the efficiency and predictive accuracy of our proposed ModernNCA.
arXiv Detail & Related papers (2024-07-03T16:38:57Z)
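For reference, the classic NCA objective that ModernNCA builds on fits in a few lines of PyTorch. The toy data, projection size, and optimizer settings below are illustrative, and ModernNCA's deep architecture and neighbor sampling strategy are not reproduced:

```python
# Classic linear NCA: learn a projection A so that each point's stochastic
# nearest neighbours share its label.
import torch

torch.manual_seed(0)
X = torch.randn(200, 16)                    # toy features
y = (X[:, 0] > 0).long()                    # toy binary labels
eye = torch.eye(X.shape[0], dtype=torch.bool)

A = torch.randn(16, 8, requires_grad=True)  # linear projection to learn
opt = torch.optim.Adam([A], lr=1e-2)

for step in range(200):
    Z = X @ A
    sq = (Z * Z).sum(dim=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T   # pairwise sq. distances
    d2 = d2.masked_fill(eye, float("inf"))           # exclude self-matches
    p = torch.softmax(-d2, dim=1)                    # neighbour probabilities
    same = (y[:, None] == y[None, :]).float()
    # Maximize the probability that each point's neighbour has the same label.
    loss = -torch.log((p * same).sum(dim=1) + 1e-12).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```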
- Plan, Generate and Complicate: Improving Low-resource Dialogue State Tracking via Easy-to-Difficult Zero-shot Data Augmentation [5.042738414157664]
We propose EDZ-DA, an Easy-to-Difficult Zero-shot Data Augmentation framework for low-resource dialogue state tracking.
We also complicate the dialogues based on domain relations to strengthen the model's ability to track co-referenced slots.
arXiv Detail & Related papers (2024-06-13T06:49:03Z)
- Mixture of In-Context Prompters for Tabular PFNs [33.76194735049027]
MixturePFN is the Condorcet winner across 36 diverse datasets against 19 strong deep learning and tree-based baselines.
It achieves the highest mean rank among the top 10 of the aforementioned algorithms, with statistical significance.
arXiv Detail & Related papers (2024-05-25T09:47:59Z)
- Interpretable Machine Learning for TabPFN [5.012821694203072]
The TabPFN model is able to achieve state-of-the-art performance on a variety of classification tasks.
By taking advantage of the unique properties of the model, our adaptations of interpretability methods allow for more efficient computation.
arXiv Detail & Related papers (2024-03-16T13:35:15Z)
- In-Context Data Distillation with TabPFN [11.553950697974825]
In-context data distillation (ICD) is a novel methodology that lifts TabPFN's dataset-size constraints by directly optimizing its context.
ICD enables TabPFN to handle significantly larger datasets within a fixed memory budget, mitigating its quadratic memory complexity at the cost of a linear number of tuning steps.
arXiv Detail & Related papers (2024-02-10T15:23:45Z)
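The idea can be sketched with a stand-in model: a small learnable context is optimized by gradient descent so that a frozen in-context predictor fits the real data. The soft nearest-neighbour predictor below is a hypothetical substitute for TabPFN, chosen only to keep the sketch self-contained and differentiable:

```python
# In-context data distillation, sketched: learn a tiny context so that a
# fixed in-context predictor fits a much larger real dataset.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
X_real = torch.randn(2000, 10)             # large real dataset
y_real = (X_real[:, 0] > 0).long()

# Distilled context: far fewer rows than the real data, learned end to end.
X_ctx = torch.randn(32, 10, requires_grad=True)
y_ctx_logits = torch.zeros(32, 2, requires_grad=True)  # soft context labels
opt = torch.optim.Adam([X_ctx, y_ctx_logits], lr=1e-2)

def icl_predict(X_ctx, y_ctx_probs, X_q):
    """Differentiable stand-in for an in-context learner: attention over
    the context rows, weighted by similarity to each query."""
    attn = torch.softmax(X_q @ X_ctx.T, dim=1)     # (n_query, n_ctx)
    return attn @ y_ctx_probs                      # class probabilities

for step in range(300):
    batch = torch.randint(0, X_real.shape[0], (256,))
    probs = icl_predict(X_ctx, torch.softmax(y_ctx_logits, dim=1), X_real[batch])
    loss = F.nll_loss(torch.log(probs + 1e-9), y_real[batch])
    opt.zero_grad(); loss.backward(); opt.step()
# At inference, only the 32 distilled rows are placed in context.
```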
- Latent Bottlenecked Attentive Neural Processes [71.18817592128207]
We present Latent Bottlenecked Attentive Neural Processes (LBANPs).
LBANPs have a querying computational complexity independent of the number of context datapoints.
We show LBANPs achieve results competitive with the state-of-the-art on meta-regression, image completion, and contextual multi-armed bandits.
arXiv Detail & Related papers (2022-11-15T19:21:41Z)
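The bottleneck mechanism can be illustrated with two standard attention layers: the context is compressed into a fixed set of latents once, and each query then attends only to those latents. Dimensions and layer choices below are assumptions, not the paper's architecture:

```python
# Latent bottleneck sketch: many context points are summarized into a FIXED
# number of latent vectors, so answering a query costs O(num_latents)
# regardless of context size.
import torch
import torch.nn as nn

d, num_latents = 64, 8
latents = nn.Parameter(torch.randn(num_latents, d))
ctx_to_latent = nn.MultiheadAttention(d, num_heads=4, batch_first=True)
latent_to_query = nn.MultiheadAttention(d, num_heads=4, batch_first=True)

context = torch.randn(1, 10_000, d)   # many context datapoints
queries = torch.randn(1, 5, d)        # a few query points

# 1) Compress the whole context into num_latents vectors (done once).
z, _ = ctx_to_latent(latents.unsqueeze(0), context, context)
# 2) Each query attends only to the latents -- independent of context size.
out, _ = latent_to_query(queries, z, z)
print(out.shape)  # torch.Size([1, 5, 64])
```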
- Beyond Transfer Learning: Co-finetuning for Action Localisation [64.07196901012153]
We propose co-finetuning -- simultaneously training a single model on multiple "upstream" and "downstream" tasks.
We demonstrate that co-finetuning outperforms traditional transfer learning when using the same total amount of data.
We also show how we can easily extend our approach to multiple "upstream" datasets to further improve performance.
arXiv Detail & Related papers (2022-07-08T10:25:47Z)
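A minimal sketch of the scheme with toy datasets and a toy model: instead of pretraining on upstream data and then fine-tuning downstream, batches from all tasks are interleaved in a single run. The round-robin task sampling here is an illustrative choice, not necessarily the paper's:

```python
# Co-finetuning sketch: one model, one optimizer, with batches drawn from
# several "upstream" and "downstream" datasets in the same training run.
import torch
import torch.nn as nn

torch.manual_seed(0)
datasets = {                                # toy stand-ins for real tasks
    "upstream_a": (torch.randn(512, 32), torch.randint(0, 4, (512,))),
    "upstream_b": (torch.randn(512, 32), torch.randint(0, 4, (512,))),
    "downstream": (torch.randn(128, 32), torch.randint(0, 4, (128,))),
}
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 4))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(300):
    # Cycle over tasks; the sampling ratio is a tunable design choice.
    name = list(datasets)[step % len(datasets)]
    X, y = datasets[name]
    idx = torch.randint(0, X.shape[0], (32,))
    loss = loss_fn(model(X[idx]), y[idx])
    opt.zero_grad(); loss.backward(); opt.step()
```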
- CHALLENGER: Training with Attribution Maps [63.736435657236505]
We show that using attribution maps when training neural networks can regularize models and thus improve performance.
In particular, we show that our generic domain-independent approach yields state-of-the-art results in vision, natural language processing and on time series tasks.
arXiv Detail & Related papers (2022-05-30T13:34:46Z)
- Relation-Guided Representation Learning [53.60351496449232]
We propose a new representation learning method that explicitly models and leverages sample relations.
Our framework preserves the relations between samples well.
By embedding samples into a subspace, we show that our method can address both the large-scale and the out-of-sample problems.
arXiv Detail & Related papers (2020-07-11T10:57:45Z)