TabICLv2: A better, faster, scalable, and open tabular foundation model
- URL: http://arxiv.org/abs/2602.11139v1
- Date: Wed, 11 Feb 2026 18:51:02 GMT
- Title: TabICLv2: A better, faster, scalable, and open tabular foundation model
- Authors: Jingang Qu, David Holzmüller, Gaël Varoquaux, Marine Le Morvan,
- Abstract summary: We introduce TabICLv2, a new state-of-the-art foundation model for regression and classification built on three pillars. TabICLv2 generalizes effectively to million-scale datasets under 50GB of GPU memory while being markedly faster than RealTabPFN-2.5.
- Score: 18.594859017648346
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tabular foundation models, such as TabPFNv2 and TabICL, have recently dethroned gradient-boosted trees at the top of predictive benchmarks, demonstrating the value of in-context learning for tabular data. We introduce TabICLv2, a new state-of-the-art foundation model for regression and classification built on three pillars: (1) a novel synthetic data generation engine designed for high pretraining diversity; (2) various architectural innovations, including a new scalable softmax in attention improving generalization to larger datasets without prohibitive long-sequence pretraining; and (3) optimized pretraining protocols, notably replacing AdamW with the Muon optimizer. On the TabArena and TALENT benchmarks, TabICLv2 without any tuning surpasses the performance of the current state of the art, RealTabPFN-2.5 (hyperparameter-tuned, ensembled, and fine-tuned on real data). With only moderate pretraining compute, TabICLv2 generalizes effectively to million-scale datasets under 50GB GPU memory while being markedly faster than RealTabPFN-2.5. We provide extensive ablation studies to quantify these contributions and commit to open research by first releasing inference code and model weights at https://github.com/soda-inria/tabicl, with synthetic data engine and pretraining code to follow.
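The in-context learning the abstract refers to means the model receives the training rows as context and predicts test labels in a single forward pass, with no gradient updates. As a conceptual illustration only (not TabICLv2's actual architecture), softmax attention over the context rows already behaves like a learned kernel smoother; the function and parameter names below are hypothetical:

```python
import numpy as np

def icl_predict(X_train, y_train, X_test, temperature=1.0):
    """Conceptual 'prediction as attention over the context': each test
    row attends over the training rows and outputs a similarity-weighted
    average of their labels (a kernel smoother). Tabular foundation
    models learn this mapping with a pretrained transformer instead."""
    # Negative squared distances serve as attention logits.
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(-1)
    logits = -d2 / temperature
    # Softmax over the context (training-row) axis, numerically stable.
    w = np.exp(logits - logits.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    return w @ y_train  # weighted label average per test row

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 4))
y = X[:, 0] * 2.0  # simple synthetic regression target
preds = icl_predict(X[:48], y[:48], X[48:])
```

Because each prediction is a convex combination of the context labels, predictions always stay within the range of the training targets.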
Related papers
- End-to-End Compression for Tabular Foundation Models [20.50130399990578]
We propose TACO, an end-to-end compression model that compresses the training dataset in a latent space. We test our method on the TabArena benchmark, where our proposed method is up to 94x faster in inference time, while consuming up to 97% less memory.
arXiv Detail & Related papers (2026-02-05T13:33:58Z) - TabPFN-2.5: Advancing the State of the Art in Tabular Foundation Models [76.52858476275865]
TabPFN-2.5 is built for datasets with up to 50,000 data points and 2,000 features. It substantially outperforms tuned tree-based models and matches the accuracy of AutoGluon 1.4. For production use cases, we introduce a new distillation engine that converts TabPFN-2.5 into a compact model or tree ensemble.
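Distillation of this kind fits a small student model to the teacher's predictions rather than the raw labels. The sketch below is a hypothetical minimal version, with a kernel-smoother stand-in for the large teacher and a linear least-squares student; it is not TabPFN-2.5's actual distillation engine:

```python
import numpy as np

# Hypothetical sketch of prediction distillation: fit a compact
# student to the teacher's soft outputs instead of the raw labels.
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200)

def teacher(Xq):
    """Stand-in for a large pretrained model: a kernel smoother
    over the training set (softmax-weighted label average)."""
    d2 = ((Xq[:, None] - X[None]) ** 2).sum(-1)
    w = np.exp(-d2)
    w /= w.sum(axis=1, keepdims=True)
    return w @ y

soft = teacher(X)                              # teacher's predictions
W, *_ = np.linalg.lstsq(X, soft, rcond=None)   # compact linear student
student_pred = X @ W                           # cheap to run in production
```

The student is orders of magnitude cheaper at inference time; the quality of the distilled model depends on how well its hypothesis class can mimic the teacher.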
arXiv Detail & Related papers (2025-11-11T18:57:15Z) - nanoTabPFN: A Lightweight and Educational Reimplementation of TabPFN [78.62756717376563]
nanoTabPFN is a lightweight implementation of the TabPFN v2 architecture and a corresponding training loop. It achieves performance comparable to traditional machine learning baselines within one minute of pre-training on a single GPU.
arXiv Detail & Related papers (2025-11-05T16:52:51Z) - Mitra: Mixed Synthetic Priors for Enhancing Tabular Foundation Models [85.64873567417396]
We introduce Mitra, a TFM trained on a curated mixture of synthetic priors selected for their diversity, distinctiveness, and performance on real-world data. Mitra consistently outperforms state-of-the-art TFMs, such as TabPFNv2 and TabICL, across both classification and regression benchmarks.
arXiv Detail & Related papers (2025-10-24T07:15:06Z) - Chunked TabPFN: Exact Training-Free In-Context Learning for Long-Context Tabular Data [2.2682391370097794]
We introduce a tiled-block strategy to compute attention within the TabPFN framework. This design is compatible with standard GPU setups. We demonstrate the effectiveness of our approach on the standard TabArena benchmark.
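Tiled attention of the kind described can be computed exactly by streaming over the context in chunks while maintaining a running maximum and normalizer (the "online softmax" trick), so peak memory scales with the tile size rather than the full context. This is a generic sketch of that idea under assumed shapes, not the paper's implementation:

```python
import numpy as np

def chunked_attention(q, K, V, chunk=128):
    """Exact softmax attention for one query, computed over K/V in
    tiles with a running max (m) and normalizer (s) so that no
    full-length attention matrix is ever materialized."""
    m = -np.inf                       # running max of logits
    s = 0.0                           # running softmax normalizer
    acc = np.zeros(V.shape[1])        # running weighted sum of values
    for start in range(0, len(K), chunk):
        Kc, Vc = K[start:start + chunk], V[start:start + chunk]
        logits = Kc @ q
        m_new = max(m, float(logits.max()))
        scale = np.exp(m - m_new)     # rescale old state; 0.0 on first tile
        w = np.exp(logits - m_new)
        s = s * scale + w.sum()
        acc = acc * scale + w @ Vc
        m = m_new
    return acc / s

rng = np.random.default_rng(3)
q = rng.normal(size=8)
K = rng.normal(size=(1000, 8))
V = rng.normal(size=(1000, 4))
out = chunked_attention(q, K, V, chunk=100)

# Reference: dense attention over the full context at once.
logits = K @ q
w = np.exp(logits - logits.max())
full = (w / w.sum()) @ V
```

The chunked and dense results agree to floating-point precision, which is what makes the approach "training-free" and exact rather than an approximation.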
arXiv Detail & Related papers (2025-08-30T02:57:01Z) - On Finetuning Tabular Foundation Models [29.76586200178702]
TabPFNv2 claims superior performance over traditional GBDT-based methods on small-scale datasets. We evaluate various finetuning strategies for TabPFNv2 on diverse datasets. We reveal that the success of finetuning stems from the fact that after gradient-based adaptation, the dot products of the query-representations of test objects more accurately reflect their target similarity.
arXiv Detail & Related papers (2025-06-10T16:52:31Z) - A Closer Look at TabPFN v2: Understanding Its Strengths and Extending Its Capabilities [51.08999772842298]
Tabular Prior-data Fitted Network v2 (TabPFN v2) achieves unprecedented in-context learning performance across diverse downstream datasets. We show that TabPFN v2 can infer attribute relationships even when provided with randomized attribute token inputs. We demonstrate that TabPFN v2's limitations can be addressed through a test-time divide-and-context strategy.
arXiv Detail & Related papers (2025-02-24T17:38:42Z) - TabDPT: Scaling Tabular Foundation Models on Real Data [20.00390825519329]
We propose an approach to combine ICL-based retrieval with self-supervised learning to train foundation models. We show that incorporating real data during the pre-training phase can lead to significantly faster training and better generalization to unseen data. Our resulting model, TabDPT, achieves top performance on both regression (CTR23) and classification (CC18) benchmarks.
arXiv Detail & Related papers (2024-10-23T18:00:00Z) - Revisiting Nearest Neighbor for Tabular Data: A Deep Tabular Baseline Two Decades Later [76.66498833720411]
We introduce a differentiable version of $K$-nearest neighbors (Neighbourhood Components Analysis, NCA), originally designed to learn a linear projection that captures semantic similarities between instances. Surprisingly, our implementation of NCA using SGD and without dimensionality reduction already achieves decent performance on tabular data. We conclude our paper by analyzing the factors behind these improvements, including loss functions, prediction strategies, and deep architectures.
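The NCA objective that makes KNN differentiable can be written as the expected leave-one-out accuracy of a stochastic nearest neighbor under a learned projection. A hypothetical numpy sketch of that objective with a linear projection (the paper's deep variant replaces the projection with a network):

```python
import numpy as np

def nca_objective(A, X, y):
    """NCA objective (to be maximized): expected leave-one-out accuracy
    of a stochastic nearest-neighbor classifier, where point i selects
    neighbor j with probability proportional to exp(-||Axi - Axj||^2)."""
    Z = X @ A.T                                   # project instances
    d2 = ((Z[:, None] - Z[None]) ** 2).sum(-1)    # pairwise squared dists
    np.fill_diagonal(d2, np.inf)                  # never select yourself
    p = np.exp(-d2)
    p /= p.sum(axis=1, keepdims=True)             # soft neighbor choice
    same = (y[:, None] == y[None]).astype(float)  # same-class indicator
    return (p * same).sum(axis=1).mean()          # expected accuracy in [0,1]

rng = np.random.default_rng(1)
# Two well-separated Gaussian clusters, one per class.
X = np.vstack([rng.normal(0, 1, (20, 5)), rng.normal(3, 1, (20, 5))])
y = np.array([0] * 20 + [1] * 20)
acc = nca_objective(np.eye(5), X, y)  # identity projection as a baseline
```

Because the objective is a smooth function of the projection, it can be optimized directly with SGD, which is the "differentiable KNN" the entry refers to.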
arXiv Detail & Related papers (2024-07-03T16:38:57Z) - TuneTables: Context Optimization for Scalable Prior-Data Fitted Networks [90.00817095558094]
Prior-data fitted networks (PFNs) make use of pretraining and in-context learning to achieve strong performance on new tasks in a single forward pass.
We introduce TuneTables, a parameter-efficient fine-tuning strategy for PFNs that compresses large datasets into a smaller learned context.
We show that TuneTables can be used as an interpretability tool and can even be used to mitigate biases by optimizing a fairness objective.
arXiv Detail & Related papers (2024-02-17T00:02:23Z) - In-Context Data Distillation with TabPFN [11.553950697974825]
In-context data distillation (ICD) is a novel methodology that effectively eliminates these constraints by optimizing TabPFN's context.
ICD efficiently enables TabPFN to handle significantly larger datasets within a fixed memory budget, easing TabPFN's quadratic memory complexity at the cost of a linear number of tuning steps.
arXiv Detail & Related papers (2024-02-10T15:23:45Z) - Revisiting Class-Incremental Learning with Pre-Trained Models: Generalizability and Adaptivity are All You Need [84.3507610522086]
Class-incremental learning (CIL) aims to adapt to emerging new classes without forgetting old ones.
Recent pre-training has achieved substantial progress, making vast pre-trained models (PTMs) accessible for CIL.
We argue that the core factors in CIL are adaptivity for model updating and generalizability for knowledge transferring.
arXiv Detail & Related papers (2023-03-13T17:59:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.