Orion-Bix: Bi-Axial Attention for Tabular In-Context Learning
- URL: http://arxiv.org/abs/2512.00181v1
- Date: Fri, 28 Nov 2025 19:42:09 GMT
- Title: Orion-Bix: Bi-Axial Attention for Tabular In-Context Learning
- Authors: Mohamed Bouadi, Pratinav Seth, Aditya Tanna, Vinay Kumar Sankarapu
- Abstract summary: We introduce Orion-Bix, a tabular foundation model that combines biaxial attention with meta-learned in-context reasoning. Its encoder alternates standard, grouped, hierarchical, and relational attention, fusing their outputs through multi-CLS summarization. A label-aware ICL head adapts on the fly and scales to large label spaces via hierarchical decision routing.
- Score: 3.884856136722027
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tabular data drive most real-world machine learning applications, yet building general-purpose models for them remains difficult. Mixed numeric and categorical fields, weak feature structure, and limited labeled data make scaling and generalization challenging. To this end, we introduce Orion-Bix, a tabular foundation model that combines biaxial attention with meta-learned in-context reasoning for few-shot tabular learning. Its encoder alternates standard, grouped, hierarchical, and relational attention, fusing their outputs through multi-CLS summarization to capture both local and global dependencies efficiently. A label-aware ICL head adapts on the fly and scales to large label spaces via hierarchical decision routing. Meta-trained on synthetically generated, structurally diverse tables with causal priors, Orion-Bix learns transferable inductive biases across heterogeneous data. Delivered as a scikit-learn compatible foundation model, it outperforms gradient-boosting baselines and remains competitive with state-of-the-art tabular foundation models on public benchmarks, showing that biaxial attention with episodic meta-training enables robust, few-shot-ready tabular learning. The model is publicly available at https://github.com/Lexsi-Labs/Orion-BiX .
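The abstract sketches the architecture but includes no code; as a rough, hypothetical illustration of the bi-axial idea (attention along the feature axis within each row, then along the sample axis within each column), here is a minimal PyTorch block. Module and parameter names are assumptions for this sketch, not the released Orion-BiX API, and it omits the grouped, hierarchical, and relational variants and the multi-CLS fusion described above.

```python
import torch
import torch.nn as nn

class BiAxialBlock(nn.Module):
    """Illustrative bi-axial attention: one pass across features (within
    each row), one pass across samples (within each column)."""
    def __init__(self, d_model: int, n_heads: int = 4):
        super().__init__()
        self.row_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.col_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (n_rows, n_cols, d_model), one embedded table
        h = self.norm1(x)
        h, _ = self.row_attn(h, h, h)        # rows are the batch: attend over columns
        x = x + h
        h = self.norm2(x).transpose(0, 1)    # (n_cols, n_rows, d_model)
        h, _ = self.col_attn(h, h, h)        # columns are the batch: attend over rows
        return x + h.transpose(0, 1)

block = BiAxialBlock(d_model=64)
out = block(torch.randn(32, 10, 64))         # 32 rows, 10 feature columns
print(out.shape)                             # torch.Size([32, 10, 64])
```

Factorizing attention over the two table axes costs O(n·m²) + O(m·n²) rather than O((nm)²) for joint attention over all cells, which is what makes this style of encoder tractable on full tables.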
Related papers
- PluRel: Synthetic Data unlocks Scaling Laws for Relational Foundation Models [51.42043158297229]
We introduce PluRel, a framework to synthesize multi-tabular relational databases from scratch. In a step-by-step fashion, PluRel models (1) schemas with directed graphs, (2) inter-table primary-foreign-key connectivity with bipartite graphs, and (3) feature distributions in tables via conditional causal mechanisms.
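To make step (2) concrete, here is a hedged sketch that samples a single foreign-key column linking child rows to parent rows; the Dirichlet popularity prior is an illustrative assumption for producing skewed bipartite connectivity, not necessarily PluRel's actual mechanism.

```python
import numpy as np

def sample_fk_column(n_child_rows: int, n_parent_rows: int,
                     concentration: float = 1.0, seed: int = 0) -> np.ndarray:
    """Sample a foreign-key column: each child row points at one parent row.
    Smaller `concentration` makes a few parent keys dominate."""
    rng = np.random.default_rng(seed)
    popularity = rng.dirichlet(np.full(n_parent_rows, concentration))
    return rng.choice(n_parent_rows, size=n_child_rows, p=popularity)

fk = sample_fk_column(n_child_rows=8, n_parent_rows=3, concentration=0.5)
print(fk)  # an array of parent indices, one per child row: a bipartite edge list
```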
arXiv Detail & Related papers (2026-02-03T21:35:18Z)
- iLTM: Integrated Large Tabular Model [41.81329403540607]
iLTM is an integrated Large Tabular Model that unifies tree-derived embeddings, dimensionality-agnostic representations, a meta-trained hypernetwork, multilayer perceptrons, and retrieval within a single architecture.
arXiv Detail & Related papers (2025-11-20T00:20:16Z)
- Orion-MSP: Multi-Scale Sparse Attention for Tabular In-Context Learning [3.884856136722027]
We introduce Orion-MSP, a new ICL architecture featuring multi-scale processing, block-sparse attention, and Perceiver-style memory. Orion-MSP matches or surpasses state-of-the-art performance while scaling effectively to high-dimensional tables.
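As a rough illustration of block-sparse attention in general (an assumption, not Orion-MSP's exact sparsity layout), the sketch below builds a boolean mask in which tokens attend within their local block and to a designated global block:

```python
import torch

def block_sparse_mask(seq_len: int, block_size: int) -> torch.Tensor:
    """True where attention is allowed: the local block plus the first
    block, which plays the role of a global summary region."""
    blocks = torch.arange(seq_len) // block_size
    same_block = blocks[:, None] == blocks[None, :]
    to_global = (blocks[None, :] == 0).expand(seq_len, seq_len)
    return same_block | to_global

print(block_sparse_mask(seq_len=8, block_size=4).int())
```

A mask like this can be passed (inverted) as `attn_mask` to a standard attention implementation; dedicated block-sparse kernels exploit the structure for real speedups.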
arXiv Detail & Related papers (2025-11-04T18:43:44Z)
- Mitra: Mixed Synthetic Priors for Enhancing Tabular Foundation Models [85.64873567417396]
We introduce Mitra, a tabular foundation model (TFM) trained on a curated mixture of synthetic priors selected for their diversity, distinctiveness, and performance on real-world data. Mitra consistently outperforms state-of-the-art TFMs, such as TabPFNv2 and TabICL, across both classification and regression benchmarks.
arXiv Detail & Related papers (2025-10-24T07:15:06Z)
- LimiX: Unleashing Structured-Data Modeling Capability for Generalist Intelligence [61.46575527504109]
LimiX-16M and LimiX-2M treat structured data as a joint distribution over variables and missingness. We evaluate LimiX models across 11 large structured-data benchmarks with broad regimes of sample size, feature dimensionality, class number, categorical-to-numerical feature ratio, missingness, and sample-to-feature ratio.
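A minimal sketch of one way to realize that idea, assuming that modeling variables and missingness jointly amounts to giving the model an explicit mask channel per feature instead of silently imputing; the encoding is illustrative, not LimiX's actual input format.

```python
import numpy as np

def with_missingness(x: np.ndarray) -> np.ndarray:
    """Pair each feature with a mask channel: 1.0 marks a missing value."""
    mask = np.isnan(x).astype(x.dtype)
    values = np.nan_to_num(x, nan=0.0)   # placeholder for the masked entries
    return np.concatenate([values, mask], axis=-1)

x = np.array([[1.0, np.nan], [3.0, 4.0]])
print(with_missingness(x))
# [[1. 0. 0. 1.]
#  [3. 4. 0. 0.]]
```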
arXiv Detail & Related papers (2025-09-03T17:39:08Z)
- Make Still Further Progress: Chain of Thoughts for Tabular Data Leaderboard [27.224577475861214]
Tabular data, a fundamental data format in machine learning, is predominantly utilized in competitions and real-world applications. We propose an in-context ensemble framework for tabular prediction that leverages large language models. Our method constructs a context around each test instance using its nearest neighbors and the predictions from a pool of external models.
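A hedged sketch of that context-construction step: collect a test point's nearest labeled neighbors plus the external models' predictions and serialize them into a prompt. The textual format and the function name are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def build_context(x_test, X_train, y_train, model_preds, k=3) -> str:
    """Serialize k nearest neighbors and external-model predictions
    into a plain-text context for an LLM."""
    dists = np.linalg.norm(X_train - x_test, axis=1)
    lines = [f"neighbor: x={X_train[i].tolist()}, y={y_train[i]}"
             for i in np.argsort(dists)[:k]]
    lines += [f"model {name} predicts: {pred}"
              for name, pred in model_preds.items()]
    lines.append(f"query: x={x_test.tolist()}, y=?")
    return "\n".join(lines)

X = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
y = np.array([0, 0, 1])
print(build_context(np.array([0.2, 0.1]), X, y, {"xgb": 0, "mlp": 1}, k=2))
```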
arXiv Detail & Related papers (2025-05-19T17:52:58Z)
- Representation Learning for Tabular Data: A Comprehensive Survey [23.606506938919605]
Tabular data, structured as rows and columns, is among the most prevalent data types in machine learning classification and regression applications. Deep Neural Networks (DNNs) have recently demonstrated promising results through their representation-learning capability. We organize existing methods into three main categories according to their generalization capabilities.
arXiv Detail & Related papers (2025-04-17T17:58:23Z)
- Unbiased Max-Min Embedding Classification for Transductive Few-Shot Learning: Clustering and Classification Are All You Need [83.10178754323955]
Few-shot learning enables models to generalize from only a few labeled examples. We propose the Unbiased Max-Min Embedding Classification (UMMEC) method, which addresses the key challenges in few-shot learning. Our method significantly improves classification performance with minimal labeled data, advancing the state of the art in transductive few-shot learning.
arXiv Detail & Related papers (2025-03-28T07:23:07Z)
- Transformers with Stochastic Competition for Tabular Data Modelling [6.285325771390289]
We introduce a novel deep learning model designed specifically for tabular data.
The model is validated on a variety of widely used, publicly available datasets.
We demonstrate that incorporating stochastic competition mechanisms yields high performance.
arXiv Detail & Related papers (2024-07-18T07:48:48Z)
- Rethinking Pre-Training in Tabular Data: A Neighborhood Embedding Perspective [71.45945607871715]
We propose Tabular data Pre-Training via Meta-representation (TabPTM). The core idea is to embed data instances into a shared feature space, where each instance is represented by its distances to a fixed number of nearest neighbors and their labels. Extensive experiments on 101 datasets confirm TabPTM's effectiveness in both classification and regression tasks, with and without fine-tuning.
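The meta-representation itself is straightforward to sketch; the following illustrative version (Euclidean distance and other details assumed) shows how it produces a fixed-length encoding regardless of the table's original feature count.

```python
import numpy as np

def meta_representation(x, X_train, y_train, k=5) -> np.ndarray:
    """Encode an instance by the distances to its k nearest training
    neighbors together with those neighbors' labels."""
    dists = np.linalg.norm(X_train - x, axis=1)
    idx = np.argsort(dists)[:k]
    return np.concatenate([dists[idx], y_train[idx].astype(float)])

rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 8))
y_train = rng.integers(0, 2, size=100)
print(meta_representation(rng.normal(size=8), X_train, y_train, k=3))
# -> 3 distances followed by 3 neighbor labels: length 6 no matter how
#    many features the original table has
```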
arXiv Detail & Related papers (2023-10-31T18:03:54Z)
- Is margin all you need? An extensive empirical study of active learning on tabular data [66.18464006872345]
We analyze the performance of a variety of active learning algorithms on 69 real-world datasets from the OpenML-CC18 benchmark.
Surprisingly, we find that the classical margin sampling technique matches or outperforms all others, including the current state of the art.
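Margin sampling, the classical baseline the study singles out, fits in a few lines; this is a standalone reference sketch, not the paper's code.

```python
import numpy as np

def margin_sampling(proba: np.ndarray, n_queries: int) -> np.ndarray:
    """Pick the unlabeled points whose top two class probabilities are
    closest, i.e. where the classifier is least decided."""
    sorted_p = np.sort(proba, axis=1)
    margins = sorted_p[:, -1] - sorted_p[:, -2]   # top-1 minus top-2
    return np.argsort(margins)[:n_queries]        # smallest margins first

proba = np.array([[0.9, 0.1], [0.55, 0.45], [0.6, 0.4]])
print(margin_sampling(proba, n_queries=1))        # [1], the most ambiguous row
```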
arXiv Detail & Related papers (2022-10-07T21:18:24Z)