Making Pre-trained Language Models Great on Tabular Prediction
- URL: http://arxiv.org/abs/2403.01841v2
- Date: Tue, 12 Mar 2024 07:34:28 GMT
- Title: Making Pre-trained Language Models Great on Tabular Prediction
- Authors: Jiahuan Yan, Bo Zheng, Hongxia Xu, Yiheng Zhu, Danny Z. Chen, Jimeng
Sun, Jian Wu, Jintai Chen
- Abstract summary: The transferability of deep neural networks (DNNs) has made significant progress in image and language processing.
We present TP-BERTa, a specifically pre-trained LM for tabular data prediction.
A novel relative magnitude tokenization converts scalar numerical feature values to finely discrete, high-dimensional tokens, and an intra-feature attention approach integrates feature values with the corresponding feature names.
- Score: 50.70574370855663
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The transferability of deep neural networks (DNNs) has made significant
progress in image and language processing. However, due to the heterogeneity
among tables, this benefit of DNNs is still far from being well exploited on tabular
data prediction (e.g., regression or classification tasks). Condensing
knowledge from diverse domains, language models (LMs) possess the capability to
comprehend feature names from various tables, potentially serving as versatile
learners in transferring knowledge across distinct tables and diverse
prediction tasks, but their discrete text representation space is inherently
incompatible with numerical feature values in tables. In this paper, we present
TP-BERTa, a specifically pre-trained LM for tabular data prediction.
Concretely, a novel relative magnitude tokenization converts scalar numerical
feature values to finely discrete, high-dimensional tokens, and an
intra-feature attention approach integrates feature values with the
corresponding feature names. Comprehensive experiments demonstrate that our
pre-trained TP-BERTa achieves leading performance among tabular DNNs and is
competitive with Gradient Boosted Decision Tree models in the typical tabular
data regime.
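As a rough illustration of the two components named in the abstract, the sketch below bins a scaled numerical value into a discrete magnitude token and fuses it with the feature-name tokens via self-attention. It assumes per-feature min-max scaling and standard PyTorch modules; the bin count, dimensions, and all class names are illustrative assumptions, not the released TP-BERTa implementation.

```python
# Hypothetical sketch of relative-magnitude tokenization (RMT) and
# intra-feature attention (IFA); bin boundaries, sizes, and names are
# illustrative assumptions, not the TP-BERTa code.
import torch
import torch.nn as nn

class RelativeMagnitudeTokenizer(nn.Module):
    """Maps a scalar feature value to a discrete magnitude-token embedding."""
    def __init__(self, num_bins: int = 256, dim: int = 768):
        super().__init__()
        self.num_bins = num_bins
        self.magnitude_embeddings = nn.Embedding(num_bins, dim)

    def forward(self, values: torch.Tensor) -> torch.Tensor:
        # Assume values were min-max scaled to [0, 1] per feature on the training split.
        bins = (values.clamp(0, 1) * (self.num_bins - 1)).long()
        return self.magnitude_embeddings(bins)           # (batch, dim)

class IntraFeatureAttention(nn.Module):
    """Fuses a feature's name tokens with its value token into one vector."""
    def __init__(self, dim: int = 768, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, name_tok: torch.Tensor, value_tok: torch.Tensor) -> torch.Tensor:
        # name_tok: (batch, name_len, dim); value_tok: (batch, dim)
        seq = torch.cat([name_tok, value_tok.unsqueeze(1)], dim=1)
        fused, _ = self.attn(seq, seq, seq)
        return fused.mean(dim=1)                          # one vector per feature
```

The intent, as described in the abstract, is that discretized value tokens live in the same representation space as the LM's text tokens, so value and feature-name information can be combined before entering the transformer.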
Related papers
- A Survey on Deep Tabular Learning [0.0]
Tabular data presents unique challenges for deep learning due to its heterogeneous nature and lack of spatial structure.
This survey reviews the evolution of deep learning models for tabular data, from early fully connected networks (FCNs) to advanced architectures like TabNet, SAINT, TabTranSELU, and MambaNet.
arXiv Detail & Related papers (2024-10-15T20:08:08Z)
- InterpreTabNet: Distilling Predictive Signals from Tabular Data by Salient Feature Interpretation [7.67293014317639]
We propose a variant of the TabNet model that models the attention mechanism as a latent variable sampled from a Gumbel-Softmax distribution.
This enables us to regularize the model to learn distinct concepts in the attention masks via a KL Divergence regularizer.
It prevents overlapping feature selection by promoting sparsity, which maximizes the model's efficacy and improves interpretability (a minimal sketch of the Gumbel-Softmax mask appears after this list).
arXiv Detail & Related papers (2024-06-01T12:48:11Z)
- Training-Free Generalization on Heterogeneous Tabular Data via Meta-Representation [67.30538142519067]
We propose Tabular data Pre-Training via Meta-representation (TabPTM).
A deep neural network is then trained to associate these meta-representations with dataset-specific classification confidences.
Experiments validate that TabPTM achieves promising performance in new datasets, even under few-shot scenarios.
arXiv Detail & Related papers (2023-10-31T18:03:54Z)
- Unlocking the Transferability of Tokens in Deep Models for Tabular Data [67.11727608815636]
Fine-tuning a pre-trained deep neural network has become a successful paradigm in various machine learning tasks.
In this paper, we propose TabToken, a method aimed at enhancing the quality of feature tokens.
We introduce a contrastive objective that regularizes the tokens, capturing the semantics within and across features.
arXiv Detail & Related papers (2023-10-23T17:53:09Z)
- Transfer Learning with Deep Tabular Models [66.67017691983182]
We show that upstream data gives tabular neural networks a decisive advantage over GBDT models.
We propose a realistic medical diagnosis benchmark for tabular transfer learning.
We propose a pseudo-feature method for cases where the upstream and downstream feature sets differ.
arXiv Detail & Related papers (2022-06-30T14:24:32Z)
- SubTab: Subsetting Features of Tabular Data for Self-Supervised Representation Learning [5.5616364225463055]
In this paper, we introduce a new framework, Subsetting features of Tabular data (SubTab).
We argue that reconstructing the data from the subset of its features rather than its corrupted version in an autoencoder setting can better capture its underlying representation.
arXiv Detail & Related papers (2021-10-08T20:11:09Z)
- TabGNN: Multiplex Graph Neural Network for Tabular Data Prediction [43.35301059378836]
We propose a novel framework, TabGNN, based on recently popular graph neural networks (GNNs).
Specifically, we first construct a multiplex graph to model the multifaceted sample relations, and then design a multiplex graph neural network to learn an enhanced representation for each sample.
Experiments on eleven tabular data prediction (TDP) datasets from various domains, covering both classification and regression tasks, show that TabGNN consistently improves performance.
arXiv Detail & Related papers (2021-08-20T11:51:32Z)
- GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing [117.98107557103877]
We present GraPPa, an effective pre-training approach for table semantic parsing.
We construct synthetic question-SQL pairs over high-quality tables via a synchronous context-free grammar.
To maintain the model's ability to represent real-world data, we also include masked language modeling.
arXiv Detail & Related papers (2020-09-29T08:17:58Z)
- TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data [113.29476656550342]
We present TaBERT, a pretrained LM that jointly learns representations for NL sentences and tables.
TaBERT is trained on a large corpus of 26 million tables and their English contexts.
Implementation of the model will be available at http://fburl.com/TaBERT.
arXiv Detail & Related papers (2020-05-17T17:26:40Z)
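For the InterpreTabNet entry above, the following sketch shows how an attention mask can be drawn from a Gumbel-Softmax distribution and regularized with a KL divergence toward a chosen prior over features. The uniform prior, function names, and toy shapes are placeholder assumptions rather than the paper's exact objective.

```python
# Illustrative sketch of a Gumbel-Softmax attention mask with a KL
# regularizer toward a prior, in the spirit of InterpreTabNet; the
# prior and all names here are assumptions, not the authors' code.
import torch
import torch.nn.functional as F

def sample_attention_mask(logits: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
    # logits: (batch, num_features); one relaxed categorical sample per row.
    return F.gumbel_softmax(logits, tau=tau, hard=False, dim=-1)

def kl_to_prior(mask: torch.Tensor, prior: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    # KL(mask || prior), averaged over the batch; pushes masks toward a
    # chosen (e.g., sparse) prior distribution over features.
    return (mask * ((mask + eps).log() - (prior + eps).log())).sum(-1).mean()

logits = torch.randn(4, 10)              # toy feature-selection logits
prior = torch.full((10,), 1.0 / 10)      # uniform prior as a placeholder
mask = sample_attention_mask(logits, tau=0.5)
loss_reg = kl_to_prior(mask, prior)      # added to the task loss with some weight
```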