Bridge the Gap between Language models and Tabular Understanding
- URL: http://arxiv.org/abs/2302.09302v1
- Date: Thu, 16 Feb 2023 15:16:55 GMT
- Title: Bridge the Gap between Language models and Tabular Understanding
- Authors: Nuo Chen, Linjun Shou, Ming Gong, Jian Pei, Chenyu You, Jianhui Chang,
Daxin Jiang, Jia Li
- Abstract summary: The table pretrain-then-finetune paradigm has been proposed and employed at a rapid pace following the success of pre-training in the natural language domain.
Despite the promising findings, there is an input gap between pre-training and fine-tuning phases.
We propose UTP, an approach that dynamically supports three types of multi-modal inputs: table-text, table, and text.
- Score: 99.88470271644894
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The table pretrain-then-finetune paradigm has been proposed and employed at a
rapid pace after the success of pre-training in the natural language domain.
Despite the promising findings in tabular pre-trained language models (TPLMs),
there is an input gap between pre-training and fine-tuning phases. For
instance, TPLMs jointly pre-trained with table and text input can be effective
for tasks that also take joint table-text input, such as table question
answering, but may fail on tasks that take only tables or only text as input, such as
table retrieval. To this end, we propose UTP, an approach that dynamically
supports three types of multi-modal inputs: table-text, table, and text.
Specifically, UTP is pre-trained with two strategies: (1) We first utilize a
universal masked language modeling objective on each kind of input, forcing the
model to adapt to the various inputs. (2) We then present Cross-Modal Contrastive
Regularization (CMCR), which utilizes contrastive learning to encourage the
consistency between table-text cross-modality representations via unsupervised
instance-wise training signals during pre-training. By these means, the
resulting model not only bridges the input gap between pre-training and
fine-tuning but also improves the alignment of table and text. Extensive
experiments show that UTP achieves superior results on uni-modal input tasks (e.g.,
table retrieval) and cross-modal input tasks (e.g., table question answering).
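The abstract describes both strategies only at a high level. As a rough illustration of the CMCR idea, the sketch below implements an instance-wise table-text contrastive objective as a symmetric InfoNCE loss in PyTorch; the function name `cmcr_loss`, the temperature value, and the assumption that each table/text is already pooled into a single vector are illustrative guesses, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

def cmcr_loss(table_emb: torch.Tensor, text_emb: torch.Tensor,
              temperature: float = 0.07) -> torch.Tensor:
    """Instance-wise contrastive regularization between table and text views.

    Each table representation is pulled toward the text representation of the
    same instance and pushed away from the other texts in the batch, and vice
    versa (symmetric InfoNCE). The exact formulation used by UTP may differ;
    this is only a sketch of the general recipe.
    """
    table_emb = F.normalize(table_emb, dim=-1)        # (B, D), unit-length rows
    text_emb = F.normalize(text_emb, dim=-1)          # (B, D)
    logits = table_emb @ text_emb.t() / temperature   # (B, B) similarity matrix
    targets = torch.arange(logits.size(0), device=logits.device)
    return 0.5 * (F.cross_entropy(logits, targets)        # table -> text
                  + F.cross_entropy(logits.t(), targets)) # text -> table

# Toy usage: a batch of 4 paired table/text embeddings of width 128.
loss = cmcr_loss(torch.randn(4, 128), torch.randn(4, 128))
```

The first strategy is easier to picture without code: the same masked-token prediction objective is applied to whichever of the three input types (table-text, table, or text) a given training instance contains, which is what lets the pre-trained model be fine-tuned on uni-modal as well as cross-modal tasks.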
Related papers
- TDeLTA: A Light-weight and Robust Table Detection Method based on Learning Text Arrangement [34.73880086005418]
We propose a novel, lightweight, and robust Table Detection method based on Learning Text Arrangement, namely TDeLTA.
To locate the tables precisely, we design a text-classification task, classifying the text blocks into 4 categories according to their semantic roles in the tables.
Compared to several state-of-the-art methods, TDeLTA achieves competitive results with only 3.1M model parameters on the large-scale public datasets.
arXiv Detail & Related papers (2023-12-18T09:18:43Z) - FLIP: Fine-grained Alignment between ID-based Models and Pretrained Language Models for CTR Prediction [49.510163437116645]
Click-through rate (CTR) prediction serves as a core function module in personalized online services.
Traditional ID-based models for CTR prediction take as input the one-hot encoded ID features of the tabular modality.
Pretrained Language Models (PLMs) have given rise to another paradigm, which takes as input the sentences of the textual modality.
We propose to conduct Fine-grained feature-level ALignment between ID-based Models and Pretrained Language Models (FLIP) for CTR prediction.
arXiv Detail & Related papers (2023-10-30T11:25:03Z) - Towards Unifying Medical Vision-and-Language Pre-training via Soft
Prompts [63.84720380390935]
There exist two typical types of medical vision-and-language pre-training models, i.e., the fusion-encoder type and the dual-encoder type, depending on whether a heavy fusion module is used.
We propose an effective yet straightforward scheme named PTUnifier to unify the two types.
We first unify the input format by introducing visual and textual prompts, which serve as a feature bank that stores the most representative images/texts.
arXiv Detail & Related papers (2023-02-17T15:43:42Z) - Towards Table-to-Text Generation with Pretrained Language Model: A Table
Structure Understanding and Text Deliberating Approach [60.03002572791552]
We propose a table structure understanding and text deliberating approach, namely TASD.
Specifically, we devise a three-layered multi-head attention network to realize the table-structure-aware text generation model.
Our approach can generate faithful and fluent descriptive texts for different types of tables.
arXiv Detail & Related papers (2023-01-05T14:03:26Z) - OmniTab: Pretraining with Natural and Synthetic Data for Few-shot
Table-based Question Answering [106.73213656603453]
We develop a simple table-based QA model with minimal annotation effort.
We propose an omnivorous pretraining approach that consumes both natural and synthetic data.
arXiv Detail & Related papers (2022-07-08T01:23:45Z) - Table Pre-training: A Survey on Model Architectures, Pretraining
Objectives, and Downstream Tasks [37.35651138851127]
A flurry of table pre-training frameworks have been proposed following the success of pre-training on text and images.
Table pre-training usually takes the form of table-text joint pre-training.
This survey aims to provide a comprehensive review of different model designs, pre-training objectives, and downstream tasks for table pre-training.
arXiv Detail & Related papers (2022-01-24T15:22:24Z) - ST-BERT: Cross-modal Language Model Pre-training For End-to-end Spoken
Language Understanding [23.367329217151084]
We introduce a cross-modal pre-trained language model, called Speech-Text BERT (ST-BERT), to tackle end-to-end spoken language understanding tasks.
Taking phoneme posteriors and subword-level text as input, ST-BERT learns a contextualized cross-modal alignment.
Our method shows further SLU performance gain via domain-adaptive pre-training with domain-specific speech-text pair data.
arXiv Detail & Related papers (2020-10-23T10:28:20Z) - GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing [117.98107557103877]
We present GraPPa, an effective pre-training approach for table semantic parsing.
We construct synthetic question-SQL pairs over high-quality tables via a synchronous context-free grammar.
To maintain the model's ability to represent real-world data, we also include masked language modeling.
arXiv Detail & Related papers (2020-09-29T08:17:58Z) - TAPAS: Weakly Supervised Table Parsing via Pre-training [16.661382998729067]
- TAPAS: Weakly Supervised Table Parsing via Pre-training [16.661382998729067]
We present TAPAS, an approach to question answering over tables without generating logical forms.
We experiment with three different semantic parsing datasets.
We find that TAPAS outperforms or rivals semantic parsing models while improving state-of-the-art accuracy.
arXiv Detail & Related papers (2020-04-05T23:18:37Z)