Bridge the Gap between Language models and Tabular Understanding
- URL: http://arxiv.org/abs/2302.09302v1
- Date: Thu, 16 Feb 2023 15:16:55 GMT
- Title: Bridge the Gap between Language models and Tabular Understanding
- Authors: Nuo Chen, Linjun Shou, Ming Gong, Jian Pei, Chenyu You, Jianhui Chang,
Daxin Jiang, Jia Li
- Abstract summary: The table pretrain-then-finetune paradigm has been adopted at a rapid pace following the success of pre-training in the natural language domain.
Despite the promising findings, there is an input gap between the pre-training and fine-tuning phases.
We propose UTP, an approach that dynamically supports three types of multi-modal inputs: table-text, table, and text.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The table pretrain-then-finetune paradigm has been adopted at a
rapid pace following the success of pre-training in the natural language
domain. Despite the promising findings on tabular pre-trained language models
(TPLMs), there is an input gap between the pre-training and fine-tuning
phases. For instance, TPLMs jointly pre-trained on table and text input can be
effective for tasks that also take joint table-text input, such as table
question answering, but they may fail on tasks whose input is tables or text
alone, such as table retrieval. To this end, we propose UTP, an approach that
dynamically supports three types of multi-modal input: table-text, table, and
text. Specifically, UTP is pre-trained with two strategies: (1) we first apply
a universal masked language modeling objective to each kind of input, forcing
the model to adapt to varied inputs; (2) we then present Cross-Modal
Contrastive Regularization (CMCR), which uses contrastive learning with
unsupervised instance-wise training signals to encourage consistency between
table and text cross-modal representations during pre-training. By these
means, the resulting model not only bridges the input gap between pre-training
and fine-tuning but also improves the alignment of table and text. Extensive
experiments show that UTP achieves superior results on both uni-modal input
tasks (e.g., table retrieval) and cross-modal input tasks (e.g., table
question answering).
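As a rough illustration of the contrastive component described above, an InfoNCE-style loss over paired table and text embeddings might be sketched as follows. This is only an assumption-laden sketch: the paper's exact CMCR objective, temperature, and encoders are not specified here, and the function name `cmcr_loss` is hypothetical.

```python
import numpy as np

def cmcr_loss(table_emb, text_emb, temperature=0.07):
    """InfoNCE-style contrastive loss between paired table and text
    embeddings (a hypothetical sketch of a cross-modal contrastive
    regularizer, not the paper's exact objective).

    table_emb, text_emb: (batch, dim) arrays where row i of each is a
    positive pair; all other rows serve as in-batch negatives.
    """
    # L2-normalize so dot products are cosine similarities
    t = table_emb / np.linalg.norm(table_emb, axis=1, keepdims=True)
    x = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = (t @ x.T) / temperature  # (batch, batch) similarity matrix

    # Row-wise softmax cross-entropy with the diagonal as positives
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```

Under this sketch, well-aligned table-text pairs (high diagonal similarity) yield a low loss, while mismatched pairs yield a high one, which is the instance-wise training signal the abstract refers to.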
Related papers
- TDeLTA: A Light-weight and Robust Table Detection Method based on Learning Text Arrangement [34.73880086005418]
We propose TDeLTA, a novel, lightweight, and robust table detection method based on learning text arrangement.
To locate the tables precisely, we design a text-classification task, classifying the text blocks into 4 categories according to their semantic roles in the tables.
Compared to several state-of-the-art methods, TDeLTA achieves competitive results with only 3.1M model parameters on the large-scale public datasets.
arXiv Detail & Related papers (2023-12-18T09:18:43Z)
- TAP4LLM: Table Provider on Sampling, Augmenting, and Packing Semi-structured Data for Large Language Model Reasoning [58.11442663694328]
We propose TAP4LLM as a versatile pre-processing toolbox to generate table prompts.
In each module, we collect and design several common methods for usage in various scenarios.
arXiv Detail & Related papers (2023-12-14T15:37:04Z)
- FLIP: Towards Fine-grained Alignment between ID-based Models and Pretrained Language Models for CTR Prediction [49.510163437116645]
We propose to conduct Fine-grained feature-level ALignment between ID-based Models and Pretrained Language Models (FLIP) for click-through rate (CTR) prediction.
Specifically, the masked data of one modality (i.e., tokens or features) has to be recovered with the help of the other modality, which establishes the feature-level interaction and alignment.
Experiments on three real-world datasets demonstrate that FLIP outperforms SOTA baselines and is highly compatible with various ID-based models and PLMs.
arXiv Detail & Related papers (2023-10-30T11:25:03Z)
- Towards Table-to-Text Generation with Pretrained Language Model: A Table Structure Understanding and Text Deliberating Approach [60.03002572791552]
We propose a table structure understanding and text deliberating approach, namely TASD.
Specifically, we devise a three-layered multi-head attention network to realize the table-structure-aware text generation model.
Our approach can generate faithful and fluent descriptive texts for different types of tables.
arXiv Detail & Related papers (2023-01-05T14:03:26Z)
- OmniTab: Pretraining with Natural and Synthetic Data for Few-shot Table-based Question Answering [106.73213656603453]
We develop a simple table-based QA model with minimal annotation effort.
We propose an omnivorous pretraining approach that consumes both natural and synthetic data.
arXiv Detail & Related papers (2022-07-08T01:23:45Z)
- Table Pre-training: A Survey on Model Architectures, Pretraining Objectives, and Downstream Tasks [37.35651138851127]
A flurry of table pre-training frameworks has been proposed following the success of pre-training on text and images.
Table pre-training usually takes the form of table-text joint pre-training.
This survey aims to provide a comprehensive review of different model designs, pre-training objectives, and downstream tasks for table pre-training.
arXiv Detail & Related papers (2022-01-24T15:22:24Z)
- ST-BERT: Cross-modal Language Model Pre-training For End-to-end Spoken Language Understanding [23.367329217151084]
We introduce a cross-modal pre-trained language model, called Speech-Text BERT (ST-BERT), to tackle end-to-end spoken language understanding tasks.
Taking phoneme posterior and subword-level text as an input, ST-BERT learns a contextualized cross-modal alignment.
Our method shows further SLU performance gain via domain-adaptive pre-training with domain-specific speech-text pair data.
arXiv Detail & Related papers (2020-10-23T10:28:20Z)
- GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing [117.98107557103877]
We present GraPPa, an effective pre-training approach for table semantic parsing.
We construct synthetic question-SQL pairs over high-quality tables via a synchronous context-free grammar.
To maintain the model's ability to represent real-world data, we also include masked language modeling.
arXiv Detail & Related papers (2020-09-29T08:17:58Z)
- TAPAS: Weakly Supervised Table Parsing via Pre-training [16.661382998729067]
We present TAPAS, an approach to question answering over tables without generating logical forms.
We experiment with three different semantic parsing datasets.
We find that TAPAS outperforms or rivals semantic parsing models by improving state-of-the-art accuracy.
arXiv Detail & Related papers (2020-04-05T23:18:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.