Testing the Limits of Unified Sequence to Sequence LLM Pretraining on
Diverse Table Data Tasks
- URL: http://arxiv.org/abs/2310.00789v1
- Date: Sun, 1 Oct 2023 21:06:15 GMT
- Title: Testing the Limits of Unified Sequence to Sequence LLM Pretraining on
Diverse Table Data Tasks
- Authors: Soumajyoti Sarkar, Leonard Lausen
- Abstract summary: We study the advantages of a unified approach to table specific pretraining when scaled from 770M to 11B sequence to sequence models.
Our work is the first attempt at studying the advantages of a unified approach to table specific pretraining when scaled from 770M to 11B sequence to sequence models.
- Score: 2.690048852269647
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Tables stored in databases and tables which are present in web pages and
articles account for a large part of semi-structured data that is available on
the internet. It then becomes pertinent to develop a modeling approach with
large language models (LLMs) that can be used to solve diverse table tasks such
as semantic parsing, question answering as well as classification problems.
Traditionally, there existed separate models specialized for each task
individually. It raises the question of how far can we go to build a unified
model that works well on some table tasks without significant degradation on
others. To that end, we attempt at creating a shared modeling approach in the
pretraining stage with encoder-decoder style LLMs that can cater to diverse
tasks. We evaluate our approach that continually pretrains and finetunes
different model families of T5 with data from tables and surrounding context,
on these downstream tasks at different model scales. Through multiple ablation
studies, we observe that our pretraining with self-supervised objectives can
significantly boost the performance of the models on these tasks. As an example
of one improvement, we observe that the instruction finetuned public models
which come specialized on text question answering (QA) and have been trained on
table data still have room for improvement when it comes to table specific QA.
Our work is the first attempt at studying the advantages of a unified approach
to table specific pretraining when scaled from 770M to 11B sequence to sequence
models while also comparing the instruction finetuned variants of the models.
Related papers
- Towards Better Understanding Table Instruction Tuning: Decoupling the Effects from Data versus Models [62.47618742274461]
We fine-tune base models from the Mistral, OLMo, and Phi families on existing public training datasets.
Our replication achieves performance on par with or surpassing existing table LLMs.
We decouple the contributions of training data and the base model, providing insight into their individual impacts.
arXiv Detail & Related papers (2025-01-24T18:50:26Z) - Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains [114.76612918465948]
Large language models (LLMs) have achieved remarkable performance in recent years but are fundamentally limited by the underlying training data.
We propose a complementary approach towards self-improvement where finetuning is applied to a multiagent society of language models.
arXiv Detail & Related papers (2025-01-10T04:35:46Z) - TableLlama: Towards Open Large Generalist Models for Tables [22.56558262472516]
This paper makes the first step towards developing open-source large language models (LLMs) as generalists for a diversity of table-based tasks.
We construct TableInstruct, a new dataset with a variety of realistic tables and tasks, for instruction tuning and evaluating LLMs.
We further develop the first open-source generalist model for tables, TableLlama, by fine-tuning Llama 2 (7B) with LongLoRA to address the long context challenge.
arXiv Detail & Related papers (2023-11-15T18:47:52Z) - UniMASK: Unified Inference in Sequential Decision Problems [17.09745648221254]
We introduce the UniMASK framework, which provides a unified way to specify models which can be trained on many different sequential decision-making tasks.
A single UniMASK model is often capable of carrying out many tasks with performance similar to or better than single-task models.
arXiv Detail & Related papers (2022-11-20T04:54:49Z) - OmniTab: Pretraining with Natural and Synthetic Data for Few-shot
Table-based Question Answering [106.73213656603453]
We develop a simple table-based QA model with minimal annotation effort.
We propose an omnivorous pretraining approach that consumes both natural and synthetic data.
arXiv Detail & Related papers (2022-07-08T01:23:45Z) - Unifying Language Learning Paradigms [96.35981503087567]
We present a unified framework for pre-training models that are universally effective across datasets and setups.
We show how different pre-training objectives can be cast as one another and how interpolating between different objectives can be effective.
Our model also achieve strong results at in-context learning, outperforming 175B GPT-3 on zero-shot SuperGLUE and tripling the performance of T5-XXL on one-shot summarization.
arXiv Detail & Related papers (2022-05-10T19:32:20Z) - Making Table Understanding Work in Practice [9.352813774921655]
We discuss three challenges of deploying table understanding models and propose a framework to address them.
We present SigmaTyper which encapsulates a hybrid model trained on GitTables and integrates a lightweight human-in-the-loop approach to customize the model.
arXiv Detail & Related papers (2021-09-11T03:38:24Z) - Learning to Perturb Word Embeddings for Out-of-distribution QA [55.103586220757464]
We propose a simple yet effective DA method based on a noise generator, which learns to perturb the word embedding of the input questions and context without changing their semantics.
We validate the performance of the QA models trained with our word embedding on a single source dataset, on five different target domains.
Notably, the model trained with ours outperforms the model trained with more than 240K artificially generated QA pairs.
arXiv Detail & Related papers (2021-05-06T14:12:26Z) - What do we expect from Multiple-choice QA Systems? [70.86513724662302]
We consider a top performing model on several Multiple Choice Question Answering (MCQA) datasets.
We evaluate it against a set of expectations one might have from such a model, using a series of zero-information perturbations of the model's inputs.
arXiv Detail & Related papers (2020-11-20T21:27:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.