Parameterizing Context: Unleashing the Power of Parameter-Efficient
Fine-Tuning and In-Context Tuning for Continual Table Semantic Parsing
- URL: http://arxiv.org/abs/2310.04801v1
- Date: Sat, 7 Oct 2023 13:40:41 GMT
- Title: Parameterizing Context: Unleashing the Power of Parameter-Efficient
Fine-Tuning and In-Context Tuning for Continual Table Semantic Parsing
- Authors: Yongrui Chen, Shenyu Zhang, Guilin Qi, Xinnan Guo
- Abstract summary: This paper introduces a novel method integrating parameter-efficient fine-tuning (PEFT) and in-context tuning (ICT) for training a continual table semantic parser.
The teacher addresses the few-shot problem using ICT, which procures contextual information by demonstrating a few training examples.
In turn, the student leverages the proposed PEFT framework to learn from the teacher's output distribution, and subsequently compresses and saves the contextual information to the prompts, eliminating the need to store any training examples.
- Score: 13.51721352349583
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Continual table semantic parsing aims to train a parser on a sequence of
tasks, where each task requires the parser to translate natural language into
SQL based on task-specific tables but only offers limited training examples.
Conventional methods tend to suffer from overfitting with limited supervision,
as well as catastrophic forgetting due to parameter updates. Despite recent
advancements that partially alleviate these issues through semi-supervised data
augmentation and retention of a few past examples, the performance is still
limited by the volume of unsupervised data and stored examples. To overcome
these challenges, this paper introduces a novel method integrating
parameter-efficient fine-tuning (PEFT) and in-context tuning
(ICT) for training a continual table semantic parser. Initially, we present a
task-adaptive PEFT framework capable of fully circumventing catastrophic
forgetting, which is achieved by freezing the pre-trained model backbone and
fine-tuning small-scale prompts. Building on this, we propose a teacher-student
framework-based solution. The teacher addresses the few-shot problem using ICT,
which procures contextual information by demonstrating a few training examples.
In turn, the student leverages the proposed PEFT framework to learn from the
teacher's output distribution, and subsequently compresses and saves the
contextual information to the prompts, eliminating the need to store any
training examples. Experimental evaluations on two benchmarks affirm the
superiority of our method over prevalent few-shot and continual learning
baselines across various metrics.
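To make the two ingredients of the abstract concrete, the sketch below shows (a) a task-adaptive PEFT setup where the backbone is frozen and only a small per-task prompt is trained, and (b) a teacher-student step in which the teacher conditions on a few in-context demonstrations (ICT) and the student distills that output distribution into its prompt. This is a minimal illustration under stated assumptions, not the authors' implementation: the T5 backbone, the 20-token prompt length, the loss weighting, and the helper names (student_logits, teacher_logits, distillation_step) are all choices made for the example.

```python
# Minimal sketch (assumed T5-style seq2seq backbone); not the paper's released code.
import torch
import torch.nn.functional as F
from transformers import T5ForConditionalGeneration, T5TokenizerFast

backbone = T5ForConditionalGeneration.from_pretrained("t5-base")  # assumed backbone
tokenizer = T5TokenizerFast.from_pretrained("t5-base")

# Task-adaptive PEFT: freeze every backbone parameter; only a small per-task
# prompt (here 20 virtual tokens) is trainable, so earlier tasks cannot be overwritten.
for p in backbone.parameters():
    p.requires_grad = False

PROMPT_LEN = 20  # assumed prompt length
embed_dim = backbone.config.d_model
task_prompt = torch.nn.Parameter(torch.randn(PROMPT_LEN, embed_dim) * 0.02)
optimizer = torch.optim.AdamW([task_prompt], lr=3e-4)  # assumed hyper-parameters


def student_logits(question_with_schema: str, sql_labels: torch.Tensor) -> torch.Tensor:
    """Student side: prepend the trainable prompt to the frozen model's input embeddings."""
    enc = tokenizer(question_with_schema, return_tensors="pt")
    tok_embeds = backbone.get_input_embeddings()(enc.input_ids)            # (1, T, d)
    inputs_embeds = torch.cat([task_prompt.unsqueeze(0), tok_embeds], dim=1)
    attention_mask = torch.cat(
        [torch.ones(1, PROMPT_LEN, dtype=enc.attention_mask.dtype), enc.attention_mask],
        dim=1,
    )
    out = backbone(inputs_embeds=inputs_embeds, attention_mask=attention_mask, labels=sql_labels)
    return out.logits


@torch.no_grad()
def teacher_logits(demonstrations: str, question_with_schema: str, sql_labels: torch.Tensor) -> torch.Tensor:
    """Teacher side (ICT): the same frozen model, conditioned on a few worked
    examples placed in the context instead of on any trainable parameters."""
    enc = tokenizer(demonstrations + "\n" + question_with_schema, return_tensors="pt")
    out = backbone(input_ids=enc.input_ids, attention_mask=enc.attention_mask, labels=sql_labels)
    return out.logits


def distillation_step(demos: str, question: str, sql: str) -> float:
    """One student update: match the teacher's output distribution (KL term) and fit the
    gold SQL (cross-entropy), compressing the contextual signal into the prompt."""
    labels = tokenizer(sql, return_tensors="pt").input_ids
    s_logits = student_logits(question, labels)
    t_logits = teacher_logits(demos, question, labels)
    kl = F.kl_div(F.log_softmax(s_logits, dim=-1), F.softmax(t_logits, dim=-1), reduction="batchmean")
    ce = F.cross_entropy(s_logits.view(-1, s_logits.size(-1)), labels.view(-1), ignore_index=-100)
    loss = ce + kl  # assumed equal weighting; the paper's actual objective may differ
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```

Because only task_prompt receives gradients, no backbone weights change between tasks, which is how this style of prompt tuning sidesteps catastrophic forgetting; once training ends, the demonstrations used by the teacher can be discarded, since their effect has been absorbed into the prompt.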
Related papers
- Context-aware Prompt Tuning: Advancing In-Context Learning with Adversarial Methods [69.36397993451742]
This work introduces Context-aware Prompt Tuning (CPT), a method inspired by ICL, PT, and adversarial attacks.
We modify specific context tokens, considering the unique structure of input and output formats.
Inspired by adversarial attacks, we adjust the input based on the labels present in the context, focusing on minimizing, rather than maximizing, the loss.
arXiv Detail & Related papers (2024-10-22T17:45:47Z) - Semantic Parsing in Limited Resource Conditions [19.689433249830465]
The thesis explores challenges in semantic parsing, specifically focusing on scenarios with limited data and computational resources.
It offers solutions using techniques like automatic data curation, knowledge transfer, active learning, and continual learning.
arXiv Detail & Related papers (2023-09-14T05:03:09Z) - How Does In-Context Learning Help Prompt Tuning? [55.78535874154915]
Fine-tuning large language models is becoming ever more impractical due to their rapidly-growing scale.
This motivates the use of parameter-efficient adaptation methods such as prompt tuning (PT), which adds a small number of tunable embeddings to an otherwise frozen model.
Recently, Singhal et al. (2022) propose "instruction prompt tuning" (IPT), which combines PT with ICL by concatenating a natural language demonstration with learned prompt embeddings.
arXiv Detail & Related papers (2023-02-22T17:45:12Z) - Stabilized In-Context Learning with Pre-trained Language Models for Few
Shot Dialogue State Tracking [57.92608483099916]
Large pre-trained language models (PLMs) have shown impressive unaided performance across many NLP tasks.
For more complex tasks such as dialogue state tracking (DST), designing prompts that reliably convey the desired intent is nontrivial.
We introduce a saliency model to limit dialogue text length, allowing us to include more exemplars per query.
arXiv Detail & Related papers (2023-02-12T15:05:10Z) - Dynamic Prompt Learning via Policy Gradient for Semi-structured
Mathematical Reasoning [150.17907456113537]
We present Tabular Math Word Problems (TabMWP), a new dataset containing 38,431 grade-level problems that require mathematical reasoning.
We evaluate different pre-trained models on TabMWP, including the GPT-3 model in a few-shot setting.
We propose a novel approach, PromptPG, which utilizes policy gradient to learn to select in-context examples from a small amount of training data.
arXiv Detail & Related papers (2022-09-29T08:01:04Z) - Leveraging Natural Supervision for Language Representation Learning and
Generation [8.083109555490475]
We describe three lines of work that seek to improve the training and evaluation of neural models using naturally-occurring supervision.
We first investigate self-supervised training losses to help enhance the performance of pretrained language models for various NLP tasks.
We propose a framework that uses paraphrase pairs to disentangle semantics and syntax in sentence representations.
arXiv Detail & Related papers (2022-07-21T17:26:03Z) - Instance-wise Prompt Tuning for Pretrained Language Models [72.74916121511662]
Instance-wise Prompt Tuning (IPT) is the first prompt learning paradigm that injects knowledge from the input data instances to the prompts.
IPT significantly outperforms task-based prompt learning methods, and achieves comparable performance to conventional finetuning with only 0.5% - 1.5% of tuned parameters.
arXiv Detail & Related papers (2022-06-04T10:08:50Z) - Making Pre-trained Language Models End-to-end Few-shot Learners with
Contrastive Prompt Tuning [41.15017636192417]
We present CP-Tuning, the first end-to-end Contrastive Prompt Tuning framework for fine-tuning Language Models.
It is integrated with the task-invariant continuous prompt encoding technique with fully trainable prompt parameters.
Experiments over a variety of language understanding tasks used in IR systems and different PLMs show that CP-Tuning outperforms state-of-the-art methods.
arXiv Detail & Related papers (2022-04-01T02:24:24Z) - In-Context Learning for Few-Shot Dialogue State Tracking [55.91832381893181]
We propose an in-context (IC) learning framework for few-shot dialogue state tracking (DST).
A large pre-trained language model (LM) takes a test instance and a few annotated examples as input, and directly decodes the dialogue states without any parameter updates.
This makes the LM more flexible and scalable compared to prior few-shot DST work when adapting to new domains and scenarios.
arXiv Detail & Related papers (2022-03-16T11:58:24Z) - Hierarchical Multitask Learning Approach for BERT [0.36525095710982913]
BERT learns embeddings by solving two tasks: masked language modeling (masked LM) and next sentence prediction (NSP).
We adopt hierarchical multitask learning approaches for BERT pre-training.
Our results show that imposing a task hierarchy in pre-training improves the performance of embeddings.
arXiv Detail & Related papers (2020-10-17T09:23:04Z)