Parameterizing Context: Unleashing the Power of Parameter-Efficient
Fine-Tuning and In-Context Tuning for Continual Table Semantic Parsing
- URL: http://arxiv.org/abs/2310.04801v1
- Date: Sat, 7 Oct 2023 13:40:41 GMT
- Title: Parameterizing Context: Unleashing the Power of Parameter-Efficient
Fine-Tuning and In-Context Tuning for Continual Table Semantic Parsing
- Authors: Yongrui Chen, Shenyu Zhang, Guilin Qi, Xinnan Guo
- Abstract summary: This paper introduces a novel method integrating parameter-efficient fine-tuning (PEFT) and in-context tuning (ICT) for training a continual table semantic parser.
The teacher addresses the few-shot problem using ICT, which procures contextual information by demonstrating a few training examples.
In turn, the student leverages the proposed PEFT framework to learn from the teacher's output distribution, and subsequently compresses and saves the contextual information to the prompts, eliminating the need to store any training examples.
- Score: 13.51721352349583
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Continual table semantic parsing aims to train a parser on a sequence of
tasks, where each task requires the parser to translate natural language into
SQL based on task-specific tables but only offers limited training examples.
Conventional methods tend to suffer from overfitting with limited supervision,
as well as catastrophic forgetting due to parameter updates. Despite recent
advancements that partially alleviate these issues through semi-supervised data
augmentation and retention of a few past examples, the performance is still
limited by the volume of unsupervised data and stored examples. To overcome
these challenges, this paper introduces a novel method integrating
parameter-efficient fine-tuning (PEFT) and in-context tuning
(ICT) for training a continual table semantic parser. Initially, we present a
task-adaptive PEFT framework capable of fully circumventing catastrophic
forgetting, which is achieved by freezing the pre-trained model backbone and
fine-tuning small-scale prompts. Building on this, we propose a teacher-student
framework-based solution. The teacher addresses the few-shot problem using ICT,
which procures contextual information by demonstrating a few training examples.
In turn, the student leverages the proposed PEFT framework to learn from the
teacher's output distribution, and subsequently compresses and saves the
contextual information to the prompts, eliminating the need to store any
training examples. Experimental evaluations on two benchmarks affirm the
superiority of our method over prevalent few-shot and continual learning
baselines across various metrics.
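To make the two ingredients of the abstract concrete, the sketch below shows (a) a task-adaptive PEFT setup where the backbone is frozen and only a small per-task prompt is trained, and (b) a teacher-student step in which the teacher conditions on a few in-context demonstrations (ICT) and the student distills that output distribution into its prompt. This is a minimal illustration under stated assumptions, not the authors' implementation: the T5 backbone, the 20-token prompt length, the loss weighting, and the helper names (student_logits, teacher_logits, distillation_step) are all choices made for the example.

```python
# Minimal sketch (assumed T5-style seq2seq backbone); not the paper's released code.
import torch
import torch.nn.functional as F
from transformers import T5ForConditionalGeneration, T5TokenizerFast

backbone = T5ForConditionalGeneration.from_pretrained("t5-base")  # assumed backbone
tokenizer = T5TokenizerFast.from_pretrained("t5-base")

# Task-adaptive PEFT: freeze every backbone parameter; only a small per-task
# prompt (here 20 virtual tokens) is trainable, so earlier tasks cannot be overwritten.
for p in backbone.parameters():
    p.requires_grad = False

PROMPT_LEN = 20  # assumed prompt length
embed_dim = backbone.config.d_model
task_prompt = torch.nn.Parameter(torch.randn(PROMPT_LEN, embed_dim) * 0.02)
optimizer = torch.optim.AdamW([task_prompt], lr=3e-4)  # assumed hyper-parameters


def student_logits(question_with_schema: str, sql_labels: torch.Tensor) -> torch.Tensor:
    """Student side: prepend the trainable prompt to the frozen model's input embeddings."""
    enc = tokenizer(question_with_schema, return_tensors="pt")
    tok_embeds = backbone.get_input_embeddings()(enc.input_ids)            # (1, T, d)
    inputs_embeds = torch.cat([task_prompt.unsqueeze(0), tok_embeds], dim=1)
    attention_mask = torch.cat(
        [torch.ones(1, PROMPT_LEN, dtype=enc.attention_mask.dtype), enc.attention_mask],
        dim=1,
    )
    out = backbone(inputs_embeds=inputs_embeds, attention_mask=attention_mask, labels=sql_labels)
    return out.logits


@torch.no_grad()
def teacher_logits(demonstrations: str, question_with_schema: str, sql_labels: torch.Tensor) -> torch.Tensor:
    """Teacher side (ICT): the same frozen model, conditioned on a few worked
    examples placed in the context instead of on any trainable parameters."""
    enc = tokenizer(demonstrations + "\n" + question_with_schema, return_tensors="pt")
    out = backbone(input_ids=enc.input_ids, attention_mask=enc.attention_mask, labels=sql_labels)
    return out.logits


def distillation_step(demos: str, question: str, sql: str) -> float:
    """One student update: match the teacher's output distribution (KL term) and fit the
    gold SQL (cross-entropy), compressing the contextual signal into the prompt."""
    labels = tokenizer(sql, return_tensors="pt").input_ids
    s_logits = student_logits(question, labels)
    t_logits = teacher_logits(demos, question, labels)
    kl = F.kl_div(F.log_softmax(s_logits, dim=-1), F.softmax(t_logits, dim=-1), reduction="batchmean")
    ce = F.cross_entropy(s_logits.view(-1, s_logits.size(-1)), labels.view(-1), ignore_index=-100)
    loss = ce + kl  # assumed equal weighting; the paper's actual objective may differ
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```

Because only task_prompt receives gradients, no backbone weights change between tasks, which is how this style of prompt tuning sidesteps catastrophic forgetting; once training ends, the demonstrations used by the teacher can be discarded, since their effect has been absorbed into the prompt.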
Related papers
- Context-aware Prompt Tuning: Advancing In-Context Learning with Adversarial Methods [69.36397993451742]
This work introduces Context-aware Prompt Tuning (CPT), a method inspired by ICL, PT, and adversarial attacks.
We modify specific context tokens, considering the unique structure of input and output formats.
Inspired by adversarial attacks, we adjust the input based on the labels present in the context, focusing on minimizing, rather than maximizing, the loss.
arXiv Detail & Related papers (2024-10-22T17:45:47Z) - Semantic Parsing in Limited Resource Conditions [19.689433249830465]
The thesis explores challenges in semantic parsing, specifically focusing on scenarios with limited data and computational resources.
It offers solutions using techniques like automatic data curation, knowledge transfer, active learning, and continual learning.
arXiv Detail & Related papers (2023-09-14T05:03:09Z) - How Does In-Context Learning Help Prompt Tuning? [55.78535874154915]
Fine-tuning large language models is becoming ever more impractical due to their rapidly-growing scale.
This motivates the use of parameter-efficient adaptation methods such as prompt tuning (PT), which adds a small number of tunable embeddings to an otherwise frozen model.
Recently, Singhal et al. (2022) propose "instruction prompt tuning" (IPT), which combines PT with ICL by concatenating a natural language demonstration with learned prompt embeddings.
arXiv Detail & Related papers (2023-02-22T17:45:12Z) - Stabilized In-Context Learning with Pre-trained Language Models for Few
Shot Dialogue State Tracking [57.92608483099916]
Large pre-trained language models (PLMs) have shown impressive unaided performance across many NLP tasks.
For more complex tasks such as dialogue state tracking (DST), designing prompts that reliably convey the desired intent is nontrivial.
We introduce a saliency model to limit dialogue text length, allowing us to include more exemplars per query.
arXiv Detail & Related papers (2023-02-12T15:05:10Z) - Dynamic Prompt Learning via Policy Gradient for Semi-structured
Mathematical Reasoning [150.17907456113537]
We present Tabular Math Word Problems (TabMWP), a new dataset containing 38,431 grade-level problems that require mathematical reasoning.
We evaluate different pre-trained models on TabMWP, including the GPT-3 model in a few-shot setting.
We propose a novel approach, PromptPG, which utilizes policy gradient to learn to select in-context examples from a small amount of training data.
arXiv Detail & Related papers (2022-09-29T08:01:04Z) - Leveraging Natural Supervision for Language Representation Learning and
Generation [8.083109555490475]
We describe three lines of work that seek to improve the training and evaluation of neural models using naturally-occurring supervision.
We first investigate self-supervised training losses to help enhance the performance of pretrained language models for various NLP tasks.
We propose a framework that uses paraphrase pairs to disentangle semantics and syntax in sentence representations.
arXiv Detail & Related papers (2022-07-21T17:26:03Z) - Instance-wise Prompt Tuning for Pretrained Language Models [72.74916121511662]
Instance-wise Prompt Tuning (IPT) is the first prompt learning paradigm that injects knowledge from the input data instances to the prompts.
IPT significantly outperforms task-based prompt learning methods, and achieves comparable performance to conventional finetuning with only 0.5% - 1.5% of tuned parameters.
arXiv Detail & Related papers (2022-06-04T10:08:50Z) - Making Pre-trained Language Models End-to-end Few-shot Learners with
Contrastive Prompt Tuning [41.15017636192417]
We present CP-Tuning, the first end-to-end Contrastive Prompt Tuning framework for fine-tuning Language Models.
It is integrated with the task-invariant continuous prompt encoding technique with fully trainable prompt parameters.
Experiments over a variety of language understanding tasks used in IR systems and different PLMs show that CP-Tuning outperforms state-of-the-art methods.
arXiv Detail & Related papers (2022-04-01T02:24:24Z) - In-Context Learning for Few-Shot Dialogue State Tracking [55.91832381893181]
We propose an in-context (IC) learning framework for few-shot dialogue state tracking (DST).
A large pre-trained language model (LM) takes a test instance and a few annotated examples as input, and directly decodes the dialogue states without any parameter updates.
This makes the LM more flexible and scalable compared to prior few-shot DST work when adapting to new domains and scenarios.
arXiv Detail & Related papers (2022-03-16T11:58:24Z) - Hierarchical Multitask Learning Approach for BERT [0.36525095710982913]
BERT learns embeddings by solving two tasks: masked language modeling (masked LM) and next sentence prediction (NSP).
We adopt hierarchical multitask learning approaches for BERT pre-training.
Our results show that imposing a task hierarchy in pre-training improves the performance of embeddings.
arXiv Detail & Related papers (2020-10-17T09:23:04Z)