Unfreeze with Care: Space-Efficient Fine-Tuning of Semantic Parsing Models
- URL: http://arxiv.org/abs/2203.02652v1
- Date: Sat, 5 Mar 2022 04:30:03 GMT
- Title: Unfreeze with Care: Space-Efficient Fine-Tuning of Semantic Parsing Models
- Authors: Weiqi Sun, Haidar Khan, Nicolas Guenon des Mesnards, Melanie Rubino, Konstantine Arkoudas
- Abstract summary: We examine two promising techniques, prefix tuning and bias-term tuning, specifically on semantic parsing.
We compare them against each other on two different semantic parsing datasets, and we also compare them against full and partial fine-tuning, both in few-shot and conventional data settings.
While prefix tuning is shown to do poorly for semantic parsing tasks off the shelf, we modify it by adding special token embeddings, which results in very strong performance without compromising parameter savings.
- Score: 5.893781742558463
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantic parsing is a key NLP task that maps natural language to structured
meaning representations. As in many other NLP tasks, SOTA performance in
semantic parsing is now attained by fine-tuning a large pretrained language
model (PLM). While effective, this approach is inefficient in the presence of
multiple downstream tasks, as a new set of values for all parameters of the PLM
needs to be stored for each task separately. Recent work has explored methods
for adapting PLMs to downstream tasks while keeping most (or all) of their
parameters frozen. We examine two such promising techniques, prefix tuning and
bias-term tuning, specifically on semantic parsing. We compare them against
each other on two different semantic parsing datasets, and we also compare them
against full and partial fine-tuning, both in few-shot and conventional data
settings. While prefix tuning is shown to do poorly for semantic parsing tasks
off the shelf, we modify it by adding special token embeddings, which results
in very strong performance without compromising parameter savings.
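The contrast between full fine-tuning and bias-term tuning can be sketched in a few lines. The toy model dictionary, parameter names, and sizes below are illustrative placeholders, not the paper's actual architecture; the point is simply that only bias vectors are selected for training and per-task storage, while the weight matrices stay frozen and shared across tasks.

```python
# Minimal sketch of bias-term tuning: freeze every weight matrix and
# update only the bias vectors, so per-task storage is a tiny fraction
# of the full parameter count. Names and shapes are illustrative.

def trainable_parameters(named_params):
    """Select only bias terms for training; everything else stays frozen."""
    return {name: p for name, p in named_params.items() if name.endswith(".bias")}

# A toy "pretrained model": parameter name -> flat list of values.
model = {
    "encoder.layer0.weight": [0.1] * 1024,
    "encoder.layer0.bias":   [0.0] * 32,
    "decoder.layer0.weight": [0.2] * 1024,
    "decoder.layer0.bias":   [0.0] * 32,
}

tuned = trainable_parameters(model)
total = sum(len(v) for v in model.values())
stored = sum(len(v) for v in tuned.values())
print(sorted(tuned))                         # only the two bias vectors
print(f"{stored}/{total} parameters stored per task")
```

Full fine-tuning would require storing all 2112 toy parameters per downstream task; selecting only biases cuts that to 64 here, which mirrors the space savings the paper targets.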
Related papers
- PIP: Parse-Instructed Prefix for Syntactically Controlled Paraphrase
Generation [61.05254852400895]
Parse-Instructed Prefix (PIP) is a novel adaptation of prefix-tuning to tune large pre-trained language models.
In contrast to traditional fine-tuning methods for this task, PIP is a compute-efficient alternative with 10 times fewer learnable parameters.
arXiv Detail & Related papers (2023-05-26T07:42:38Z)
- Towards Adaptive Prefix Tuning for Parameter-Efficient Language Model Fine-tuning [32.84435258519842]
We propose Adaptive Prefix Tuning (APT) to adjust the prefix in terms of both fine-grained token level and coarse-grained layer level with a gate mechanism.
Experiments on the SuperGLUE and NER datasets show the effectiveness of APT.
arXiv Detail & Related papers (2023-05-24T14:51:01Z)
- Pareto Manifold Learning: Tackling multiple tasks via ensembles of single-task models [50.33956216274694]
In Multi-Task Learning (MTL), tasks may compete and limit the performance achieved on each other, rather than guiding the optimization to a solution.
We propose Pareto Manifold Learning, an ensembling method in weight space.
arXiv Detail & Related papers (2022-10-18T11:20:54Z)
- Parameter-Efficient Tuning with Special Token Adaptation [25.37998979962568]
PASTA achieves comparable performance to fine-tuning in natural language understanding tasks.
Our work demonstrates the pivotal role of special tokens in pretrained language models.
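The role of special tokens can be illustrated with a toy sketch in the spirit of this entry. The token names, dimensions, and the `adapt_hidden_states` helper below are hypothetical, not PASTA's implementation: a single trainable offset is added to the hidden state only at special-token positions, while every other position and the backbone itself stay frozen.

```python
# Hedged sketch of special-token adaptation: a trainable offset vector
# is added only where a special token (e.g. [CLS], [SEP]) occurs.
# Toy dimensions; not the paper's actual setup.

SPECIAL = {"[CLS]", "[SEP]"}

def adapt_hidden_states(tokens, hidden, offset):
    """Add `offset` to the hidden vector of every special-token position."""
    out = []
    for tok, vec in zip(tokens, hidden):
        if tok in SPECIAL:
            out.append([h + o for h, o in zip(vec, offset)])
        else:
            out.append(list(vec))
    return out

tokens = ["[CLS]", "book", "a", "flight", "[SEP]"]
hidden = [[0.0, 0.0]] * 5          # frozen backbone output (dim = 2)
offset = [0.5, -0.5]               # the only trainable parameters
adapted = adapt_hidden_states(tokens, hidden, offset)
print(adapted[0], adapted[1])      # special position shifted, others untouched
```

Per task, only the offset vector (2 values here) would need to be stored, which is how a special-token scheme keeps the parameter budget small.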
arXiv Detail & Related papers (2022-10-10T01:02:51Z)
- Prompt-Matched Semantic Segmentation [96.99924127527002]
The objective of this work is to explore how to effectively adapt pre-trained foundation models to various downstream tasks of image semantic segmentation.
We propose a novel Inter-Stage Prompt-Matched Framework, which maintains the original structure of the foundation model while generating visual prompts adaptively for task-oriented tuning.
A lightweight module termed Semantic-aware Prompt Matcher is then introduced to hierarchically interpolate between two stages to learn reasonable prompts for each specific task.
arXiv Detail & Related papers (2022-08-22T09:12:53Z)
- Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning [81.3514358542452]
Few-shot in-context learning (ICL) incurs substantial computational, memory, and storage costs because it involves processing all of the training examples every time a prediction is made.
Parameter-efficient fine-tuning offers an alternative paradigm in which a small set of parameters is trained to enable a model to perform the new task.
In this paper, we rigorously compare few-shot ICL and parameter-efficient fine-tuning and demonstrate that the latter offers better accuracy as well as dramatically lower computational costs.
arXiv Detail & Related papers (2022-05-11T17:10:41Z)
- Training Naturalized Semantic Parsers with Very Little Data [10.709587018625275]
State-of-the-art (SOTA) semantic parsers are seq2seq architectures based on large language models that have been pretrained on vast amounts of text.
Recent work has explored a reformulation of semantic parsing whereby the output sequences are themselves natural language sentences.
We show that this method delivers new SOTA few-shot performance on the Overnight dataset.
arXiv Detail & Related papers (2022-04-29T17:14:54Z)
- Prefix-Tuning: Optimizing Continuous Prompts for Generation [85.6357778621526]
Fine-tuning is the de facto way to leverage large pretrained language models to perform downstream tasks.
We propose prefix-tuning, a lightweight alternative to fine-tuning for natural language generation tasks.
We find that by learning only 0.1% of the parameters, prefix-tuning obtains comparable performance in the full data setting.
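A minimal sketch of the core idea, under the simplifying assumption that attention operates on plain Python lists of key/value vectors (the `with_prefix` helper and all shapes are illustrative, not the paper's code): trainable prefix vectors are prepended to each layer's keys and values, so the frozen model attends to them as if they were extra tokens, and only the prefix is stored per task.

```python
# Minimal sketch of prefix-tuning: trainable "virtual token" vectors are
# prepended to the keys and values seen by an attention layer while the
# backbone weights stay frozen. Shapes and names are illustrative.

def with_prefix(prefix_kv, keys, values):
    """Prepend trainable prefix key/value vectors to the real sequence."""
    p_keys, p_values = prefix_kv
    return p_keys + keys, p_values + values

# Frozen activations for a 3-token input (dim = 2).
keys   = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[0.1, 0.1], [0.2, 0.2], [0.3, 0.3]]

# Two trainable prefix positions; only these are stored per task.
prefix = ([[0.5, 0.5], [0.4, 0.4]], [[0.9, 0.9], [0.8, 0.8]])

k, v = with_prefix(prefix, keys, values)
print(len(k), len(v))   # sequence grows by the prefix length
```

The attended sequence grows from 3 to 5 positions, but the per-task footprint is just the prefix vectors, consistent with the 0.1%-of-parameters figure reported above.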
arXiv Detail & Related papers (2021-01-01T08:00:36Z)
- WARP: Word-level Adversarial ReProgramming [13.08689221166729]
In many applications it is preferable to tune much smaller sets of parameters, so that the majority of parameters can be shared across multiple tasks.
We present an alternative approach based on adversarial reprogramming, which extends earlier work on automatic prompt generation.
We show that this approach outperforms other methods with a similar number of trainable parameters on SST-2 and MNLI datasets.
arXiv Detail & Related papers (2021-01-01T00:41:03Z)
- Parameter-Efficient Transfer Learning with Diff Pruning [108.03864629388404]
Diff pruning is a simple approach to enable parameter-efficient transfer learning within the pretrain-finetune framework.
We find that models finetuned with diff pruning can match the performance of fully finetuned baselines on the GLUE benchmark.
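The storage story behind diff pruning can be sketched with toy numbers (the `apply_diff` helper and the values below are hypothetical, not the paper's implementation): each task stores only a sparse "diff" over the frozen pretrained parameters, and the task-specific model is reconstructed by adding the diff back at load time.

```python
# Hedged sketch of diff pruning: per-task storage is a sparse diff
# vector; the shared pretrained parameters stay frozen and are reused
# by every task. Toy values, purely for illustration.

def apply_diff(pretrained, sparse_diff):
    """Reconstruct task parameters from frozen weights + sparse diff."""
    params = list(pretrained)          # copy; never mutate the shared weights
    for idx, delta in sparse_diff.items():
        params[idx] += delta
    return params

pretrained = [1.0, 2.0, 3.0, 4.0, 5.0]   # shared, frozen
sparse_diff = {1: 0.5, 4: -1.0}          # per-task storage: 2 entries

task_params = apply_diff(pretrained, sparse_diff)
print(task_params)
```

Only the two nonzero diff entries need to be stored for this task; a second task would store its own sparse diff against the same frozen base.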
arXiv Detail & Related papers (2020-12-14T12:34:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.