Unfreeze with Care: Space-Efficient Fine-Tuning of Semantic Parsing Models
- URL: http://arxiv.org/abs/2203.02652v1
- Date: Sat, 5 Mar 2022 04:30:03 GMT
- Title: Unfreeze with Care: Space-Efficient Fine-Tuning of Semantic Parsing Models
- Authors: Weiqi Sun, Haidar Khan, Nicolas Guenon des Mesnards, Melanie Rubino, Konstantine Arkoudas
- Abstract summary: We examine two promising techniques, prefix tuning and bias-term tuning, specifically on semantic parsing.
We compare them against each other on two different semantic parsing datasets, and we also compare them against full and partial fine-tuning, both in few-shot and conventional data settings.
While prefix tuning is shown to do poorly for semantic parsing tasks off the shelf, we modify it by adding special token embeddings, which results in very strong performance without compromising parameter savings.
- Score: 5.893781742558463
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantic parsing is a key NLP task that maps natural language to structured
meaning representations. As in many other NLP tasks, SOTA performance in
semantic parsing is now attained by fine-tuning a large pretrained language
model (PLM). While effective, this approach is inefficient in the presence of
multiple downstream tasks, as a new set of values for all parameters of the PLM
needs to be stored for each task separately. Recent work has explored methods
for adapting PLMs to downstream tasks while keeping most (or all) of their
parameters frozen. We examine two such promising techniques, prefix tuning and
bias-term tuning, specifically on semantic parsing. We compare them against
each other on two different semantic parsing datasets, and we also compare them
against full and partial fine-tuning, both in few-shot and conventional data
settings. While prefix tuning is shown to do poorly for semantic parsing tasks
off the shelf, we modify it by adding special token embeddings, which results
in very strong performance without compromising parameter savings.
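The contrast between full fine-tuning and bias-term tuning can be sketched in a few lines. The toy model dictionary, parameter names, and sizes below are illustrative placeholders, not the paper's actual architecture; the point is simply that only bias vectors are selected for training and per-task storage, while the weight matrices stay frozen and shared across tasks.

```python
# Minimal sketch of bias-term tuning: freeze every weight matrix and
# update only the bias vectors, so per-task storage is a tiny fraction
# of the full parameter count. Names and shapes are illustrative.

def trainable_parameters(named_params):
    """Select only bias terms for training; everything else stays frozen."""
    return {name: p for name, p in named_params.items() if name.endswith(".bias")}

# A toy "pretrained model": parameter name -> flat list of values.
model = {
    "encoder.layer0.weight": [0.1] * 1024,
    "encoder.layer0.bias":   [0.0] * 32,
    "decoder.layer0.weight": [0.2] * 1024,
    "decoder.layer0.bias":   [0.0] * 32,
}

tuned = trainable_parameters(model)
total = sum(len(v) for v in model.values())
stored = sum(len(v) for v in tuned.values())
print(sorted(tuned))                         # only the two bias vectors
print(f"{stored}/{total} parameters stored per task")
```

Full fine-tuning would require storing all 2112 toy parameters per downstream task; selecting only biases cuts that to 64 here, which mirrors the space savings the paper targets.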
Related papers
- PIP: Parse-Instructed Prefix for Syntactically Controlled Paraphrase
Generation [61.05254852400895]
Parse-Instructed Prefix (PIP) is a novel adaptation of prefix-tuning to tune large pre-trained language models.
In contrast to traditional fine-tuning methods for this task, PIP is a compute-efficient alternative with 10 times fewer learnable parameters.
arXiv Detail & Related papers (2023-05-26T07:42:38Z)
- Towards Adaptive Prefix Tuning for Parameter-Efficient Language Model Fine-tuning [32.84435258519842]
We propose Adaptive Prefix Tuning (APT) to adjust the prefix in terms of both fine-grained token level and coarse-grained layer level with a gate mechanism.
Experiments on the SuperGLUE and NER datasets show the effectiveness of APT.
arXiv Detail & Related papers (2023-05-24T14:51:01Z)
- Pareto Manifold Learning: Tackling multiple tasks via ensembles of single-task models [50.33956216274694]
In Multi-Task Learning (MTL), tasks may compete and limit the performance achieved on each other, rather than guiding the optimization to a solution.
We propose Pareto Manifold Learning, an ensembling method in weight space.
arXiv Detail & Related papers (2022-10-18T11:20:54Z)
- Parameter-Efficient Tuning with Special Token Adaptation [25.37998979962568]
PASTA achieves comparable performance to fine-tuning in natural language understanding tasks.
Our work demonstrates the pivotal role of special tokens in pretrained language models.
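The role of special tokens can be illustrated with a toy sketch in the spirit of this entry. The token names, dimensions, and the `adapt_hidden_states` helper below are hypothetical, not PASTA's implementation: a single trainable offset is added to the hidden state only at special-token positions, while every other position and the backbone itself stay frozen.

```python
# Hedged sketch of special-token adaptation: a trainable offset vector
# is added only where a special token (e.g. [CLS], [SEP]) occurs.
# Toy dimensions; not the paper's actual setup.

SPECIAL = {"[CLS]", "[SEP]"}

def adapt_hidden_states(tokens, hidden, offset):
    """Add `offset` to the hidden vector of every special-token position."""
    out = []
    for tok, vec in zip(tokens, hidden):
        if tok in SPECIAL:
            out.append([h + o for h, o in zip(vec, offset)])
        else:
            out.append(list(vec))
    return out

tokens = ["[CLS]", "book", "a", "flight", "[SEP]"]
hidden = [[0.0, 0.0]] * 5          # frozen backbone output (dim = 2)
offset = [0.5, -0.5]               # the only trainable parameters
adapted = adapt_hidden_states(tokens, hidden, offset)
print(adapted[0], adapted[1])      # special position shifted, others untouched
```

Per task, only the offset vector (2 values here) would need to be stored, which is how a special-token scheme keeps the parameter budget small.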
arXiv Detail & Related papers (2022-10-10T01:02:51Z)
- Prompt-Matched Semantic Segmentation [96.99924127527002]
The objective of this work is to explore how to effectively adapt pre-trained foundation models to various downstream tasks of image semantic segmentation.
We propose a novel Inter-Stage Prompt-Matched Framework, which maintains the original structure of the foundation model while generating visual prompts adaptively for task-oriented tuning.
A lightweight module termed Semantic-aware Prompt Matcher is then introduced to hierarchically interpolate between two stages to learn reasonable prompts for each specific task.
arXiv Detail & Related papers (2022-08-22T09:12:53Z)
- Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning [81.3514358542452]
Few-shot in-context learning (ICL) incurs substantial computational, memory, and storage costs because it involves processing all of the training examples every time a prediction is made.
Parameter-efficient fine-tuning offers an alternative paradigm in which a small set of parameters is trained to enable a model to perform the new task.
In this paper, we rigorously compare few-shot ICL and parameter-efficient fine-tuning and demonstrate that the latter offers better accuracy as well as dramatically lower computational costs.
arXiv Detail & Related papers (2022-05-11T17:10:41Z)
- Training Naturalized Semantic Parsers with Very Little Data [10.709587018625275]
State-of-the-art (SOTA) semantic parsers are seq2seq architectures based on large language models that have been pretrained on vast amounts of text.
Recent work has explored a reformulation of semantic parsing whereby the output sequences are themselves natural language sentences.
We show that this method delivers new SOTA few-shot performance on the Overnight dataset.
arXiv Detail & Related papers (2022-04-29T17:14:54Z)
- Prefix-Tuning: Optimizing Continuous Prompts for Generation [85.6357778621526]
Fine-tuning is the de facto way to leverage large pretrained language models to perform downstream tasks.
We propose prefix-tuning, a lightweight alternative to fine-tuning for natural language generation tasks.
We find that by learning only 0.1% of the parameters, prefix-tuning obtains comparable performance in the full data setting.
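A minimal sketch of the core idea, under the simplifying assumption that attention operates on plain Python lists of key/value vectors (the `with_prefix` helper and all shapes are illustrative, not the paper's code): trainable prefix vectors are prepended to each layer's keys and values, so the frozen model attends to them as if they were extra tokens, and only the prefix is stored per task.

```python
# Minimal sketch of prefix-tuning: trainable "virtual token" vectors are
# prepended to the keys and values seen by an attention layer while the
# backbone weights stay frozen. Shapes and names are illustrative.

def with_prefix(prefix_kv, keys, values):
    """Prepend trainable prefix key/value vectors to the real sequence."""
    p_keys, p_values = prefix_kv
    return p_keys + keys, p_values + values

# Frozen activations for a 3-token input (dim = 2).
keys   = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[0.1, 0.1], [0.2, 0.2], [0.3, 0.3]]

# Two trainable prefix positions; only these are stored per task.
prefix = ([[0.5, 0.5], [0.4, 0.4]], [[0.9, 0.9], [0.8, 0.8]])

k, v = with_prefix(prefix, keys, values)
print(len(k), len(v))   # sequence grows by the prefix length
```

The attended sequence grows from 3 to 5 positions, but the per-task footprint is just the prefix vectors, consistent with the 0.1%-of-parameters figure reported above.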
arXiv Detail & Related papers (2021-01-01T08:00:36Z)
- WARP: Word-level Adversarial ReProgramming [13.08689221166729]
In many applications it is preferable to tune much smaller sets of parameters, so that the majority of parameters can be shared across multiple tasks.
We present an alternative approach based on adversarial reprogramming, which extends earlier work on automatic prompt generation.
We show that this approach outperforms other methods with a similar number of trainable parameters on SST-2 and MNLI datasets.
arXiv Detail & Related papers (2021-01-01T00:41:03Z)
- Parameter-Efficient Transfer Learning with Diff Pruning [108.03864629388404]
Diff pruning is a simple approach to enable parameter-efficient transfer learning within the pretrain-finetune framework.
We find that models finetuned with diff pruning can match the performance of fully finetuned baselines on the GLUE benchmark.
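The storage story behind diff pruning can be sketched with toy numbers (the `apply_diff` helper and the values below are hypothetical, not the paper's implementation): each task stores only a sparse "diff" over the frozen pretrained parameters, and the task-specific model is reconstructed by adding the diff back at load time.

```python
# Hedged sketch of diff pruning: per-task storage is a sparse diff
# vector; the shared pretrained parameters stay frozen and are reused
# by every task. Toy values, purely for illustration.

def apply_diff(pretrained, sparse_diff):
    """Reconstruct task parameters from frozen weights + sparse diff."""
    params = list(pretrained)          # copy; never mutate the shared weights
    for idx, delta in sparse_diff.items():
        params[idx] += delta
    return params

pretrained = [1.0, 2.0, 3.0, 4.0, 5.0]   # shared, frozen
sparse_diff = {1: 0.5, 4: -1.0}          # per-task storage: 2 entries

task_params = apply_diff(pretrained, sparse_diff)
print(task_params)
```

Only the two nonzero diff entries need to be stored for this task; a second task would store its own sparse diff against the same frozen base.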
arXiv Detail & Related papers (2020-12-14T12:34:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.