Parameter-Efficient Low-Resource Dialogue State Tracking by Prompt
Tuning
- URL: http://arxiv.org/abs/2301.10915v2
- Date: Tue, 30 May 2023 00:23:15 GMT
- Title: Parameter-Efficient Low-Resource Dialogue State Tracking by Prompt
Tuning
- Authors: Mingyu Derek Ma, Jiun-Yu Kao, Shuyang Gao, Arpit Gupta, Di Jin,
Tagyoung Chung, Nanyun Peng
- Abstract summary: Dialogue state tracking (DST) is an important step in dialogue management to keep track of users' beliefs.
Existing works fine-tune all language model (LM) parameters to tackle the DST task.
We propose to use soft prompt token embeddings to learn task properties.
- Score: 57.01260458860375
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Dialogue state tracking (DST) is an important step in dialogue management to
keep track of users' beliefs. Existing works fine-tune all language model (LM)
parameters to tackle the DST task, which requires significant data and
computing resources for training and hosting. The cost grows exponentially in
real-world deployment, where dozens of fine-tuned LMs are used for different
domains and tasks. To reduce parameter size and better utilize cross-task
shared information, we propose to use soft prompt token embeddings to learn
task properties. Without tuning LM parameters, our method drastically reduces
the number of parameters needed to less than 0.5% of prior works while
achieving better low-resource DST performance.
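The mechanism described above, freezing all LM parameters and training only a small set of soft prompt token embeddings prepended to the input, can be sketched as follows. This is a minimal illustration assuming a Hugging Face T5 backbone, a 20-token prompt, and a hypothetical helper function; it is not the authors' released implementation.

```python
# Minimal sketch of soft prompt tuning with a frozen LM.
# Assumptions (not from the paper): t5-small backbone, 20 prompt tokens.
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

model_name = "t5-small"          # hypothetical backbone choice
num_prompt_tokens = 20           # hypothetical prompt length

tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

# Freeze every LM parameter: only the soft prompt below is trained.
for param in model.parameters():
    param.requires_grad = False

# Soft prompt: trainable embeddings prepended to the input token embeddings.
embed_dim = model.get_input_embeddings().embedding_dim
soft_prompt = torch.nn.Parameter(torch.randn(num_prompt_tokens, embed_dim) * 0.02)

optimizer = torch.optim.AdamW([soft_prompt], lr=1e-3)

def forward_with_prompt(input_text, target_text):
    enc = tokenizer(input_text, return_tensors="pt")
    labels = tokenizer(target_text, return_tensors="pt").input_ids

    # Embed the input tokens, then prepend the soft prompt embeddings.
    inputs_embeds = model.get_input_embeddings()(enc.input_ids)
    batch_size = inputs_embeds.size(0)
    prompt = soft_prompt.unsqueeze(0).expand(batch_size, -1, -1)
    inputs_embeds = torch.cat([prompt, inputs_embeds], dim=1)

    # Extend the attention mask to cover the prepended prompt tokens.
    prompt_mask = torch.ones(batch_size, num_prompt_tokens,
                             dtype=enc.attention_mask.dtype)
    attention_mask = torch.cat([prompt_mask, enc.attention_mask], dim=1)

    return model(inputs_embeds=inputs_embeds,
                 attention_mask=attention_mask,
                 labels=labels)

# One illustrative DST-style update: predict a slot value from dialogue context.
optimizer.zero_grad()
out = forward_with_prompt(
    "dialogue: I need a cheap hotel in the north. slot: hotel-pricerange",
    "cheap",
)
out.loss.backward()
optimizer.step()
```

Because only `soft_prompt` receives gradients, the trainable parameter count stays in the tens of thousands regardless of backbone size, which mirrors the parameter reduction described above.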
Related papers
- Effectively Prompting Small-sized Language Models for Cross-lingual Tasks via Winning Tickets [2.803947848713182]
Current soft prompt methods yield limited performance when applied to small-sized models.
Deep prompt-tuning entails prepending parameters in each prompt for enhanced efficacy.
We introduce the Lottery Ticket Prompt-learning framework that integrates winning tickets with soft prompts.
arXiv Detail & Related papers (2024-04-01T17:03:16Z)
- ScaLearn: Simple and Highly Parameter-Efficient Task Transfer by Learning to Scale [18.396897413970965]
ScaLearn is a simple and highly parameter-efficient two-stage MTL method.
We show that ScaLearn consistently outperforms strong baselines with a small number of transfer parameters.
arXiv Detail & Related papers (2023-10-02T14:01:36Z)
- Scaled Prompt-Tuning for Few-Shot Natural Language Generation [9.399840807973545]
Large Language Models (LLMs) demonstrate stronger language understanding and generation capabilities.
Memory demand and computation cost of fine-tuning LLMs on downstream tasks are non-negligible.
We propose a Scaled Prompt-Tuning (SPT) method which surpasses conventional PT with better performance and generalization ability.
arXiv Detail & Related papers (2023-09-13T07:12:31Z)
- Task-Optimized Adapters for an End-to-End Task-Oriented Dialogue System [0.0]
We propose an end-to-end TOD system with Task-Optimized Adapters, which learn independently per task, adding only a small number of parameters after fixed layers of the pre-trained network.
Our method is model-agnostic and does not require prompt tuning, using only the input data without a prompt.
arXiv Detail & Related papers (2023-05-04T00:17:49Z)
- Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning [91.5113227694443]
We propose a novel Sensitivity-aware visual Parameter-efficient fine-Tuning (SPT) scheme.
SPT allocates trainable parameters to task-specific important positions.
Experiments on a wide range of downstream recognition tasks show that our SPT is complementary to the existing PEFT methods.
arXiv Detail & Related papers (2023-03-15T12:34:24Z)
- Instance-wise Prompt Tuning for Pretrained Language Models [72.74916121511662]
Instance-wise Prompt Tuning (IPT) is the first prompt learning paradigm that injects knowledge from the input data instances to the prompts.
IPT significantly outperforms task-based prompt learning methods, and achieves comparable performance to conventional finetuning with only 0.5% - 1.5% of tuned parameters.
arXiv Detail & Related papers (2022-06-04T10:08:50Z)
- Good Intentions: Adaptive Parameter Management via Intent Signaling [50.01012642343155]
We propose a novel intent signaling mechanism that integrates naturally into existing machine learning stacks.
We then describe AdaPM, a fully adaptive, zero-tuning parameter manager based on this mechanism.
In our evaluation, AdaPM matched or outperformed state-of-the-art parameter managers out of the box.
arXiv Detail & Related papers (2022-06-01T13:02:19Z)
- Parameter-Efficient Sparsity for Large Language Models Fine-Tuning [63.321205487234074]
We propose a Parameter-efficient Sparse Training (PST) method to reduce the number of trainable parameters during sparse-aware training.
Experiments with diverse networks (i.e., BERT, RoBERTa and GPT-2) demonstrate PST performs on par or better than previous sparsity methods.
arXiv Detail & Related papers (2022-05-23T02:43:45Z)
- Dynamic Parameter Allocation in Parameter Servers [74.250687861348]
We propose to integrate dynamic parameter allocation into parameter servers and describe an efficient implementation of such a parameter server, called Lapse.
We found that Lapse provides near-linear scaling and can be orders of magnitude faster than existing parameter servers.
arXiv Detail & Related papers (2020-02-03T11:37:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.