MetaASSIST: Robust Dialogue State Tracking with Meta Learning
- URL: http://arxiv.org/abs/2210.12397v1
- Date: Sat, 22 Oct 2022 09:14:45 GMT
- Title: MetaASSIST: Robust Dialogue State Tracking with Meta Learning
- Authors: Fanghua Ye, Xi Wang, Jie Huang, Shenghui Li, Samuel Stern, Emine
Yilmaz
- Abstract summary: Existing dialogue datasets contain substantial noise in their state annotations.
A general framework named ASSIST has recently been proposed to train robust dialogue state tracking (DST) models by combining auxiliary-model pseudo labels with the noisy vanilla labels via a fixed weighting parameter.
We propose a meta learning-based framework, MetaASSIST, to adaptively learn this weighting parameter.
- Score: 15.477794753452358
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing dialogue datasets contain lots of noise in their state annotations.
Such noise can hurt model training and ultimately lead to poor generalization
performance. A general framework named ASSIST has recently been proposed to
train robust dialogue state tracking (DST) models. It introduces an auxiliary
model to generate pseudo labels for the noisy training set. These pseudo labels
are combined with vanilla labels by a common fixed weighting parameter to train
the primary DST model. Notwithstanding the improvements of ASSIST on DST,
tuning the weighting parameter is challenging. Moreover, a single parameter
shared by all slots and all instances may be suboptimal. To overcome these
limitations, we propose a meta learning-based framework MetaASSIST to
adaptively learn the weighting parameter. Specifically, we propose three
schemes with varying degrees of flexibility, ranging from slot-wise to both
slot-wise and instance-wise, to convert the weighting parameter into learnable
functions. These functions are trained in a meta-learning manner by taking the
validation set as meta data. Experimental results demonstrate that all three
schemes can achieve competitive performance. Most impressively, we achieve a
state-of-the-art joint goal accuracy of 80.10% on MultiWOZ 2.4.
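The core mechanism in the abstract, combining pseudo-label and vanilla-label losses with a slot-wise weight that is itself trained by meta-gradients on validation ("meta") data, can be illustrated with a short sketch. The following Python/PyTorch code is a minimal, self-contained illustration of that idea only (the slot-wise scheme); it is not the authors' implementation, and the toy per-slot linear classifiers, random data, learning rates, and variable names are all placeholder assumptions.

```python
# Minimal sketch (not the authors' code): slot-wise weights alpha_s combine
# pseudo-label and vanilla-label losses for a toy DST model, and the weights
# are updated by meta-gradients computed on held-out (meta) data.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
NUM_SLOTS, NUM_VALUES, FEAT_DIM = 3, 5, 16   # hypothetical toy sizes

# Toy "DST model": one linear classifier per slot, kept as an explicit tensor
# so the virtual inner update stays differentiable w.r.t. the slot-wise weights.
W = torch.randn(NUM_SLOTS, FEAT_DIM, NUM_VALUES) * 0.1
W.requires_grad_(True)

# Slot-wise weighting logits; sigmoid keeps each alpha_s in (0, 1).
alpha_logits = torch.zeros(NUM_SLOTS, requires_grad=True)
meta_opt = torch.optim.Adam([alpha_logits], lr=1e-2)
model_opt = torch.optim.SGD([W], lr=1e-1)
inner_lr = 1e-1

def slot_losses(weights, feats, labels):
    """Per-slot cross-entropy averaged over the batch; returns a (NUM_SLOTS,) tensor."""
    losses = []
    for s in range(NUM_SLOTS):
        logits = feats @ weights[s]
        losses.append(F.cross_entropy(logits, labels[:, s]))
    return torch.stack(losses)

for step in range(100):
    # Toy batches: noisy training labels, auxiliary-model pseudo labels, clean meta data.
    feats = torch.randn(8, FEAT_DIM)
    vanilla = torch.randint(0, NUM_VALUES, (8, NUM_SLOTS))   # noisy annotations
    pseudo = torch.randint(0, NUM_VALUES, (8, NUM_SLOTS))    # auxiliary-model labels
    meta_feats = torch.randn(8, FEAT_DIM)
    meta_labels = torch.randint(0, NUM_VALUES, (8, NUM_SLOTS))

    alpha = torch.sigmoid(alpha_logits)                      # slot-wise weights

    # Combined training loss: alpha_s on pseudo labels, (1 - alpha_s) on vanilla labels.
    train_loss = (alpha * slot_losses(W, feats, pseudo)
                  + (1 - alpha) * slot_losses(W, feats, vanilla)).mean()

    # Virtual (inner) update of the DST model, kept in the graph so that the
    # meta loss remains differentiable w.r.t. alpha_logits.
    grad_W, = torch.autograd.grad(train_loss, W, create_graph=True)
    W_fast = W - inner_lr * grad_W

    # Meta step: evaluate the virtually updated model on meta data and
    # back-propagate into the weighting parameters only.
    meta_loss = slot_losses(W_fast, meta_feats, meta_labels).mean()
    meta_opt.zero_grad()
    meta_loss.backward()
    meta_opt.step()

    # Actual update of the DST model with the freshly updated (now fixed) weights.
    alpha = torch.sigmoid(alpha_logits).detach()
    real_loss = (alpha * slot_losses(W, feats, pseudo)
                 + (1 - alpha) * slot_losses(W, feats, vanilla)).mean()
    model_opt.zero_grad()
    real_loss.backward()
    model_opt.step()
```

The virtual inner update keeps the computation graph so the meta loss on validation data can be differentiated with respect to the slot-wise weighting logits. The paper's more flexible schemes turn the weighting parameter into learnable functions that also depend on the instance; this sketch shows only fixed per-slot scalars.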
Related papers
- Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts [20.202031878825153]
We propose a novel dynamic data mixture for MoE instruction tuning.
Inspired by MoE's token routing preference, we build dataset-level representations and then capture the subtle differences among datasets.
Results on two MoE models demonstrate the effectiveness of our approach on both downstream knowledge & reasoning tasks and open-ended queries.
arXiv Detail & Related papers (2024-06-17T06:47:03Z)
- LLM-based speaker diarization correction: A generalizable approach [0.0]
We investigate the use of large language models (LLMs) for diarization correction as a post-processing step.
The ability of the models to improve diarization accuracy in a holdout dataset from the Fisher corpus as well as an independent dataset was measured.
arXiv Detail & Related papers (2024-06-07T13:33:22Z)
- Context-Aware Meta-Learning [52.09326317432577]
We propose a meta-learning algorithm that emulates Large Language Models by learning new visual concepts during inference without fine-tuning.
Our approach exceeds or matches the state-of-the-art algorithm, P>M>F, on 8 out of 11 meta-learning benchmarks.
arXiv Detail & Related papers (2023-10-17T03:35:27Z)
- Architecture, Dataset and Model-Scale Agnostic Data-free Meta-Learning [119.70303730341938]
We propose ePisode cUrriculum inveRsion (ECI) during data-free meta training and invErsion calibRation following inner loop (ICFIL) during meta testing.
ECI adaptively increases the difficulty level of pseudo episodes according to the real-time feedback of the meta model.
We formulate the optimization process of meta training with ECI as an adversarial form in an end-to-end manner.
arXiv Detail & Related papers (2023-03-20T15:10:41Z)
- CMW-Net: Learning a Class-Aware Sample Weighting Mapping for Robust Deep Learning [55.733193075728096]
Modern deep neural networks can easily overfit to biased training data containing corrupted labels or class imbalance.
Sample re-weighting methods are popularly used to alleviate this data bias issue.
We propose a meta-model capable of adaptively learning an explicit weighting scheme directly from data.
arXiv Detail & Related papers (2022-02-11T13:49:51Z)
- DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models [152.29364079385635]
As pre-trained models grow bigger, the fine-tuning process can be time-consuming and computationally expensive.
We propose a framework for resource- and parameter-efficient fine-tuning by leveraging the sparsity prior in both weight updates and the final model weights.
Our proposed framework, dubbed Dually Sparsity-Embedded Efficient Tuning (DSEE), aims to achieve two key objectives: (i) parameter efficient fine-tuning and (ii) resource-efficient inference.
arXiv Detail & Related papers (2021-10-30T03:29:47Z)
- Automatic Learning of Subword Dependent Model Scales [50.105894487730545]
We show that the model scales for a combination of attention encoder-decoder acoustic model and language model can be learned as effectively as with manual tuning.
We extend this approach to subword dependent model scales which could not be tuned manually which leads to 7% improvement on LBS and 3% on SWB.
arXiv Detail & Related papers (2021-10-18T13:48:28Z)
- MESA: Boost Ensemble Imbalanced Learning with MEta-SAmpler [30.46938660561697]
We introduce a novel ensemble IL framework named MESA.
It adaptively resamples the training set in iterations to get multiple classifiers and forms a cascade ensemble model.
Unlike prevailing meta-learning-based IL solutions, we decouple the model-training and meta-training in MESA.
arXiv Detail & Related papers (2020-10-17T17:29:27Z)
- A Primal-Dual Subgradient Approach for Fair Meta Learning [23.65344558042896]
Few shot meta-learning is well-known with its fast-adapted capability and accuracy generalization onto unseen tasks.
We propose a Primal-Dual Fair Meta-learning framework, namely PDFM, which learns to train fair machine learning models using only a few examples.
arXiv Detail & Related papers (2020-09-26T19:47:38Z)
- Improving Generalization in Meta-learning via Task Augmentation [69.83677015207527]
We propose two task augmentation methods, including MetaMix and Channel Shuffle.
Both MetaMix and Channel Shuffle outperform state-of-the-art results by a large margin across many datasets.
arXiv Detail & Related papers (2020-07-26T01:50:42Z)