MoNET: Tackle State Momentum via Noise-Enhanced Training for Dialogue
State Tracking
- URL: http://arxiv.org/abs/2211.05503v3
- Date: Mon, 19 Jun 2023 03:19:05 GMT
- Title: MoNET: Tackle State Momentum via Noise-Enhanced Training for Dialogue
State Tracking
- Authors: Haoning Zhang, Junwei Bao, Haipeng Sun, Youzheng Wu, Wenye Li,
Shuguang Cui, Xiaodong He
- Abstract summary: Dialogue state tracking (DST) aims to convert the dialogue history into dialogue states which consist of slot-value pairs.
DST models typically adopt the dialogue state from the last turn as input for predicting the current state.
We propose MoNET to tackle state momentum via noise-enhanced training.
- Score: 42.70799541159301
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Dialogue state tracking (DST) aims to convert the dialogue history into
dialogue states, which consist of slot-value pairs. As condensed structured
information memorizing the whole history, the dialogue state from the last turn
is typically adopted as input when DST models predict the current state.
However, these models tend to keep predicted slot values unchanged, a behavior
we define as state momentum in this paper. Specifically, the models struggle to
update slot values that need to be changed and to correct slot values that were
wrongly predicted in the last turn. To this end, we propose MoNET to tackle
state momentum via noise-enhanced training. First, the previous state of each
turn in the training data is noised by replacing some of its slot values. Then,
the noised previous state is used as input to learn to predict the current
state, improving the model's ability to update and correct slot values.
Furthermore, a contrastive context matching framework is designed to narrow the
representation distance between a state and its corresponding noised variant,
which reduces the impact of the noise and helps the model better understand the
dialogue history. Experimental results on the MultiWOZ datasets show that MoNET
outperforms previous DST methods. Ablation studies and analysis verify the
effectiveness of MoNET in alleviating state momentum and improving anti-noise
ability.
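
A minimal sketch of the noising step described in the abstract, assuming a simple dict-based slot-value state and a per-slot candidate value pool (the function name `noise_previous_state`, the candidate pool, and the replacement probability are illustrative stand-ins, not details from the paper):

```python
import random

# Hypothetical candidate values per slot; in practice these would come
# from the dataset ontology (e.g., MultiWOZ slot-value lists).
CANDIDATE_VALUES = {
    "hotel-area": ["north", "south", "centre", "east", "west"],
    "hotel-pricerange": ["cheap", "moderate", "expensive"],
}

def noise_previous_state(prev_state, replace_prob=0.3, rng=random):
    """Replace some slot values in the previous dialogue state.

    Each filled slot is, with probability `replace_prob`, overwritten
    with a different value sampled from that slot's candidate pool.
    During training, the noised state is fed to the DST model in place
    of the clean previous state.
    """
    noised = dict(prev_state)
    for slot, value in prev_state.items():
        candidates = [v for v in CANDIDATE_VALUES.get(slot, []) if v != value]
        if candidates and rng.random() < replace_prob:
            noised[slot] = rng.choice(candidates)
    return noised

# Example: the model must learn to recover "centre" from the dialogue
# context rather than copying the (possibly wrong) input state.
prev = {"hotel-area": "centre", "hotel-pricerange": "cheap"}
print(noise_previous_state(prev))
```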
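The contrastive context matching idea can be sketched in the same spirit: pull the representation of a context paired with the clean state toward the representation of the same context paired with its noised variant, using other batch items as negatives. This is an InfoNCE-style stand-in; the encoder, batch construction, and temperature below are assumptions, not the paper's exact architecture:

```python
import torch
import torch.nn.functional as F

def contrastive_matching_loss(clean_repr, noised_repr, temperature=0.1):
    """InfoNCE-style loss narrowing the distance between each clean-state
    representation and its noised variant.

    clean_repr, noised_repr: [batch, dim] encoder outputs for the same
    dialogue contexts paired with the clean vs. noised previous state.
    Matching rows are positives; all other rows act as in-batch negatives.
    """
    clean = F.normalize(clean_repr, dim=-1)
    noised = F.normalize(noised_repr, dim=-1)
    logits = clean @ noised.t() / temperature        # [batch, batch] similarities
    targets = torch.arange(clean.size(0), device=clean.device)
    return F.cross_entropy(logits, targets)

# Example with random features standing in for encoder outputs.
h_clean = torch.randn(8, 128)
h_noised = h_clean + 0.05 * torch.randn(8, 128)      # nearby noised variant
print(contrastive_matching_loss(h_clean, h_noised).item())
```

In training, a loss like this would be added to the usual state prediction objective so that noised inputs do not distort the model's understanding of the dialogue history.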
Related papers
- Grounding Description-Driven Dialogue State Trackers with
Knowledge-Seeking Turns [54.56871462068126]
Augmenting the training set with human or synthetic schema paraphrases improves model robustness to schema variations, but can be costly or difficult to control.
We propose to circumvent these issues by grounding the state tracking model in knowledge-seeking turns collected from the dialogue corpus as well as the schema.
arXiv Detail & Related papers (2023-09-23T18:33:02Z)
- Dialogue State Distillation Network with Inter-Slot Contrastive Learning for Dialogue State Tracking [25.722458066685046]
Dialogue State Tracking (DST) aims to extract users' intentions from the dialogue history.
Currently, most existing approaches suffer from error propagation and are unable to dynamically select relevant information.
We propose a Dialogue State Distillation Network (DSDN) to utilize relevant information of previous dialogue states.
arXiv Detail & Related papers (2023-02-16T11:05:24Z)
- Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP).
What further sets these models apart is the massive amount of world knowledge they internalize during pretraining.
How a model's world knowledge interacts with the factual information presented in the context remains underexplored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z)
- Generating Coherent Narratives by Learning Dynamic and Discrete Entity States with a Contrastive Framework [68.1678127433077]
We extend the Transformer model to dynamically conduct entity state updates and sentence realization for narrative generation.
Experiments on two narrative datasets show that our model can generate more coherent and diverse narratives than strong baselines.
arXiv Detail & Related papers (2022-08-08T09:02:19Z)
- Value-Consistent Representation Learning for Data-Efficient Reinforcement Learning [105.70602423944148]
We propose a novel method, called value-consistent representation learning (VCR), to learn representations that are directly related to decision-making.
Instead of aligning an imagined future state with the real state returned by the environment, VCR applies a $Q$-value head to both states and obtains two distributions of action values.
The method achieves new state-of-the-art performance among search-free RL algorithms.
arXiv Detail & Related papers (2022-06-25T03:02:25Z)
- Effective Sequence-to-Sequence Dialogue State Tracking [22.606650177804966]
We show that the choice of pre-training objective makes a significant difference to the state tracking quality.
We also explore using Pegasus, a span prediction-based pre-training objective for text summarization, for the state tracking model.
We found that pre-training for the seemingly distant summarization task works surprisingly well for dialogue state tracking.
arXiv Detail & Related papers (2021-08-31T17:27:59Z)
- Oh My Mistake!: Toward Realistic Dialogue State Tracking including Turnback Utterances [1.6099403809839035]
We study whether current benchmark datasets are sufficiently diverse to handle casual conversations in which one changes their mind.
We found that injecting template-based turnback utterances significantly degrades the DST model performance.
We also observed that the performance rebounds when a turnback is appropriately included in the training dataset.
arXiv Detail & Related papers (2021-08-28T12:10:50Z)
- Neural Dialogue State Tracking with Temporally Expressive Networks [40.808421462004866]
Dialogue state tracking (DST) is an important part of a spoken dialogue system.
Existing DST models either ignore temporal feature dependencies across dialogue turns or fail to explicitly model temporal state dependencies in a dialogue.
We propose Temporally Expressive Networks (TEN) to jointly model the two types of temporal dependencies in DST.
arXiv Detail & Related papers (2020-09-16T11:53:00Z)
- Non-Autoregressive Dialog State Tracking [122.2328875457225]
We propose a novel framework, Non-Autoregressive Dialog State Tracking (NADST).
NADST factors in dependencies among domains and slots, optimizing the model to predict dialogue states as a complete set rather than as separate slots.
Our results show that our model achieves the state-of-the-art joint accuracy across all domains on the MultiWOZ 2.1 corpus.
arXiv Detail & Related papers (2020-02-19T06:39:26Z)