Sources of Noise in Dialogue and How to Deal with Them
- URL: http://arxiv.org/abs/2212.02745v2
- Date: Sat, 29 Jul 2023 01:52:31 GMT
- Title: Sources of Noise in Dialogue and How to Deal with Them
- Authors: Derek Chen, Zhou Yu
- Abstract summary: Training dialogue systems often entails dealing with noisy training examples and unexpected user inputs.
Despite their prevalence, there is currently no accurate survey of dialogue noise.
This paper addresses this gap by first constructing a taxonomy of noise encountered by dialogue systems.
- Score: 63.02707014103651
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Training dialogue systems often entails dealing with noisy training examples
and unexpected user inputs. Despite their prevalence, there is currently no
accurate survey of dialogue noise, nor a clear sense of the impact of
each noise type on task performance. This paper addresses this gap by first
constructing a taxonomy of noise encountered by dialogue systems. In addition,
we run a series of experiments to show how different models behave when
subjected to varying levels of noise and types of noise. Our results reveal
that models are quite robust to label errors commonly tackled by existing
denoising algorithms, but that performance suffers from dialogue-specific
noise. Driven by these observations, we design a data cleaning algorithm
specialized for conversational settings and apply it as a proof-of-concept for
targeted dialogue denoising.
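The abstract does not detail the cleaning algorithm itself. As a rough, hypothetical illustration of what loss-based targeted denoising can look like for dialogue data, consider the sketch below; the `Turn` fields, the `keep_ratio` parameter, and the loss-ranking heuristic are all assumptions, not the paper's method.

```python
# Hypothetical sketch of loss-based data cleaning for dialogue; the paper's
# actual algorithm is not specified in this abstract.
from dataclasses import dataclass

@dataclass
class Turn:
    utterance: str
    label: str        # e.g., a dialogue state or intent annotation
    loss: float       # per-example loss under a reference model

def clean_dialogue_data(turns, keep_ratio=0.9):
    """Keep the fraction of turns the reference model finds easiest to fit;
    high-loss turns are flagged as likely noise."""
    ranked = sorted(turns, key=lambda t: t.loss)
    cutoff = int(len(ranked) * keep_ratio)
    return ranked[:cutoff], ranked[cutoff:]  # (kept, flagged-as-noisy)

turns = [Turn("book a table", "restaurant", 0.2),
         Turn("uh never mind", "book_flight", 3.1),   # mislabeled turn
         Turn("two people at 7pm", "restaurant", 0.4)]
kept, flagged = clean_dialogue_data(turns, keep_ratio=0.67)
print([t.utterance for t in flagged])  # -> ['uh never mind']
```

Ranking by a reference model's per-example loss is a common generic cleaning heuristic; the paper's dialogue-specialized algorithm presumably targets the dialogue-specific noise types from its taxonomy instead.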
Related papers
- Can Large Audio-Language Models Truly Hear? Tackling Hallucinations with Multi-Task Assessment and Stepwise Audio Reasoning [55.2480439325792]
Large audio-language models (LALMs) have shown impressive capabilities in understanding and reasoning about audio and speech information.
These models still face challenges, including hallucinating non-existent sound events, misidentifying the order of sound events, and incorrectly attributing sound sources.
arXiv Detail & Related papers (2024-10-21T15:55:27Z)
- Noise-BERT: A Unified Perturbation-Robust Framework with Noise Alignment Pre-training for Noisy Slot Filling Task [14.707646721729228]
In a realistic dialogue system, the input information from users is often subject to various types of input perturbations.
We propose Noise-BERT, a unified Perturbation-Robust Framework with Noise Alignment Pre-training.
Our framework incorporates two Noise Alignment Pre-training tasks: Slot Masked Prediction and Sentence Noisiness Discrimination.
arXiv Detail & Related papers (2024-02-22T12:39:50Z)
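The two pre-training tasks Noise-BERT names, Slot Masked Prediction and Sentence Noisiness Discrimination, can be pictured as two auxiliary heads over a shared encoder. A hedged sketch follows; the head shapes, masking scheme, and unweighted loss sum are assumptions:

```python
# Hedged sketch of the two Noise Alignment Pre-training objectives named in
# the abstract; module shapes and loss weighting are assumptions.
import torch
import torch.nn as nn

class NoiseAlignmentHeads(nn.Module):
    def __init__(self, hidden=256, vocab=1000):
        super().__init__()
        self.slot_mlm = nn.Linear(hidden, vocab)   # Slot Masked Prediction head
        self.noisy_cls = nn.Linear(hidden, 2)      # Sentence Noisiness Discrimination head

    def forward(self, token_states, cls_state, slot_targets, slot_mask, noisy_label):
        # Slot Masked Prediction: recover masked slot-value tokens only.
        logits = self.slot_mlm(token_states)                       # (B, T, V)
        mlm_loss = nn.functional.cross_entropy(
            logits[slot_mask], slot_targets[slot_mask])
        # Sentence Noisiness Discrimination: was this input perturbed?
        cls_loss = nn.functional.cross_entropy(self.noisy_cls(cls_state), noisy_label)
        return mlm_loss + cls_loss

B, T, H, V = 4, 12, 256, 1000
slot_mask = torch.zeros(B, T, dtype=torch.bool)
slot_mask[:, 3:5] = True   # pretend positions 3-4 hold masked slot values
heads = NoiseAlignmentHeads(H, V)
loss = heads(torch.randn(B, T, H), torch.randn(B, H),
             torch.randint(V, (B, T)), slot_mask, torch.randint(2, (B,)))
loss.backward()
```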
- Understanding the Effect of Noise in LLM Training Data with Algorithmic Chains of Thought [0.0]
We study how noise in the chain of thought impacts task performance in a highly controlled setting.
We define two types of noise: static noise, a local form of noise which is applied after the CoT trace is computed, and dynamic noise, a global form of noise which propagates errors in the trace as it is computed.
We find fine-tuned models are extremely robust to high levels of static noise but struggle significantly more with lower levels of dynamic noise.
arXiv Detail & Related papers (2024-02-06T13:59:56Z)
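The static/dynamic distinction drawn above is easy to make concrete on a toy running-sum chain of thought; the corruption operators below are illustrative assumptions, not the paper's exact setup:

```python
# Toy illustration of static vs. dynamic noise on a running-sum CoT.
import random

def cot_trace(xs):
    """Chain of thought for summing xs: list of running partial sums."""
    trace, total = [], 0
    for x in xs:
        total += x
        trace.append(total)
    return trace

def static_noise(trace, p=0.3):
    """Local: corrupt individual steps AFTER the trace is computed;
    later steps are untouched, so errors do not propagate."""
    return [t + random.choice([-1, 1]) if random.random() < p else t
            for t in trace]

def dynamic_noise(xs, p=0.3):
    """Global: corrupt the running state AS it is computed;
    one early error shifts every subsequent step."""
    trace, total = [], 0
    for x in xs:
        total += x
        if random.random() < p:
            total += random.choice([-1, 1])
        trace.append(total)
    return trace

random.seed(0)
print(cot_trace([2, 3, 4]))                # [2, 5, 9]
print(static_noise(cot_trace([2, 3, 4])))  # isolated step errors
print(dynamic_noise([2, 3, 4]))            # errors compound downstream
```

With dynamic noise, a single early perturbation shifts every later step, which is consistent with the abstract's finding that it is the harder case.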
- A Unified Framework for Connecting Noise Modeling to Boost Noise Detection [23.366524390302608]
Noisy labels can impair model performance.
Two conventional approaches are noise modeling and noise detection.
We propose an interconnected structure with three crucial blocks: noise modeling, source knowledge identification, and enhanced noise detection.
arXiv Detail & Related papers (2023-11-30T19:24:47Z)
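The abstract only names the three blocks, so the mechanics below are assumptions; one common way to connect noise modeling to noise detection is to estimate a label-noise transition matrix and flag examples whose observed label is improbable:

```python
# Assumed sketch connecting noise modeling to noise detection; the paper's
# three blocks are only named in the abstract above.
import numpy as np

def estimate_transition(probs, given_labels, n_classes):
    """Noise modeling: estimate T[i, j] = P(observed j | true i),
    using confident predictions as a stand-in for true labels."""
    T = np.zeros((n_classes, n_classes))
    true_guess = probs.argmax(1)
    for t, g in zip(true_guess, given_labels):
        T[t, g] += 1
    return T / T.sum(axis=1, keepdims=True).clip(min=1)

def detect_noisy(probs, given_labels, threshold=0.5):
    """Noise detection: flag examples whose observed label has low
    predicted probability."""
    return probs[np.arange(len(given_labels)), given_labels] < threshold

probs = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]])
labels = np.array([0, 1, 1])    # second example likely mislabeled
print(estimate_transition(probs, labels, 2))
print(detect_noisy(probs, labels))  # -> [False  True False]
```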
- Continuous Modeling of the Denoising Process for Speech Enhancement Based on Deep Learning [61.787485727134424]
We use a state variable to indicate the denoising process.
A UNet-like neural network learns to estimate every state variable sampled from the continuous denoising process.
Experimental results indicate that preserving a small amount of noise in the clean target benefits speech enhancement.
arXiv Detail & Related papers (2023-09-17T13:27:11Z)
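A toy reading of the state-variable idea above, including the observation that the target keeps a small amount of residual noise; the interpolation rule and noise floor are assumptions:

```python
# Toy sketch of a continuous denoising state for speech enhancement
# (signal shapes and the interpolation rule are assumptions).
import numpy as np

def state_target(clean, noise, t, floor=0.05):
    """Target at state t in [0, 1]: t=1 is the noisy input, t=0 keeps a
    small residual noise floor rather than the perfectly clean signal."""
    alpha = floor + (1.0 - floor) * t   # never reaches exactly 0
    return clean + alpha * noise

rng = np.random.default_rng(0)
clean = np.sin(np.linspace(0, 2 * np.pi, 16000))   # 1 s toy waveform
noise = 0.3 * rng.standard_normal(16000)
for t in (1.0, 0.5, 0.0):  # a UNet would be trained to predict each state
    print(t, np.abs(state_target(clean, noise, t) - clean).mean().round(4))
```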
- DiffSED: Sound Event Detection with Denoising Diffusion [70.18051526555512]
We reformulate the SED problem by taking a generative learning perspective.
Specifically, we aim to generate sound temporal boundaries from noisy proposals in a denoising diffusion process.
During training, our model learns to reverse the noising process by converting noisy latent queries to the ground-truth versions.
arXiv Detail & Related papers (2023-08-14T17:29:41Z)
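The DiffSED recipe, reduced to its core loop: noise the ground-truth boundaries forward, train a network to reverse the process. The toy MLP below stands in for the paper's query-based decoder and is an assumption:

```python
# Hedged sketch of diffusion-style training on event boundaries; the real
# DiffSED uses latent queries and a transformer decoder, not this toy MLP.
import torch
import torch.nn as nn

denoiser = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(denoiser.parameters(), lr=1e-3)

gt = torch.tensor([[0.10, 0.35], [0.50, 0.80]])  # (onset, offset) in [0, 1]
for step in range(200):
    t = torch.rand(gt.size(0), 1)                  # diffusion timestep
    noisy = gt + t * 0.3 * torch.randn_like(gt)    # forward noising of boundaries
    pred = denoiser(torch.cat([noisy, t], dim=1))  # condition on the timestep
    loss = nn.functional.mse_loss(pred, gt)        # learn to reverse the noising
    opt.zero_grad(); loss.backward(); opt.step()
print(loss.item())
```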
- An Investigation of Noise in Morphological Inflection [21.411766936034]
We investigate the types of noise encountered within a pipeline for truly unsupervised morphological paradigm completion.
We compare the effect of different types of noise on multiple state-of-the-art inflection models.
We propose a novel character-level masked language modeling (CMLM) pretraining objective and explore its impact on the models' resistance to noise.
arXiv Detail & Related papers (2023-05-26T02:14:34Z)
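A character-level masked language modeling (CMLM) corruption step might look like the following; the mask token and masking rate are assumptions, since the abstract only names the objective:

```python
# Sketch of a CMLM-style corruption step over characters.
import random

def cmlm_mask(word, rate=0.25, mask="_"):
    """Mask a fraction of characters; the model must reconstruct them."""
    chars = list(word)
    idx = random.sample(range(len(chars)), max(1, int(len(chars) * rate)))
    targets = {i: chars[i] for i in idx}
    for i in idx:
        chars[i] = mask
    return "".join(chars), targets

random.seed(1)
print(cmlm_mask("geschrieben"))  # masks ~25% of the characters
```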
- Inference and Denoise: Causal Inference-based Neural Speech Enhancement [83.4641575757706]
This study addresses the speech enhancement (SE) task within the causal inference paradigm by modeling the noise presence as an intervention.
The proposed causal inference-based speech enhancement (CISE) separates clean and noisy frames in an intervened noisy speech using a noise detector and assigns both sets of frames to two mask-based enhancement modules (EMs) to perform noise-conditional SE.
arXiv Detail & Related papers (2022-11-02T15:03:50Z)
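The routing logic described for CISE, with placeholder detector and enhancement modules (the paper's are learned, mask-based neural components):

```python
# Toy sketch of noise-conditional frame routing as described for CISE;
# detector and module internals are placeholders, not the paper's architecture.
import numpy as np

def noise_detector(frames, energy_thresh=2.0):
    """Pretend detector: mark frames with high residual energy as noisy."""
    return (frames ** 2).mean(axis=1) > energy_thresh

def enhance(frames, is_noisy):
    """Route each frame to one of two mask-based enhancement modules."""
    out = frames.copy()
    out[is_noisy] *= 0.5    # aggressive mask for noisy frames (placeholder)
    out[~is_noisy] *= 0.95  # light-touch mask for clean frames (placeholder)
    return out

rng = np.random.default_rng(0)
frames = rng.standard_normal((6, 160))   # 6 frames of 160 samples
frames[2] *= 3.0                          # make one frame clearly noisy
is_noisy = noise_detector(frames)
print(is_noisy)                           # -> [False False  True False False False]
print(enhance(frames, is_noisy).shape)
```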
- Adaptive noise imitation for image denoising [58.21456707617451]
We develop a new adaptive noise imitation (ADANI) algorithm that can synthesize noisy data from naturally noisy images.
To produce realistic noise, a noise generator takes unpaired noisy/clean images as input, where the noisy image is a guide for noise generation.
Coupling the noisy data output from ADANI with the corresponding ground-truth, a denoising CNN is then trained in a fully-supervised manner.
arXiv Detail & Related papers (2020-11-30T02:49:36Z)
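The two-stage ADANI recipe as summarized above, in skeleton form; the toy convolutional modules and the additive noise synthesis are assumptions:

```python
# Skeleton of the ADANI-style two-stage recipe; toy modules only, the actual
# generator/discriminator design is not given in this summary.
import torch
import torch.nn as nn

noise_gen = nn.Conv2d(2, 1, 3, padding=1)  # (guide, clean) -> synthetic noise
denoiser = nn.Conv2d(1, 1, 3, padding=1)   # trained fully supervised afterwards

clean = torch.rand(8, 1, 32, 32)
real_noisy = clean + 0.2 * torch.randn_like(clean)  # unpaired guide in practice

# Stage 1: synthesize noisy data from clean images, guided by real noisy images.
fake_noisy = clean + noise_gen(torch.cat([real_noisy, clean], dim=1))

# Stage 2: the synthesized pair (fake_noisy, clean) supervises the denoiser.
loss = nn.functional.mse_loss(denoiser(fake_noisy.detach()), clean)
loss.backward()
print(loss.item())
```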
- Dynamic Layer Customization for Noise Robust Speech Emotion Recognition in Heterogeneous Condition Training [16.807298318504156]
We show that we can improve performance by dynamically routing samples to specialized feature encoders for each noise condition.
We extend these improvements to the multimodal setting by dynamically routing samples to maintain temporal ordering.
arXiv Detail & Related papers (2020-10-21T18:07:32Z)
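The dynamic routing idea above can be sketched as a dictionary of per-condition encoders; the condition labels and encoder shapes below are assumptions:

```python
# Minimal sketch of routing samples to specialized per-noise-condition
# encoders; condition labels and encoder shapes are assumptions.
import torch
import torch.nn as nn

encoders = nn.ModuleDict({
    "clean":   nn.Linear(40, 128),
    "babble":  nn.Linear(40, 128),
    "traffic": nn.Linear(40, 128),
})

def encode(features, condition):
    """Dynamically route a sample to the encoder specialized for its
    detected noise condition."""
    return encoders[condition](features)

x = torch.randn(1, 40)             # one frame of acoustic features
print(encode(x, "babble").shape)   # torch.Size([1, 128])
```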