Data Augmentation for Voice-Assistant NLU using BERT-based Interchangeable Rephrase
- URL: http://arxiv.org/abs/2104.08268v1
- Date: Fri, 16 Apr 2021 17:53:58 GMT
- Title: Data Augmentation for Voice-Assistant NLU using BERT-based Interchangeable Rephrase
- Authors: Akhila Yerukola, Mason Bretan and Hongxia Jin
- Abstract summary: We introduce a data augmentation technique based on byte pair encoding and a BERT-like self-attention model to boost performance on spoken language understanding tasks.
We show our method performs strongly on domain and intent classification tasks for a voice assistant and in a user-study focused on utterance naturalness and semantic similarity.
- Score: 39.09474362100266
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a data augmentation technique based on byte pair encoding and a
BERT-like self-attention model to boost performance on spoken language
understanding tasks. We compare and evaluate this method with a range of
augmentation techniques encompassing generative models such as VAEs and
performance-boosting techniques such as synonym replacement and
back-translation. We show our method performs strongly on domain and intent
classification tasks for a voice assistant and in a user-study focused on
utterance naturalness and semantic similarity.
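The rephrase model itself is not reproduced here, but the underlying mask-and-fill idea can be sketched with an off-the-shelf BERT checkpoint: mask a word or phrase and let the model propose contextually interchangeable replacements. A minimal sketch, assuming a generic `bert-base-uncased` checkpoint rather than the authors' BPE-based model:
```python
# Minimal sketch of mask-and-fill rephrase augmentation for NLU utterances.
# Assumptions: an off-the-shelf BERT checkpoint stands in for the paper's
# BPE/BERT-like rephrase model, and `target_word` occurs in the utterance.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

def rephrase(utterance: str, target_word: str, top_k: int = 5) -> list[str]:
    """Return augmented utterances with `target_word` swapped for
    contextually interchangeable alternatives proposed by the model."""
    masked = utterance.replace(target_word, fill_mask.tokenizer.mask_token, 1)
    candidates = fill_mask(masked, top_k=top_k)
    # Drop the trivial candidate that just restores the original word.
    return [c["sequence"] for c in candidates
            if c["token_str"].strip() != target_word]

# Example: augment a voice-assistant training utterance.
print(rephrase("set an alarm for seven tomorrow morning", "alarm"))
```
Each returned sequence is a candidate augmented utterance that keeps the original intent while varying the surface form.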
Related papers
- Align-SLM: Textless Spoken Language Models with Reinforcement Learning from AI Feedback [50.84142264245052]
This work introduces the Align-SLM framework to enhance the semantic understanding of textless Spoken Language Models (SLMs).
Our approach generates multiple speech continuations from a given prompt and uses semantic metrics to create preference data for Direct Preference Optimization (DPO).
We evaluate the framework using ZeroSpeech 2021 benchmarks for lexical and syntactic modeling, the spoken version of the StoryCloze dataset for semantic coherence, and other speech generation metrics, including the GPT-4o score and human evaluation.
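As a rough sketch of the preference-data construction described above (the sampler `generate_continuations` and the metric `semantic_score` are hypothetical stand-ins, not the paper's components):
```python
# Minimal sketch: turning scored speech continuations into DPO preference
# pairs. `generate_continuations` and `semantic_score` are hypothetical
# placeholders for the SLM sampler and the semantic metric.
from typing import Callable

def build_dpo_pairs(prompts: list[str],
                    generate_continuations: Callable[[str, int], list[str]],
                    semantic_score: Callable[[str, str], float],
                    n_samples: int = 4) -> list[dict]:
    pairs = []
    for prompt in prompts:
        continuations = generate_continuations(prompt, n_samples)
        ranked = sorted(continuations,
                        key=lambda c: semantic_score(prompt, c),
                        reverse=True)
        # Best-scoring continuation is "chosen", worst is "rejected".
        pairs.append({"prompt": prompt,
                      "chosen": ranked[0],
                      "rejected": ranked[-1]})
    return pairs
```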
arXiv Detail & Related papers (2024-11-04T06:07:53Z)
- DenoSent: A Denoising Objective for Self-Supervised Sentence Representation Learning [59.4644086610381]
We propose a novel denoising objective that approaches representation learning from a complementary angle: the intra-sentence perspective.
By introducing both discrete and continuous noise, we generate noisy sentences and then train our model to restore them to their original form.
Our empirical evaluations demonstrate that this approach delivers competitive results on both semantic textual similarity (STS) and a wide range of transfer tasks.
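A minimal sketch of the two noise types, assuming token deletion as the discrete noise and Gaussian embedding perturbation as the continuous noise (the paper's exact corruptions may differ):
```python
# Minimal sketch of the two noise types: discrete noise (token deletion)
# on the input text and continuous noise (Gaussian perturbation) on its
# embedding. Noise rates are illustrative, not the paper's settings.
import random
import torch

def discrete_noise(tokens: list[str], drop_prob: float = 0.15) -> list[str]:
    """Randomly delete tokens to produce a corrupted sentence."""
    kept = [t for t in tokens if random.random() > drop_prob]
    return kept if kept else tokens  # never return an empty sentence

def continuous_noise(embedding: torch.Tensor, std: float = 0.1) -> torch.Tensor:
    """Add Gaussian noise to a sentence embedding."""
    return embedding + torch.randn_like(embedding) * std
```
The model is then trained to restore the original sentence from these corrupted views.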
arXiv Detail & Related papers (2024-01-24T17:48:45Z)
- Iterative Mask Filling: An Effective Text Augmentation Method Using Masked Language Modeling [0.0]
We propose a novel text augmentation method that leverages the Fill-Mask feature of the transformer-based BERT model.
Our method involves iteratively masking words in a sentence and replacing them with language model predictions.
Experimental results show that our proposed method significantly improves performance, especially on topic classification datasets.
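A minimal sketch of the iterative procedure, assuming a generic `bert-base-uncased` fill-mask pipeline in place of the paper's setup:
```python
# Minimal sketch of iterative mask filling: walk through the sentence,
# mask one word at a time, and substitute the model's top prediction.
# The checkpoint is a placeholder; the paper's model and setup may differ.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")
mask = fill_mask.tokenizer.mask_token

def iterative_mask_fill(sentence: str) -> str:
    words = sentence.split()
    for i in range(len(words)):
        masked = " ".join(words[:i] + [mask] + words[i + 1:])
        # Replace the masked word with the model's best guess.
        words[i] = fill_mask(masked, top_k=1)[0]["token_str"].strip()
    return " ".join(words)

print(iterative_mask_fill("book a table for two tonight"))
```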
arXiv Detail & Related papers (2024-01-03T16:47:13Z)
- SememeASR: Boosting Performance of End-to-End Speech Recognition against Domain and Long-Tailed Data Shift with Sememe Semantic Knowledge [58.979490858061745]
We introduce sememe-based semantic knowledge into speech recognition.
Our experiments show that sememe information can improve the effectiveness of speech recognition.
In addition, our further experiments show that sememe knowledge can improve the model's recognition of long-tailed data.
arXiv Detail & Related papers (2023-09-04T08:35:05Z)
- Improving Self-training for Cross-lingual Named Entity Recognition with Contrastive and Prototype Learning [80.08139343603956]
In cross-lingual named entity recognition, self-training is commonly used to bridge the linguistic gap.
In this work, we aim to improve self-training for cross-lingual NER by combining representation learning and pseudo label refinement.
Our proposed method, ContProto, comprises two components: (1) contrastive self-training and (2) prototype-based pseudo-labeling.
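A minimal sketch of the prototype-based pseudo-labeling component, assuming mean-pooled class prototypes and nearest-prototype reassignment (ContProto's actual procedure may differ in its details):
```python
# Minimal sketch of prototype-based pseudo-label refinement: class
# prototypes are the mean representation of each pseudo-labeled class,
# and noisy pseudo-labels are reassigned to the nearest prototype.
# Assumes every class has at least one pseudo-labeled example.
import torch

def refine_pseudo_labels(features: torch.Tensor,
                         pseudo_labels: torch.Tensor,
                         num_classes: int) -> torch.Tensor:
    # Compute one prototype per class as the mean of its members.
    prototypes = torch.stack([
        features[pseudo_labels == c].mean(dim=0) for c in range(num_classes)
    ])
    # Reassign each example to the class of its nearest prototype.
    distances = torch.cdist(features, prototypes)  # shape (N, num_classes)
    return distances.argmin(dim=1)
```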
arXiv Detail & Related papers (2023-05-23T02:52:16Z)
- Adversarial Speaker Disentanglement Using Unannotated External Data for Self-supervised Representation Based Voice Conversion [35.23123094710891]
We propose a high-similarity any-to-one voice conversion method that takes self-supervised learning (SSL) representations as input.
Experimental results show that our proposed method achieves comparable similarity and higher naturalness than the supervised method.
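The disentanglement mechanism is not spelled out in this summary; a common recipe for adversarial speaker disentanglement is a gradient reversal layer, sketched below as an illustration rather than the paper's exact architecture:
```python
# Minimal sketch of adversarial speaker disentanglement via a gradient
# reversal layer (a common recipe; the paper's design may differ).
import torch

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; negated gradient on the backward
    pass, so the encoder learns features that fool the speaker classifier."""
    @staticmethod
    def forward(ctx, x, lambd: float = 1.0):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

def adversarial_branch(content_features: torch.Tensor,
                       speaker_classifier: torch.nn.Module) -> torch.Tensor:
    # Speaker logits computed on gradient-reversed content features.
    return speaker_classifier(GradReverse.apply(content_features))
```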
arXiv Detail & Related papers (2023-05-16T04:52:29Z)
- Beyond Contrastive Learning: A Variational Generative Model for Multilingual Retrieval [109.62363167257664]
We propose a generative model for learning multilingual text embeddings.
Our model operates on parallel data in N languages.
We evaluate this method on a suite of tasks including semantic similarity, bitext mining, and cross-lingual question retrieval.
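As an illustration of the bitext-mining evaluation, a minimal nearest-neighbor matcher over sentence embeddings (the embeddings are assumed to come from an already-trained model such as the one proposed):
```python
# Minimal sketch of bitext mining: match each source-language sentence
# to its nearest neighbor among target-language embeddings by cosine
# similarity. This illustrates the evaluated task, not the paper's model.
import torch
import torch.nn.functional as F

def mine_bitext(src_emb: torch.Tensor, tgt_emb: torch.Tensor) -> torch.Tensor:
    """Return, for each source sentence, the index of the most similar
    target sentence under cosine similarity."""
    sims = F.normalize(src_emb, dim=1) @ F.normalize(tgt_emb, dim=1).T
    return sims.argmax(dim=1)
```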
arXiv Detail & Related papers (2022-12-21T02:41:40Z)
- Syntax-driven Data Augmentation for Named Entity Recognition [3.0603554929274908]
In low resource settings, data augmentation strategies are commonly leveraged to improve performance.
We compare simple masked language model replacement and an augmentation method using constituency tree mutations to improve named entity recognition.
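A minimal sketch of one constituency-tree mutation, swapping sibling subtrees under the root (the paper's mutation inventory is richer; the bracketed parse below is only illustrative):
```python
# Minimal sketch of a constituency-tree mutation: swap two sibling
# subtrees under the root, yielding a syntactically plausible variant.
import random
from nltk import Tree

def swap_siblings(parse: str) -> str:
    tree = Tree.fromstring(parse)
    if len(tree) >= 2:
        i, j = random.sample(range(len(tree)), 2)
        tree[i], tree[j] = tree[j], tree[i]
    return " ".join(tree.leaves())

print(swap_siblings("(S (PP In 2020) (NP Alice) (VP visited (NP Paris)))"))
```
In an NER setting, the entity annotations attached to the moved constituents would need to travel with them.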
arXiv Detail & Related papers (2022-08-15T01:24:55Z)
- A Comparison of Methods for OOV-word Recognition on a New Public Dataset [0.0]
We propose using the CommonVoice dataset to create test sets for languages with a high out-of-vocabulary ratio.
We then evaluate, within the context of a hybrid ASR system, how much better subword models are at recognizing OOVs.
We propose a new method for modifying a subword-based language model so as to better recognize OOV-words.
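A minimal sketch of the OOV-ratio measurement implied above, with a plain word-level lexicon standing in for the recognizer's vocabulary:
```python
# Minimal sketch: measure a candidate test set's out-of-vocabulary ratio
# against a fixed lexicon. The lexicon is a placeholder for the ASR
# system's actual vocabulary.
def oov_ratio(utterances: list[str], lexicon: set[str]) -> float:
    words = [w for u in utterances for w in u.lower().split()]
    oov = sum(1 for w in words if w not in lexicon)
    return oov / len(words) if words else 0.0
```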
arXiv Detail & Related papers (2021-07-16T19:39:30Z)
- An Effective Contextual Language Modeling Framework for Speech Summarization with Augmented Features [13.97006782398121]
The Bidirectional Encoder Representations from Transformers (BERT) model has achieved record-breaking success on many natural language processing tasks.
We explore incorporating confidence scores into sentence representations to see whether this helps alleviate the negative effects caused by imperfect automatic speech recognition.
We validate the effectiveness of our proposed method on a benchmark dataset.
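A minimal sketch of one way to fold confidence scores into a sentence representation, via a confidence-weighted average of token embeddings (the paper's BERT-based fusion may differ):
```python
# Minimal sketch: confidence-weighted average of token embeddings.
# Assumes confidences are positive ASR scores, one per token.
import torch

def confidence_weighted_embedding(token_embs: torch.Tensor,
                                  confidences: torch.Tensor) -> torch.Tensor:
    """token_embs: (seq_len, dim); confidences: (seq_len,) in [0, 1]."""
    weights = confidences / confidences.sum()
    return (weights.unsqueeze(1) * token_embs).sum(dim=0)
```
Tokens the recognizer is unsure about thus contribute less to the summary representation.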
arXiv Detail & Related papers (2020-06-01T18:27:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.