Syntactic Data Augmentation Increases Robustness to Inference Heuristics
- URL: http://arxiv.org/abs/2004.11999v1
- Date: Fri, 24 Apr 2020 21:35:26 GMT
- Title: Syntactic Data Augmentation Increases Robustness to Inference Heuristics
- Authors: Junghyun Min, R. Thomas McCoy, Dipanjan Das, Emily Pitler, Tal Linzen
- Abstract summary: Pretrained neural models such as BERT show high accuracy on standard datasets, but a surprising lack of sensitivity to word order on controlled challenge sets.
We explore several methods to augment standard training sets with syntactically informative examples, generated by applying syntactic transformations to sentences from the MNLI corpus.
The best-performing augmentation method, subject/object inversion, improved BERT's accuracy on controlled examples that diagnose sensitivity to word order from 0.28 to 0.73, without affecting performance on the MNLI test set.
- Score: 27.513414694720716
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pretrained neural models such as BERT, when fine-tuned to perform natural
language inference (NLI), often show high accuracy on standard datasets, but
display a surprising lack of sensitivity to word order on controlled challenge
sets. We hypothesize that this issue is not primarily caused by the pretrained
model's limitations, but rather by the paucity of crowdsourced NLI examples
that might convey the importance of syntactic structure at the fine-tuning
stage. We explore several methods to augment standard training sets with
syntactically informative examples, generated by applying syntactic
transformations to sentences from the MNLI corpus. The best-performing
augmentation method, subject/object inversion, improved BERT's accuracy on
controlled examples that diagnose sensitivity to word order from 0.28 to 0.73,
without affecting performance on the MNLI test set. This improvement
generalized beyond the particular construction used for data augmentation,
suggesting that augmentation causes BERT to recruit abstract syntactic
representations.
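To make the best-performing transformation concrete, below is a minimal sketch of how subject/object inversion could be implemented with an off-the-shelf dependency parser. The use of spaCy, the subtree-swapping heuristic, the capitalization fix-ups, and the blanket non-entailment label are illustrative assumptions; the paper's actual generation rules for MNLI sentences may differ.

```python
# A minimal sketch of subject/object inversion for NLI data augmentation.
# Assumptions (not from the paper): spaCy's parser identifies the arguments,
# the subject precedes the object, and every inverted pair is labeled
# non-entailment. The paper's actual generation rules may differ.
import spacy

nlp = spacy.load("en_core_web_sm")

def invert_subject_object(sentence):
    """Swap the subject and direct-object noun phrases of a simple
    transitive sentence; return None when the parse doesn't fit."""
    doc = nlp(sentence)
    subj = next((t for t in doc if t.dep_ == "nsubj"), None)
    obj = next((t for t in doc if t.dep_ == "dobj"), None)
    if subj is None or obj is None:
        return None
    # Use full subtrees so multi-word noun phrases move as units.
    s = doc[subj.left_edge.i : subj.right_edge.i + 1]
    o = doc[obj.left_edge.i : obj.right_edge.i + 1]
    if s.end > o.start:  # only handle subject-before-object word order
        return None
    subj_text = s.text
    if s.start == 0:  # the subject leaves sentence-initial position
        subj_text = subj_text[0].lower() + subj_text[1:]
    parts = [doc[: s.start].text, o.text,
             doc[s.end : o.start].text, subj_text, doc[o.end :].text]
    out = " ".join(p for p in parts if p)
    return out[0].upper() + out[1:]

def make_augmented_example(sentence):
    """Pair a corpus sentence with its inverted form. Swapping the
    arguments changes who did what to whom, so the pair is labeled
    non-entailment."""
    inverted = invert_subject_object(sentence)
    if inverted is None:
        return None
    return {"premise": sentence, "hypothesis": inverted,
            "label": "non-entailment"}

print(make_augmented_example("The lawyer saw the doctor"))
# {'premise': 'The lawyer saw the doctor',
#  'hypothesis': 'The doctor saw the lawyer', 'label': 'non-entailment'}
```

In the paper's setup, generated pairs of this kind are appended to the MNLI training data before fine-tuning; the sketch above is one plausible way to produce them, modulo the paper's actual filtering and labeling decisions.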
Related papers
- Contextual Biasing to Improve Domain-specific Custom Vocabulary Audio Transcription without Explicit Fine-Tuning of Whisper Model [0.0]
OpenAI's Whisper Automated Speech Recognition model excels at generalizing across diverse datasets and domains.
We propose a method to enhance transcription accuracy without explicit fine-tuning or altering model parameters.
arXiv Detail & Related papers (2024-10-24T01:58:11Z)
- DASA: Difficulty-Aware Semantic Augmentation for Speaker Verification [55.306583814017046]
We present a novel difficulty-aware semantic augmentation (DASA) approach for speaker verification.
DASA generates diversified training samples in speaker embedding space with negligible extra computing cost.
The best result achieves a 14.6% relative reduction in EER on the CN-Celeb evaluation set.
arXiv Detail & Related papers (2023-10-18T17:07:05Z)
- Implicit Counterfactual Data Augmentation for Robust Learning [24.795542869249154]
This study proposes an implicit counterfactual data augmentation method to remove spurious correlations and make stable predictions.
Experiments have been conducted across various biased learning scenarios covering both image and text datasets.
arXiv Detail & Related papers (2023-04-26T10:36:40Z)
- Towards preserving word order importance through Forced Invalidation [80.33036864442182]
We show that pre-trained language models are insensitive to word order.
We propose Forced Invalidation to help preserve the importance of word order.
Our experiments demonstrate that Forced Invalidation significantly improves the sensitivity of the models to word order.
arXiv Detail & Related papers (2023-04-11T13:42:10Z)
- Prompting to Distill: Boosting Data-Free Knowledge Distillation via Reinforced Prompt [52.6946016535059]
Data-free knowledge distillation (DFKD) performs knowledge distillation without relying on the original training data.
We propose a prompt-based method, termed PromptDFD, that allows us to take advantage of learned language priors.
As shown in our experiments, the proposed method substantially improves synthesis quality and yields considerable gains in distillation performance.
arXiv Detail & Related papers (2022-05-16T08:56:53Z)
- SDA: Improving Text Generation with Self Data Augmentation [88.24594090105899]
We propose to improve the standard maximum likelihood estimation (MLE) paradigm by incorporating a self-imitation-learning phase for automatic data augmentation.
Unlike most existing sentence-level augmentation strategies, our method is more general and can easily be adapted to any MLE-based training procedure.
arXiv Detail & Related papers (2021-01-02T01:15:57Z)
- CAPT: Contrastive Pre-Training for Learning Denoised Sequence Representations [42.86803751871867]
We present ContrAstive Pre-Training (CAPT) to learn noise-invariant sequence representations.
CAPT encourages consistency between the representations of the original sequence and its corrupted version via unsupervised instance-wise training signals (a toy sketch of such a consistency objective appears after this list).
arXiv Detail & Related papers (2020-10-13T13:08:34Z)
- Syntactic Structure Distillation Pretraining For Bidirectional Encoders [49.483357228441434]
We introduce a knowledge distillation strategy for injecting syntactic biases into BERT pretraining.
We distill the approximate marginal distribution over words in context from the syntactic LM.
Our findings demonstrate the benefits of syntactic biases, even in representation learners that exploit large amounts of data.
arXiv Detail & Related papers (2020-05-27T16:44:01Z)
- Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior [53.69310441063162]
This paper proposes a sequential prior in a discrete latent space which can generate more natural-sounding samples.
We evaluate the approach using listening tests, objective metrics of automatic speech recognition (ASR) performance, and measurements of prosody attributes.
arXiv Detail & Related papers (2020-02-06T12:35:50Z)
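Of the entries above, CAPT's consistency objective is concrete enough to illustrate. The sketch below is a generic InfoNCE-style toy in PyTorch, with an arbitrary temperature and random tensors standing in for encoder outputs; it is an assumption about the general technique, not CAPT's actual implementation.

```python
# A toy sketch of a CAPT-style consistency objective: pull together the
# representations of each sequence and its corrupted copy, and push apart
# the other pairs in the batch (InfoNCE). The encoder, corruption scheme,
# and temperature are illustrative assumptions, not the paper's setup.
import torch
import torch.nn.functional as F

def contrastive_consistency_loss(z_orig, z_corr, temperature=0.1):
    """z_orig, z_corr: (batch, dim) embeddings of original sequences and
    their corrupted versions; matching rows are the positive pairs."""
    z_orig = F.normalize(z_orig, dim=-1)
    z_corr = F.normalize(z_corr, dim=-1)
    logits = z_orig @ z_corr.T / temperature   # (batch, batch) similarities
    targets = torch.arange(z_orig.size(0))     # i-th original matches i-th corrupted
    return F.cross_entropy(logits, targets)

# Example: a batch of 8 sequence embeddings of width 128.
loss = contrastive_consistency_loss(torch.randn(8, 128), torch.randn(8, 128))
```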
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.