Beyond Hard Samples: Robust and Effective Grammatical Error Correction
with Cycle Self-Augmenting
- URL: http://arxiv.org/abs/2310.13321v2
- Date: Mon, 23 Oct 2023 07:41:09 GMT
- Title: Beyond Hard Samples: Robust and Effective Grammatical Error Correction
with Cycle Self-Augmenting
- Authors: Zecheng Tang, Kaifeng Qi, Juntao Li, Min Zhang
- Abstract summary: We propose a Cycle Self-Augmenting (CSA) method to enhance the robustness of grammatical error correction (GEC) models against adversarial attacks.
By leveraging the augmenting data from the GEC models themselves in the post-training process and introducing regularization data for cycle training, our proposed method can effectively improve the model robustness of well-trained GEC models.
Experiments on four benchmark datasets and seven strong models indicate that our proposed training method can significantly enhance robustness against four types of attacks without using purposely built adversarial examples in training.
- Score: 28.84445227362245
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent studies have revealed that grammatical error correction methods in the
sequence-to-sequence paradigm are vulnerable to adversarial attack, and simply
utilizing adversarial examples in the pre-training or post-training process can
significantly enhance the robustness of GEC models to certain types of attack
without suffering too much performance loss on clean data. In this paper, we
further conduct a thorough robustness evaluation of cutting-edge GEC methods
for four different types of adversarial attacks and propose a simple yet very
effective Cycle Self-Augmenting (CSA) method accordingly. By leveraging the
augmenting data from the GEC models themselves in the post-training process and
introducing regularization data for cycle training, our proposed method can
effectively improve the model robustness of well-trained GEC models with only a
few more training epochs as an extra cost. More concretely, further training on
the regularization data can prevent the GEC models from over-fitting on
easy-to-learn samples and thus can improve the generalization capability and
robustness towards unseen data (adversarial noise/samples). Meanwhile, the
self-augmented data can provide more high-quality pseudo pairs to improve model
performance on the original testing data. Experiments on four benchmark
datasets and seven strong models indicate that our proposed training method can
significantly enhance robustness against four types of attacks without using
purposely built adversarial examples in training. Evaluation results on clean
data further confirm that our proposed CSA method significantly improves the
performance of four baselines and yields nearly comparable results with other
state-of-the-art models. Our code is available at
https://github.com/ZetangForward/CSA-GEC.
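
The cycle described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see their repository for that); it is a toy under stated assumptions. The "GEC model" here is a hypothetical lookup table, `fine_tune` simply memorises pairs, and "regularization data" is assumed to mean augmented pairs that persist from one cycle to the next. All function names are illustrative.

```python
from typing import Dict, List, Set, Tuple

Pair = Tuple[str, str]   # (source sentence, gold correction)
Model = Dict[str, str]   # ASSUMPTION: toy stand-in for a seq2seq GEC model


def correct(model: Model, sentence: str) -> str:
    """Toy inference: return the memorised correction, else echo the input."""
    return model.get(sentence, sentence)


def fine_tune(model: Model, pairs: List[Pair]) -> Model:
    """Toy training step: memorise every (source, target) pair."""
    updated = dict(model)
    updated.update(pairs)
    return updated


def self_augment(model: Model, train_pairs: List[Pair]) -> List[Pair]:
    """Pair each wrong model output with the gold target of its source."""
    augmented = []
    for source, gold in train_pairs:
        hypothesis = correct(model, source)
        if hypothesis != gold:
            # The model's own mistake becomes a new pseudo source sentence.
            augmented.append((hypothesis, gold))
    return augmented


def cycle_self_augment(
    model: Model, train_pairs: List[Pair], num_cycles: int = 2
) -> Tuple[Model, List[Pair]]:
    """Alternate self-augmentation and post-training for num_cycles rounds."""
    previous: Set[Pair] = set()
    regularization: List[Pair] = []
    for _ in range(num_cycles):
        augmented = self_augment(model, train_pairs)
        # ASSUMPTION: regularization data = pairs that survive across cycles.
        regularization = [pair for pair in augmented if pair in previous]
        model = fine_tune(model, train_pairs + augmented)
        previous = set(augmented)
    if regularization:
        # Extra pass on the persistent pairs only.
        model = fine_tune(model, regularization)
    return model, regularization


train = [
    ("She go home .", "She goes home ."),
    ("I has a pen .", "I have a pen ."),
]
# Start from a "well-trained" model that still makes one mistake.
initial_model = {"She go home .": "She gone home ."}
final_model, reg_data = cycle_self_augment(initial_model, train)
print(correct(final_model, "She gone home ."))
```

In this toy run the model's own wrong output ("She gone home .") is turned into a training source, so the post-trained model can also repair that unseen variant. A real setup would replace the lookup table with a neural model and a proper training loop.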
Related papers
- PairCFR: Enhancing Model Training on Paired Counterfactually Augmented Data through Contrastive Learning [49.60634126342945]
Counterfactually Augmented Data (CAD) involves creating new data samples by applying minimal yet sufficient modifications to flip the label of existing data samples to other classes.
Recent research reveals that training with CAD may lead models to overly focus on modified features while ignoring other important contextual information.
We employ contrastive learning to promote global feature alignment in addition to learning counterfactual clues.
arXiv Detail & Related papers (2024-06-09T07:29:55Z)
- Improving Data Augmentation for Robust Visual Question Answering with Effective Curriculum Learning [12.647353699551081]
We design an Effective Curriculum Learning strategy ECL to enhance DA-based VQA methods.
ECL trains VQA models on relatively "easy" samples first, then gradually shifts to "harder" samples, while less-valuable samples are dynamically removed.
Compared to training on the entire augmented dataset, our ECL strategy can further enhance VQA models' performance with fewer training samples.
arXiv Detail & Related papers (2024-01-28T12:48:16Z)
- Efficient Grammatical Error Correction Via Multi-Task Training and Optimized Training Schedule [55.08778142798106]
We propose auxiliary tasks that exploit the alignment between the original and corrected sentences.
We formulate each task as a sequence-to-sequence problem and perform multi-task training.
We find that the order of datasets used for training and even individual instances within a dataset may have important effects on the final performance.
arXiv Detail & Related papers (2023-11-20T14:50:12Z)
- Boosting Model Inversion Attacks with Adversarial Examples [26.904051413441316]
We propose a new training paradigm for a learning-based model inversion attack that can achieve higher attack accuracy in a black-box setting.
First, we regularize the training process of the attack model with an added semantic loss function.
Second, we inject adversarial examples into the training data to increase the diversity of the class-related parts.
arXiv Detail & Related papers (2023-06-24T13:40:58Z)
- ESimCSE Unsupervised Contrastive Learning Jointly with UDA Semi-Supervised Learning for Large Label System Text Classification Mode [4.708633772366381]
The ESimCSE model efficiently learns text vector representations using unlabeled data to achieve better classification results.
UDA is trained on unlabeled data through semi-supervised learning to improve the prediction performance and stability of the models.
The adversarial training techniques FGM and PGD are used in the model training process to improve the robustness and reliability of the model.
arXiv Detail & Related papers (2023-04-19T03:44:23Z)
- Towards Robust Recommender Systems via Triple Cooperative Defense [63.64651805384898]
Recommender systems are often susceptible to well-crafted fake profiles, leading to biased recommendations.
We propose a general framework, Triple Cooperative Defense, which cooperates to improve model robustness through the co-training of three models.
Results show that the robustness improvement of TCD significantly outperforms baselines.
arXiv Detail & Related papers (2022-10-25T04:45:43Z)
- Exploration and Exploitation: Two Ways to Improve Chinese Spelling Correction Models [51.744357472072416]
We propose a method, which continually identifies the weak spots of a model to generate more valuable training instances.
Experimental results show that such an adversarial training method combined with the pretraining strategy can improve both the generalization and robustness of multiple CSC models.
arXiv Detail & Related papers (2021-05-31T09:17:33Z)
- A Self-Refinement Strategy for Noise Reduction in Grammatical Error Correction [54.569707226277735]
Existing approaches for grammatical error correction (GEC) rely on supervised learning with manually created GEC datasets.
There is a non-negligible amount of "noise" where errors were inappropriately edited or left uncorrected.
We propose a self-refinement method where the key idea is to denoise these datasets by leveraging the prediction consistency of existing models.
arXiv Detail & Related papers (2020-10-07T04:45:09Z)
- Data Weighted Training Strategies for Grammatical Error Correction [8.370770440898454]
We show how to incorporate delta-log-perplexity, a type of example scoring, into a training schedule for Grammatical Error Correction (GEC).
Models trained on scored data achieve state-of-the-art results on common GEC test sets.
arXiv Detail & Related papers (2020-08-07T03:30:14Z)
- Generative Data Augmentation for Commonsense Reasoning [75.26876609249197]
G-DAUG^C is a novel generative data augmentation method that aims to achieve more accurate and robust learning in the low-resource setting.
G-DAUG^C consistently outperforms existing data augmentation methods based on back-translation.
Our analysis demonstrates that G-DAUG^C produces a diverse set of fluent training examples, and that its selection and training approaches are important for performance.
arXiv Detail & Related papers (2020-04-24T06:12:10Z)