Exploration and Exploitation: Two Ways to Improve Chinese Spelling
Correction Models
- URL: http://arxiv.org/abs/2105.14813v2
- Date: Tue, 1 Jun 2021 15:18:14 GMT
- Title: Exploration and Exploitation: Two Ways to Improve Chinese Spelling
Correction Models
- Authors: Chong Li, Cenyuan Zhang, Xiaoqing Zheng, Xuanjing Huang
- Abstract summary: We propose a method that continually identifies the weak spots of a model to generate more valuable training instances.
Experimental results show that such an adversarial training method combined with the pretraining strategy can improve both the generalization and robustness of multiple CSC models.
- Score: 51.744357472072416
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Sequence-to-sequence learning with neural networks has empirically proven
to be an effective framework for Chinese Spelling Correction (CSC), which takes
a sentence with some spelling errors as input and outputs the corrected one.
However, CSC models may fail to correct spelling errors covered by the
confusion sets, and will also encounter unseen ones. We propose a method that
continually identifies the weak spots of a model to generate more valuable
training instances, and apply a task-specific pre-training strategy to enhance
the model. The generated adversarial examples are gradually added to the
training set. Experimental results show that such an adversarial training
method combined with the pretraining strategy can improve both the
generalization and robustness of multiple CSC models across three different
datasets, achieving state-of-the-art performance on the CSC task.
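The loop described in the abstract (probe the model for weak spots, turn them into adversarial examples, and add those to the training set gradually) can be sketched roughly as follows. This is a toy illustration, not the authors' implementation: `LookupCorrector`, `find_weak_spots`, `adversarial_training`, the `budget` parameter, and the three-entry `CONFUSION` table are all hypothetical names and data chosen for the example, and the "model" merely memorises sentence pairs.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # (sentence with a spelling error, corrected sentence)

# Hypothetical toy confusion set: characters that are easily mistaken for each other.
CONFUSION: Dict[str, List[str]] = {
    "在": ["再"],
    "再": ["在"],
    "的": ["得", "地"],
}


class LookupCorrector:
    """Toy stand-in for a CSC model: it memorises noisy -> clean sentence pairs."""

    def __init__(self) -> None:
        self.memory: Dict[str, str] = {}

    def fit(self, pairs: List[Pair]) -> None:
        for noisy, clean in pairs:
            self.memory[noisy] = clean

    def correct(self, sentence: str) -> str:
        # An unseen input is returned unchanged, i.e. the error is missed.
        return self.memory.get(sentence, sentence)


def find_weak_spots(model: LookupCorrector, clean_sentences: List[str],
                    confusion: Dict[str, List[str]]) -> List[Pair]:
    """Substitute confusable characters and keep the cases the model fails on."""
    found: List[Pair] = []
    for clean in clean_sentences:
        for i, ch in enumerate(clean):
            for sub in confusion.get(ch, []):
                noisy = clean[:i] + sub + clean[i + 1:]
                if model.correct(noisy) != clean:
                    found.append((noisy, clean))
    return found


def adversarial_training(model: LookupCorrector, train_pairs: List[Pair],
                         clean_sentences: List[str],
                         confusion: Dict[str, List[str]],
                         rounds: int = 5, budget: int = 1) -> List[Pair]:
    """Add at most `budget` adversarial examples per round, then refit."""
    train = list(train_pairs)
    model.fit(train)
    for _ in range(rounds):
        adversarial = find_weak_spots(model, clean_sentences, confusion)
        if not adversarial:
            break  # no remaining weak spots under this confusion set
        train.extend(adversarial[:budget])
        model.fit(train)
    return train


model = LookupCorrector()
final_train = adversarial_training(model, [], ["我的家在这"], CONFUSION)
```

In this sketch the small per-round `budget` stands in for the gradual growth of the training set; a real CSC model would be a neural network, and how the paper selects positions and substitutions is not specified in the abstract.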
Related papers
- Adaptive Weighted Co-Learning for Cross-Domain Few-Shot Learning [23.615250207134004]
Cross-domain few-shot learning (CDFSL) induces a very challenging adaptation problem.
We propose a simple Adaptive Weighted Co-Learning (AWCoL) method to address the CDFSL challenge.
Comprehensive experiments are conducted on multiple benchmark datasets and the empirical results demonstrate that the proposed method produces state-of-the-art CDFSL performance.
arXiv Detail & Related papers (2023-12-06T22:09:52Z)
- Chinese Spelling Correction as Rephrasing Language Model [63.65217759957206]
We study Chinese Spelling Correction (CSC), which aims to detect and correct the potential spelling errors in a given sentence.
Current state-of-the-art methods regard CSC as a sequence tagging task and fine-tune BERT-based models on sentence pairs.
We propose Rephrasing Language Model (ReLM), where the model is trained to rephrase the entire sentence by infilling additional slots, instead of character-to-character tagging.
arXiv Detail & Related papers (2023-08-17T06:04:28Z)
- SCAT: Robust Self-supervised Contrastive Learning via Adversarial Training for Text Classification [15.932462099791307]
We propose a novel learning framework called SCAT (Self-supervised Contrastive Learning via Adversarial Training).
SCAT modifies random augmentations of the data in a fully label-free manner to generate adversarial examples.
Our results show that SCAT can not only train robust language models from scratch, but it can also significantly improve the robustness of existing pre-trained language models.
arXiv Detail & Related papers (2023-07-04T05:41:31Z)
- TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization [89.54947228958494]
This paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks.
We propose a novel statistics-based approach, Two-WIng NormliSation (TWINS) fine-tuning framework.
TWINS is shown to be effective on a wide range of image classification datasets in terms of both generalization and robustness.
arXiv Detail & Related papers (2023-03-20T14:12:55Z)
- Type-Driven Multi-Turn Corrections for Grammatical Error Correction [46.34114495164071]
Grammatical Error Correction (GEC) aims to automatically detect and correct grammatical errors.
Previous studies mainly focus on the data augmentation approach to combat the exposure bias.
We propose a Type-Driven Multi-Turn Corrections approach for GEC.
arXiv Detail & Related papers (2022-03-17T07:30:05Z)
- The Past Mistake is the Future Wisdom: Error-driven Contrastive Probability Optimization for Chinese Spell Checking [32.8563506271794]
Chinese Spell Checking (CSC) aims to detect and correct Chinese spelling errors.
Pre-trained language models (PLMs) have advanced the CSC task.
We propose an Error-driven COntrastive Probability Optimization framework for the CSC task.
arXiv Detail & Related papers (2022-03-02T09:58:56Z)
- Unsupervised Class-Incremental Learning Through Confusion [0.4604003661048266]
We introduce a novelty detection method that leverages network confusion caused by training incoming data as a new class.
We found that incorporating a class imbalance into this detection method substantially enhances performance.
arXiv Detail & Related papers (2021-04-09T15:58:43Z)
- Few-shot Action Recognition with Prototype-centered Attentive Learning [88.10852114988829]
We propose a Prototype-centered Attentive Learning (PAL) model composed of two novel components.
First, a prototype-centered contrastive learning loss is introduced to complement the conventional query-centered learning objective.
Second, PAL integrates an attentive hybrid learning mechanism that can minimize the negative impacts of outliers.
arXiv Detail & Related papers (2021-01-20T11:48:12Z)
- Robust Pre-Training by Adversarial Contrastive Learning [120.33706897927391]
Recent work has shown that, when integrated with adversarial training, self-supervised pre-training can lead to state-of-the-art robustness.
We improve robustness-aware self-supervised pre-training by learning representations consistent under both data augmentations and adversarial perturbations.
arXiv Detail & Related papers (2020-10-26T04:44:43Z)
- A Hamiltonian Monte Carlo Method for Probabilistic Adversarial Attack and Learning [122.49765136434353]
We present an effective method, called Hamiltonian Monte Carlo with Accumulated Momentum (HMCAM), aiming to generate a sequence of adversarial examples.
We also propose a new generative method called Contrastive Adversarial Training (CAT), which approaches the equilibrium distribution of adversarial examples.
Both quantitative and qualitative analyses on several natural image datasets and practical systems have confirmed the superiority of the proposed algorithm.
arXiv Detail & Related papers (2020-10-15T16:07:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.