Exploration and Exploitation: Two Ways to Improve Chinese Spelling Correction Models
- URL: http://arxiv.org/abs/2105.14813v2
- Date: Tue, 1 Jun 2021 15:18:14 GMT
- Title: Exploration and Exploitation: Two Ways to Improve Chinese Spelling Correction Models
- Authors: Chong Li, Cenyuan Zhang, Xiaoqing Zheng, Xuanjing Huang
- Abstract summary: We propose a method that continually identifies the weak spots of a model to generate more valuable training instances.
Experimental results show that such an adversarial training method, combined with the pre-training strategy, can improve both the generalization and robustness of multiple CSC models.
- Score: 51.744357472072416
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Sequence-to-sequence learning with neural networks has empirically
proven to be an effective framework for Chinese Spelling Correction (CSC),
which takes a sentence containing spelling errors as input and outputs the
corrected sentence. However, CSC models may fail to correct spelling errors
covered by confusion sets, and will also encounter errors unseen during
training. We propose a method that continually identifies the weak spots of a
model in order to generate more valuable training instances, and we apply a
task-specific pre-training strategy to enhance the model. The generated
adversarial examples are gradually added to the training set. Experimental
results show that such an adversarial training method, combined with the
pre-training strategy, can improve both the generalization and robustness of
multiple CSC models across three different datasets, achieving
state-of-the-art performance on the CSC task.
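The loop the abstract describes (probe the model, harvest its failures, fold them back into training) can be sketched concretely. Below is a minimal, runnable illustration under toy assumptions: `LookupModel`, `perturb`, and the three-entry confusion set are illustrative stand-ins rather than the authors' code, and the task-specific pre-training step is omitted.

```python
import random

# Toy confusion set: characters paired with visually/phonetically similar ones.
CONFUSIONS = {"的": ["地", "得"], "在": ["再"], "他": ["她", "它"]}

class LookupModel:
    """Toy stand-in that memorizes (noisy, gold) pairs; the paper uses
    neural sequence-to-sequence CSC models instead."""
    def __init__(self):
        self.table = {}

    def fit(self, pairs):
        self.table.update(pairs)

    def correct(self, sentence):
        return self.table.get(sentence, sentence)

def perturb(sentence, rng):
    """Corrupt one eligible character with a confusable substitute."""
    positions = [i for i, ch in enumerate(sentence) if ch in CONFUSIONS]
    if not positions:
        return sentence
    i = rng.choice(positions)
    return sentence[:i] + rng.choice(CONFUSIONS[sentence[i]]) + sentence[i + 1:]

def adversarial_training(model, train_pairs, clean_pool, rounds=3, seed=0):
    rng = random.Random(seed)
    for _ in range(rounds):
        model.fit(train_pairs)
        # Probe the current model on freshly corrupted sentences and keep
        # only the failures: these expose the model's weak spots.
        hard = []
        for gold in clean_pool:
            noisy = perturb(gold, rng)
            if model.correct(noisy) != gold:
                hard.append((noisy, gold))
        if not hard:
            break
        train_pairs = train_pairs + hard  # adversarial examples join the training set
    return model
```

In the paper's setting the probe is a neural corrector and the perturbations target spots where it actually fails; the toy loop above only preserves the overall explore-then-exploit structure.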
Related papers
- EdaCSC: Two Easy Data Augmentation Methods for Chinese Spelling Correction [0.0]
Chinese Spelling Correction (CSC) aims to detect and correct spelling errors in Chinese sentences caused by phonetic or visual similarities.
We propose two data augmentation methods to address the limitations of existing approaches.
First, we augment the dataset by splitting long sentences into shorter ones; second, we reduce the number of typos in sentences that contain several.
arXiv Detail & Related papers (2024-09-08T14:29:10Z)
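The two augmentations named in the entry above lend themselves to a short sketch. Everything below is a hedged illustration, not the paper's code: the punctuation-based split rule, the helper names, and the assumption that CSC pairs are character-aligned are all mine.

```python
import random
import re

def split_long(src, tgt, max_len=20):
    """Split a long, character-aligned (noisy, gold) pair at punctuation
    into several shorter pairs."""
    if len(src) <= max_len or len(src) != len(tgt):
        return [(src, tgt)]
    cuts = [m.end() for m in re.finditer("[，。；]", src)] + [len(src)]
    pairs, start = [], 0
    for c in cuts:
        if c > start:
            pairs.append((src[start:c], tgt[start:c]))
        start = c
    return pairs

def reduce_typos(src, tgt, keep=1, seed=0):
    """Keep only `keep` typos in a multi-typo pair, restoring the rest to gold."""
    typo_pos = [i for i, (a, b) in enumerate(zip(src, tgt)) if a != b]
    kept = set(random.Random(seed).sample(typo_pos, min(keep, len(typo_pos))))
    fixed = [a if i in kept else b for i, (a, b) in enumerate(zip(src, tgt))]
    return "".join(fixed), tgt
```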
- Enhancing Robustness of Vision-Language Models through Orthogonality Learning and Self-Regularization [77.62516752323207]
We introduce an orthogonal fine-tuning method for efficiently fine-tuning pretrained weights and enabling enhanced robustness and generalization.
A self-regularization strategy is further exploited to maintain the stability of the VLMs in terms of zero-shot generalization; the resulting method is dubbed OrthSR.
For the first time, we revisit CLIP and CoOp with our method to effectively improve the models in few-shot image classification scenarios.
arXiv Detail & Related papers (2024-07-11T10:35:53Z)
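The orthogonal fine-tuning idea in the entry above can be illustrated with a Cayley-transform parametrization: learn a skew-symmetric matrix, turn it into an orthogonal rotation, and apply it to the frozen pretrained weight so its spectral properties are preserved. This follows the general orthogonal fine-tuning recipe; OrthSR's exact formulation and its self-regularization term are not reproduced here.

```python
import torch

class OrthogonalDelta(torch.nn.Module):
    """Fine-tune by rotating a frozen pretrained weight with an orthogonal
    matrix Q(A), rather than by updating the weight directly."""
    def __init__(self, pretrained_weight):
        super().__init__()
        d = pretrained_weight.shape[0]
        self.register_buffer("W", pretrained_weight)    # frozen pretrained weight
        self.S = torch.nn.Parameter(torch.zeros(d, d))  # trainable seed matrix

    def forward(self, x):
        A = self.S - self.S.T                 # skew-symmetric, so Q is orthogonal
        I = torch.eye(A.shape[0], device=A.device)
        Q = torch.linalg.solve(I + A, I - A)  # Cayley transform
        return x @ (Q @ self.W).T             # rotation preserves the weight's norms
```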
- Chinese Spelling Correction as Rephrasing Language Model [63.65217759957206]
We study Chinese Spelling Correction (CSC), which aims to detect and correct the potential spelling errors in a given sentence.
Current state-of-the-art methods regard CSC as a sequence tagging task and fine-tune BERT-based models on sentence pairs.
We propose Rephrasing Language Model (ReLM), where the model is trained to rephrase the entire sentence by infilling additional slots, instead of character-to-character tagging.
arXiv Detail & Related papers (2023-08-17T06:04:28Z)
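A schematic of the training format the ReLM entry describes: the model sees the noisy sentence followed by one mask slot per target character and learns to infill the full corrected sentence, instead of tagging characters one-to-one. The token names and the None-as-ignored convention below are illustrative assumptions.

```python
def build_relm_example(noisy, gold, mask_token="[MASK]", sep_token="[SEP]"):
    """Source copy, separator, then one mask slot per target character."""
    input_tokens = list(noisy) + [sep_token] + [mask_token] * len(gold)
    # Loss is computed only on the infilled slots, never on the source copy.
    labels = [None] * (len(noisy) + 1) + list(gold)  # None = ignored position
    return input_tokens, labels

tokens, labels = build_relm_example("我很高心", "我很高兴")
# tokens: ['我', '很', '高', '心', '[SEP]', '[MASK]', '[MASK]', '[MASK]', '[MASK]']
# labels: [None, None, None, None, None, '我', '很', '高', '兴']
```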
- SCAT: Robust Self-supervised Contrastive Learning via Adversarial Training for Text Classification [15.932462099791307]
We propose a novel learning framework called SCAT (Self-supervised Contrastive Learning via Adversarial Training).
SCAT modifies random augmentations of the data in a fully label-free manner to generate adversarial examples.
Our results show that SCAT can not only train robust language models from scratch, but it can also significantly improve the robustness of existing pre-trained language models.
arXiv Detail & Related papers (2023-07-04T05:41:31Z)
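A rough sketch of the label-free adversarial step the SCAT entry describes: perturb one augmented view in the direction that maximizes a contrastive loss, then minimize the loss on the perturbed pair. The InfoNCE form, the single-step perturbation, and the encoder interface are simplifying assumptions.

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.1):
    """Standard InfoNCE: matching rows of z1 and z2 are positives."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.T / temperature
    targets = torch.arange(z1.shape[0], device=z1.device)  # positives on diagonal
    return F.cross_entropy(logits, targets)

def scat_step(encoder, view1, view2, eps=1e-2):
    """Perturb view2 in the direction that maximizes the contrastive loss,
    then return the loss on the adversarial pair (no labels needed)."""
    delta = torch.zeros_like(view2, requires_grad=True)
    loss = info_nce(encoder(view1), encoder(view2 + delta))
    grad, = torch.autograd.grad(loss, delta)
    adv = (view2 + eps * grad.sign()).detach()  # label-free adversarial view
    return info_nce(encoder(view1), encoder(adv))
```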
- Type-Driven Multi-Turn Corrections for Grammatical Error Correction [46.34114495164071]
Grammatical Error Correction (GEC) aims to automatically detect and correct grammatical errors.
Previous studies mainly focus on data augmentation approaches to combat exposure bias.
We propose a Type-Driven Multi-Turn Corrections approach for GEC.
arXiv Detail & Related papers (2022-03-17T07:30:05Z)
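The multi-turn idea above reduces to a simple control flow: feed the model's previous output back as the next input, one error type at a time. The loop below is a hypothetical illustration; `correct_one_type` and the type ordering are placeholders, and the paper's actual training scheme is more involved.

```python
def multi_turn_correct(sentence, correct_one_type,
                       error_types=("spelling", "grammar", "word-order")):
    """Hypothetical type-driven multi-turn correction loop."""
    for etype in error_types:
        # The output of turn t becomes the input of turn t + 1.
        sentence = correct_one_type(sentence, etype)
    return sentence
```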
- The Past Mistake is the Future Wisdom: Error-driven Contrastive Probability Optimization for Chinese Spell Checking [32.8563506271794]
Chinese Spell Checking (CSC) aims to detect and correct Chinese spelling errors.
Pre-trained language models (PLMs) have promoted the progress of the CSC task.
We propose an Error-driven COntrastive Probability Optimization (ECOPO) framework for the CSC task.
arXiv Detail & Related papers (2022-03-02T09:58:56Z)
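One way to read "error-driven contrastive probability optimization" is as a margin penalty that fires whenever a character the model previously confused still outscores the gold character. The margin form below is my assumption; the paper's exact objective differs in detail.

```python
import torch
import torch.nn.functional as F

def contrastive_probability_loss(logits, gold_ids, negative_ids, margin=0.1):
    """logits: (batch, vocab). negative_ids: characters the model previously
    and wrongly preferred at each position."""
    probs = F.softmax(logits, dim=-1)
    p_gold = probs.gather(1, gold_ids.unsqueeze(1)).squeeze(1)
    p_neg = probs.gather(1, negative_ids.unsqueeze(1)).squeeze(1)
    # Penalize only when the old mistake still outscores the gold character.
    return F.relu(margin + p_neg - p_gold).mean()
```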
- Unsupervised Class-Incremental Learning Through Confusion [0.4604003661048266]
We introduce a novelty detection method that leverages the network confusion caused by training incoming data as a new class.
We found that incorporating a class imbalance during this detection procedure substantially enhances performance.
arXiv Detail & Related papers (2021-04-09T15:58:43Z)
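A simplified reading of that confusion-based check: briefly train on the incoming batch as a provisional new class, then measure how many predictions it steals from each known class. The threshold, the `train_briefly`/`predict` interface, and the decision rule are all hypothetical placeholders.

```python
def is_novel_class(model, incoming_batch, known_val_sets, threshold=0.3):
    """Train briefly on the incoming batch as a provisional 'candidate' class,
    then check whether it steals predictions from any known class."""
    model.train_briefly(incoming_batch, label="candidate")
    for label, val_set in known_val_sets.items():
        stolen = sum(model.predict(x) == "candidate" for x in val_set) / len(val_set)
        if stolen > threshold:
            return False, label  # heavily confused with an existing class
    return True, None            # low confusion everywhere: likely a new class
```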
- Few-shot Action Recognition with Prototype-centered Attentive Learning [88.10852114988829]
The Prototype-centered Attentive Learning (PAL) model is composed of two novel components.
First, a prototype-centered contrastive learning loss is introduced to complement the conventional query-centered learning objective.
Second, PAL integrates an attentive hybrid learning mechanism that can minimize the negative impact of outliers.
arXiv Detail & Related papers (2021-01-20T11:48:12Z)
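The "prototype-centered" twist can be contrasted with the usual query-centered loss in a few lines: prototypes are support-set class means, and the softmax runs over queries for each prototype rather than over prototypes for each query. Shapes and the temperature below are assumptions, and PAL's attentive component is not shown.

```python
import torch
import torch.nn.functional as F

def prototypes(support_feats, support_labels, n_classes):
    """Mean support embedding per class: (n_classes, dim)."""
    return torch.stack([support_feats[support_labels == c].mean(0)
                        for c in range(n_classes)])

def prototype_centered_loss(protos, query_feats, query_labels, tau=0.1):
    sims = protos @ query_feats.T / tau  # (n_classes, n_queries)
    loss = 0.0
    for c in range(protos.shape[0]):
        pos = query_labels == c
        if pos.any():
            # Each prototype "classifies" the queries: softmax over queries,
            # not the usual softmax over prototypes for each query.
            loss = loss - F.log_softmax(sims[c], dim=0)[pos].mean()
    return loss / protos.shape[0]
```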
- Robust Pre-Training by Adversarial Contrastive Learning [120.33706897927391]
Recent work has shown that, when integrated with adversarial training, self-supervised pre-training can lead to state-of-the-art robustness.
We improve robustness-aware self-supervised pre-training by learning representations consistent under both data augmentations and adversarial perturbations.
arXiv Detail & Related papers (2020-10-26T04:44:43Z)
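The consistency objective this entry describes pairs each input's representation with both a benign augmentation and an adversarial perturbation of itself. The sketch below assumes an encoder module and precomputed adversarial views; the temperature and the equal loss weighting are illustrative choices.

```python
import torch
import torch.nn.functional as F

def dual_consistency_loss(encoder, x_aug1, x_aug2, x_adv, temperature=0.2):
    """Pull together representations of two benign augmentations AND of a
    benign/adversarial pair of the same inputs."""
    z1 = F.normalize(encoder(x_aug1), dim=1)
    z2 = F.normalize(encoder(x_aug2), dim=1)
    za = F.normalize(encoder(x_adv), dim=1)
    targets = torch.arange(z1.shape[0], device=z1.device)
    benign = F.cross_entropy(z1 @ z2.T / temperature, targets)  # aug-aug agreement
    robust = F.cross_entropy(z1 @ za.T / temperature, targets)  # aug-adv agreement
    return benign + robust
```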
- A Hamiltonian Monte Carlo Method for Probabilistic Adversarial Attack and Learning [122.49765136434353]
We present an effective method, called Hamiltonian Monte Carlo with Accumulated Momentum (HMCAM), aiming to generate a sequence of adversarial examples.
We also propose a new generative method called Contrastive Adversarial Training (CAT), which approaches the equilibrium distribution of adversarial examples.
Both quantitative and qualitative analyses on several natural image datasets and practical systems confirm the superiority of the proposed algorithm.
arXiv Detail & Related papers (2020-10-15T16:07:26Z)
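Very roughly, generating a sequence of adversarial examples with accumulated momentum can be pictured as an MCMC-style chain over the epsilon-ball. The sketch below omits HMCAM's acceptance test and exact dynamics; the step sizes and the loss interface are assumptions.

```python
import torch

def adversarial_chain(loss_fn, x, eps=8 / 255, step=2 / 255,
                      n_samples=5, inner=10, gamma=0.9):
    """Draw several adversarial examples from one input by iterating
    momentum-accumulating updates inside the epsilon-ball."""
    samples, adv, momentum = [], x.clone(), torch.zeros_like(x)
    for _ in range(n_samples):
        for _ in range(inner):
            adv = adv.detach().requires_grad_(True)
            grad, = torch.autograd.grad(loss_fn(adv), adv)
            momentum = gamma * momentum + grad.sign()  # accumulated momentum
            adv = (adv + step * momentum.sign()).clamp(x - eps, x + eps)
        samples.append(adv.detach().clone())           # one draw from the chain
    return samples
```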