Correct Like Humans: Progressive Learning Framework for Chinese Text Error Correction
- URL: http://arxiv.org/abs/2306.17447v3
- Date: Wed, 20 Mar 2024 15:53:37 GMT
- Title: Correct Like Humans: Progressive Learning Framework for Chinese Text Error Correction
- Authors: Yinghui Li, Shirong Ma, Shaoshen Chen, Haojing Huang, Shulin Huang, Yangning Li, Hai-Tao Zheng, Ying Shen
- Abstract summary: Chinese Text Error Correction (CTEC) aims to detect and correct errors in the input text.
Recent approaches mainly employ Pre-trained Language Models (PLMs) to resolve CTEC.
We propose a novel model-agnostic progressive learning framework, named ProTEC, which guides PLMs-based CTEC models to learn to correct like humans.
- Score: 28.25789161365667
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Chinese Text Error Correction (CTEC) aims to detect and correct errors in the input text, which benefits human daily life and various downstream tasks. Recent approaches mainly employ Pre-trained Language Models (PLMs) to resolve CTEC. Although PLMs have achieved remarkable success in CTEC, we argue that previous studies still overlook the importance of human thinking patterns. To enhance the development of PLMs for CTEC, inspired by humans' daily error-correcting behavior, we propose a novel model-agnostic progressive learning framework, named ProTEC, which guides PLMs-based CTEC models to learn to correct like humans. During the training process, ProTEC guides the model to learn text error correction by incorporating these sub-tasks into a progressive paradigm. During the inference process, the model completes these sub-tasks in turn to generate the correction results. Extensive experiments and detailed analyses demonstrate the effectiveness and efficiency of our proposed model-agnostic ProTEC framework.
Related papers
- Enhancing Training Data Attribution for Large Language Models with Fitting Error Consideration [74.09687562334682]
We introduce a novel training data attribution method called Debias and Denoise Attribution (DDA).
Our method significantly outperforms existing approaches, achieving an averaged AUC of 91.64%.
DDA exhibits strong generality and scalability across various sources and different-scale models like LLaMA2, QWEN2, and Mistral.
arXiv Detail & Related papers (2024-10-02T07:14:26Z)
- HiDe-PET: Continual Learning via Hierarchical Decomposition of Parameter-Efficient Tuning [55.88910947643436]
We propose a unified framework for continual learning (CL) with pre-trained models (PTMs) and parameter-efficient tuning (PET).
We present Hierarchical Decomposition PET (HiDe-PET), an innovative approach that explicitly optimizes the objective by incorporating task-specific and task-shared knowledge.
Our approach demonstrates remarkably superior performance over a broad spectrum of recent strong baselines.
arXiv Detail & Related papers (2024-07-07T01:50:25Z)
- Standardizing Your Training Process for Human Activity Recognition Models: A Comprehensive Review in the Tunable Factors [4.199844472131922]
We provide an exhaustive review of contemporary deep learning research in the field of wearable human activity recognition (WHAR).
Our findings suggest that a major trend is the lack of detail provided by model training protocols.
With insights from the analyses, we define a novel integrated training procedure tailored to the WHAR model.
arXiv Detail & Related papers (2024-01-10T17:45:28Z)
- The Right Prompts for the Job: Repair Code-Review Defects with Large Language Model [15.885824575879763]
Automatic program repair (APR) techniques have the potential to reduce manual efforts in uncovering and repairing program defects during the code review (CR) process.
However, the limited accuracy and considerable time costs associated with existing APR approaches hinder their adoption in industrial practice.
Recent advancements in Large Language Models (LLMs) have enhanced their ability to comprehend natural and programming languages, enabling them to generate patches based on review comments.
arXiv Detail & Related papers (2023-12-29T06:12:15Z)
- Secrets of RLHF in Large Language Models Part I: PPO [81.01936993929127]
Large language models (LLMs) have formulated a blueprint for the advancement of artificial general intelligence.
Reinforcement learning with human feedback (RLHF) emerges as the pivotal technological paradigm underpinning this pursuit.
In this report, we dissect the framework of RLHF, re-evaluate the inner workings of PPO, and explore how the parts comprising PPO algorithms impact policy agent training.
arXiv Detail & Related papers (2023-07-11T01:55:24Z)
- SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities [76.97949110580703]
We introduce SUPERB-SG, a new benchmark to evaluate pre-trained models across various speech tasks.
We use a lightweight methodology to test the robustness of representations learned by pre-trained models under shifts in data domain.
We also show that the task diversity of SUPERB-SG coupled with limited task supervision is an effective recipe for evaluating the generalizability of model representation.
arXiv Detail & Related papers (2022-03-14T04:26:40Z)
- The Past Mistake is the Future Wisdom: Error-driven Contrastive Probability Optimization for Chinese Spell Checking [32.8563506271794]
Chinese Spell Checking (CSC) aims to detect and correct Chinese spelling errors.
Pre-trained language models (PLMs) promote the progress of the CSC task.
We propose an Error-driven COntrastive Probability Optimization framework for the CSC task.
arXiv Detail & Related papers (2022-03-02T09:58:56Z)
- NoiER: An Approach for Training more Reliable Fine-Tuned Downstream Task Models [54.184609286094044]
We propose noise entropy regularisation (NoiER) as an efficient learning paradigm that solves the problem without auxiliary models and additional data.
The proposed approach improved traditional OOD detection evaluation metrics by 55% on average compared to the original fine-tuned models.
arXiv Detail & Related papers (2021-08-29T06:58:28Z)
- Exploration and Exploitation: Two Ways to Improve Chinese Spelling Correction Models [51.744357472072416]
We propose a method, which continually identifies the weak spots of a model to generate more valuable training instances.
Experimental results show that such an adversarial training method combined with the pretraining strategy can improve both the generalization and robustness of multiple CSC models.
arXiv Detail & Related papers (2021-05-31T09:17:33Z)
- Chinese Grammatical Correction Using BERT-based Pre-trained Model [17.847005759631703]
We verify the effectiveness of two methods that incorporate a BERT-based pre-trained model into an encoder-decoder model on Chinese grammatical error correction tasks.
We also analyze the error type and conclude that sentence-level errors are yet to be addressed.
arXiv Detail & Related papers (2020-11-04T01:23:30Z)