A Simple yet Effective Training-free Prompt-free Approach to Chinese Spelling Correction Based on Large Language Models
- URL: http://arxiv.org/abs/2410.04027v1
- Date: Sat, 5 Oct 2024 04:06:56 GMT
- Title: A Simple yet Effective Training-free Prompt-free Approach to Chinese Spelling Correction Based on Large Language Models
- Authors: Houquan Zhou, Zhenghua Li, Bo Zhang, Chen Li, Shaopeng Lai, Ji Zhang, Fei Huang, Min Zhang
- Abstract summary: This work proposes a simple training-free prompt-free approach to leverage large language models (LLMs) for the Chinese spelling correction (CSC) task.
Experiments on five public datasets demonstrate that our approach significantly improves LLM performance.
- Score: 39.35525969831397
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work proposes a simple training-free prompt-free approach to leverage large language models (LLMs) for the Chinese spelling correction (CSC) task, which is totally different from all previous CSC approaches. The key idea is to use an LLM as a pure language model in a conventional manner. The LLM goes through the input sentence from the beginning, and at each inference step, produces a distribution over its vocabulary for deciding the next token, given a partial sentence. To ensure that the output sentence remains faithful to the input sentence, we design a minimal distortion model that utilizes pronunciation or shape similarities between the original and replaced characters. Furthermore, we propose two useful reward strategies to address practical challenges specific to the CSC task. Experiments on five public datasets demonstrate that our approach significantly improves LLM performance, enabling them to compete with state-of-the-art domain-general CSC models.
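To illustrate the decoding idea described in the abstract, below is a minimal, hypothetical Python sketch: the input sentence is traversed character by character, and each candidate next character is scored by the sum of an LLM log-probability and a distortion-model log-probability that favors keeping the original character or replacing it with a phonetically/graphically similar one. The names `lm_next_logprobs`, `distortion_logprob`, and the `SIMILAR` table are illustrative assumptions, not the authors' implementation; a greedy search is used for brevity, and the paper's two reward strategies are not shown.

```python
import math

# Toy similarity table (assumed); a real system would build this from
# pinyin (pronunciation) and glyph-shape resources.
SIMILAR = {
    "在": {"再"},
    "他": {"她", "它"},
}

def distortion_logprob(src_char: str, cand_char: str) -> float:
    """Log-probability that the observed input character `src_char`
    is a (possibly mistyped) rendering of the intended `cand_char`."""
    if cand_char == src_char:
        return math.log(0.97)                      # strongly prefer keeping the input
    if cand_char in SIMILAR.get(src_char, set()):  # similar pronunciation/shape
        return math.log(0.02)
    return float("-inf")                           # forbid unrelated substitutions

def lm_next_logprobs(prefix: str, candidates: set) -> dict:
    """Stub for the language-model call: log P(next char | prefix) for each
    candidate. A real implementation would score candidates with a causal LLM."""
    return {c: math.log(1.0 / len(candidates)) for c in candidates}

def correct(sentence: str) -> str:
    """Greedy left-to-right correction: at each position, choose the candidate
    that maximizes LM log-prob + distortion log-prob."""
    output = []
    for src in sentence:
        candidates = {src} | SIMILAR.get(src, set())
        lm_scores = lm_next_logprobs("".join(output), candidates)
        best = max(candidates,
                   key=lambda c: lm_scores[c] + distortion_logprob(src, c))
        output.append(best)
    return "".join(output)

if __name__ == "__main__":
    # With the uniform LM stub the input is returned unchanged; with a real LLM,
    # "在" could be corrected to "再" in a sentence like "我想在试一次".
    print(correct("我想在试一次"))
```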
Related papers
- Prefix Text as a Yarn: Eliciting Non-English Alignment in Foundation Language Model [50.339632513018934]
Supervised fine-tuning (SFT) has been a straightforward approach for tailoring the output of a foundation large language model (LLM) to specific preferences.
We critically examine this hypothesis within the scope of cross-lingual generation tasks.
We introduce a novel training-free alignment method named PreTTY, which employs minimal task-related prior tokens.
arXiv Detail & Related papers (2024-04-25T17:19:36Z)
- L-TUNING: Synchronized Label Tuning for Prompt and Prefix in LLMs [0.0]
This paper introduces L-Tuning, an efficient fine-tuning approach for classification tasks within the Natural Language Inference (NLI) framework.
L-Tuning focuses on the fine-tuning of label tokens processed through a pre-trained large language model (LLM).
Our experimental results indicate a significant improvement in training efficiency and classification accuracy with L-Tuning compared to traditional approaches.
arXiv Detail & Related papers (2023-12-21T01:47:49Z)
- Bridging the Gap between Language Models and Cross-Lingual Sequence Labeling [101.74165219364264]
Large-scale cross-lingual pre-trained language models (xPLMs) have shown effectiveness in cross-lingual sequence labeling tasks.
Despite the great success, we draw an empirical observation that there is a training objective gap between pre-training and fine-tuning stages.
In this paper, we first design a pre-training task tailored for xSL named Cross-lingual Language Informative Span Masking (CLISM) to eliminate the objective gap.
Second, we present ContrAstive-Consistency Regularization (CACR), which utilizes contrastive learning to encourage consistency between the representations of input parallel sequences.
arXiv Detail & Related papers (2022-04-11T15:55:20Z)
- Auto-MLM: Improved Contrastive Learning for Self-supervised Multi-lingual Knowledge Retrieval [7.73633850933515]
We introduce a joint training method by combining CL and Auto-MLM for self-supervised multi-lingual knowledge retrieval.
Experimental results show that our proposed approach consistently outperforms all previous SOTA methods on both the LAZADA service corpus and openly available corpora in 8 languages.
arXiv Detail & Related papers (2022-03-30T10:13:57Z)
- LICHEE: Improving Language Model Pre-training with Multi-grained Tokenization [19.89228774074371]
We propose a simple yet effective pre-training method named LICHEE to efficiently incorporate multi-grained information of input text.
Our method can be applied to various pre-trained language models and improve their representation capability.
arXiv Detail & Related papers (2021-08-02T12:08:19Z)
- Exploration and Exploitation: Two Ways to Improve Chinese Spelling Correction Models [51.744357472072416]
We propose a method that continually identifies the weak spots of a model to generate more valuable training instances.
Experimental results show that such an adversarial training method combined with the pretraining strategy can improve both the generalization and robustness of multiple CSC models.
arXiv Detail & Related papers (2021-05-31T09:17:33Z)
- COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining [59.169836983883656]
COCO-LM is a new self-supervised learning framework that pretrains Language Models by COrrecting challenging errors and COntrasting text sequences.
COCO-LM employs an auxiliary language model to mask-and-predict tokens in original text sequences.
Our analyses reveal that COCO-LM's advantages come from its challenging training signals, more contextualized token representations, and regularized sequence representations.
arXiv Detail & Related papers (2021-02-16T22:24:29Z)
- SLM: Learning a Discourse Language Representation with Sentence Unshuffling [53.42814722621715]
We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation.
We show that this feature of our model improves the performance of the original BERT by large margins.
arXiv Detail & Related papers (2020-10-30T13:33:41Z)