A Simple yet Effective Training-free Prompt-free Approach to Chinese Spelling Correction Based on Large Language Models
- URL: http://arxiv.org/abs/2410.04027v1
- Date: Sat, 5 Oct 2024 04:06:56 GMT
- Title: A Simple yet Effective Training-free Prompt-free Approach to Chinese Spelling Correction Based on Large Language Models
- Authors: Houquan Zhou, Zhenghua Li, Bo Zhang, Chen Li, Shaopeng Lai, Ji Zhang, Fei Huang, Min Zhang
- Abstract summary: This work proposes a simple training-free prompt-free approach to leverage large language models (LLMs) for the Chinese spelling correction (CSC) task.
Experiments on five public datasets demonstrate that our approach significantly improves LLM performance.
- Score: 39.35525969831397
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work proposes a simple training-free prompt-free approach to leverage large language models (LLMs) for the Chinese spelling correction (CSC) task, which is totally different from all previous CSC approaches. The key idea is to use an LLM as a pure language model in a conventional manner. The LLM goes through the input sentence from the beginning, and at each inference step, produces a distribution over its vocabulary for deciding the next token, given a partial sentence. To ensure that the output sentence remains faithful to the input sentence, we design a minimal distortion model that utilizes pronunciation or shape similarities between the original and replaced characters. Furthermore, we propose two useful reward strategies to address practical challenges specific to the CSC task. Experiments on five public datasets demonstrate that our approach significantly improves LLM performance, enabling them to compete with state-of-the-art domain-general CSC models.
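To illustrate the decoding idea described in the abstract, below is a minimal, hypothetical Python sketch: the input sentence is traversed character by character, and each candidate next character is scored by the sum of an LLM log-probability and a distortion-model log-probability that favors keeping the original character or replacing it with a phonetically/graphically similar one. The names `lm_next_logprobs`, `distortion_logprob`, and the `SIMILAR` table are illustrative assumptions, not the authors' implementation; a greedy search is used for brevity, and the paper's two reward strategies are not shown.

```python
import math

# Toy similarity table (assumed); a real system would build this from
# pinyin (pronunciation) and glyph-shape resources.
SIMILAR = {
    "在": {"再"},
    "他": {"她", "它"},
}

def distortion_logprob(src_char: str, cand_char: str) -> float:
    """Log-probability that the observed input character `src_char`
    is a (possibly mistyped) rendering of the intended `cand_char`."""
    if cand_char == src_char:
        return math.log(0.97)                      # strongly prefer keeping the input
    if cand_char in SIMILAR.get(src_char, set()):  # similar pronunciation/shape
        return math.log(0.02)
    return float("-inf")                           # forbid unrelated substitutions

def lm_next_logprobs(prefix: str, candidates: set) -> dict:
    """Stub for the language-model call: log P(next char | prefix) for each
    candidate. A real implementation would score candidates with a causal LLM."""
    return {c: math.log(1.0 / len(candidates)) for c in candidates}

def correct(sentence: str) -> str:
    """Greedy left-to-right correction: at each position, choose the candidate
    that maximizes LM log-prob + distortion log-prob."""
    output = []
    for src in sentence:
        candidates = {src} | SIMILAR.get(src, set())
        lm_scores = lm_next_logprobs("".join(output), candidates)
        best = max(candidates,
                   key=lambda c: lm_scores[c] + distortion_logprob(src, c))
        output.append(best)
    return "".join(output)

if __name__ == "__main__":
    # With the uniform LM stub the input is returned unchanged; with a real LLM,
    # "在" could be corrected to "再" in a sentence like "我想在试一次".
    print(correct("我想在试一次"))
```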
Related papers
- Prefix Text as a Yarn: Eliciting Non-English Alignment in Foundation Language Model [50.339632513018934]
Supervised fine-tuning (SFT) has been a straightforward approach for tailoring the output of a foundation large language model (LLM) to specific preferences.
We critically examine this hypothesis within the scope of cross-lingual generation tasks.
We introduce a novel training-free alignment method named PreTTY, which employs minimal task-related prior tokens.
arXiv Detail & Related papers (2024-04-25T17:19:36Z)
- L-TUNING: Synchronized Label Tuning for Prompt and Prefix in LLMs [0.0]
This paper introduces L-Tuning, an efficient fine-tuning approach for classification tasks within the Natural Language Inference (NLI) framework.
L-Tuning focuses on the fine-tuning of label tokens processed through a pre-trained large language model (LLM).
Our experimental results indicate a significant improvement in training efficiency and classification accuracy with L-Tuning compared to traditional approaches.
arXiv Detail & Related papers (2023-12-21T01:47:49Z)
- Bridging the Gap between Language Models and Cross-Lingual Sequence Labeling [101.74165219364264]
Large-scale cross-lingual pre-trained language models (xPLMs) have shown effectiveness in cross-lingual sequence labeling tasks.
Despite the great success, we draw an empirical observation that there is a training objective gap between pre-training and fine-tuning stages.
In this paper, we first design a pre-training task tailored for xSL named Cross-lingual Language Informative Span Masking (CLISM) to eliminate the objective gap.
Second, we present ContrAstive-Consistency Regularization (CACR), which utilizes contrastive learning to encourage consistency between the representations of input parallel sequences.
arXiv Detail & Related papers (2022-04-11T15:55:20Z)
- Auto-MLM: Improved Contrastive Learning for Self-supervised Multi-lingual Knowledge Retrieval [7.73633850933515]
We introduce a joint training method by combining CL and Auto-MLM for self-supervised multi-lingual knowledge retrieval.
Experimental results show that our proposed approach consistently outperforms all previous SOTA methods on both the LAZADA service corpus and openly available corpora in 8 languages.
arXiv Detail & Related papers (2022-03-30T10:13:57Z)
- LICHEE: Improving Language Model Pre-training with Multi-grained Tokenization [19.89228774074371]
We propose a simple yet effective pre-training method named LICHEE to efficiently incorporate multi-grained information of input text.
Our method can be applied to various pre-trained language models and improve their representation capability.
arXiv Detail & Related papers (2021-08-02T12:08:19Z)
- Exploration and Exploitation: Two Ways to Improve Chinese Spelling Correction Models [51.744357472072416]
We propose a method that continually identifies the weak spots of a model to generate more valuable training instances.
Experimental results show that such an adversarial training method combined with the pretraining strategy can improve both the generalization and robustness of multiple CSC models.
arXiv Detail & Related papers (2021-05-31T09:17:33Z)
- COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining [59.169836983883656]
COCO-LM is a new self-supervised learning framework that pretrains Language Models by COrrecting challenging errors and COntrasting text sequences.
COCO-LM employs an auxiliary language model to mask-and-predict tokens in original text sequences.
Our analyses reveal that COCO-LM's advantages come from its challenging training signals, more contextualized token representations, and regularized sequence representations.
arXiv Detail & Related papers (2021-02-16T22:24:29Z)
- SLM: Learning a Discourse Language Representation with Sentence Unshuffling [53.42814722621715]
We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation.
We show that this feature of our model improves the performance of the original BERT by large margins.
arXiv Detail & Related papers (2020-10-30T13:33:41Z)