On the (In)Effectiveness of Large Language Models for Chinese Text Correction
- URL: http://arxiv.org/abs/2307.09007v2
- Date: Mon, 11 Dec 2023 12:39:16 GMT
- Title: On the (In)Effectiveness of Large Language Models for Chinese Text Correction
- Authors: Yinghui Li, Haojing Huang, Shirong Ma, Yong Jiang, Yangning Li, Feng Zhou, Hai-Tao Zheng, Qingyu Zhou
- Abstract summary: Large Language Models (LLMs) have amazed the entire Artificial Intelligence community.
This study focuses on Chinese Text Correction, a fundamental and challenging Chinese NLP task.
We empirically find that current LLMs exhibit both impressive performance and unsatisfactory behavior on Chinese Text Correction.
- Score: 44.32102000125604
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, the development and progress of Large Language Models (LLMs) have
amazed the entire Artificial Intelligence community. Benefiting from their
emergent abilities, LLMs have attracted more and more researchers to study
their capabilities and performance on various downstream Natural Language
Processing (NLP) tasks. While marveling at LLMs' impressive performance on all
kinds of tasks, we notice that they also have strong capabilities in processing
languages other than English, such as Chinese. To explore the Chinese processing ability of
LLMs, we focus on Chinese Text Correction, a fundamental and challenging
Chinese NLP task. Specifically, we evaluate various representative LLMs on the
Chinese Grammatical Error Correction (CGEC) and Chinese Spelling Check (CSC)
tasks, which are two main Chinese Text Correction scenarios. Additionally, we
also fine-tune LLMs for Chinese Text Correction to better observe the potential
capabilities of LLMs. From extensive analyses and comparisons with previous
state-of-the-art small models, we empirically find that current LLMs exhibit
both impressive performance and unsatisfactory behavior on Chinese Text
Correction. We believe our findings will promote the adoption and application of
LLMs in the Chinese NLP community.
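To make the evaluation setup concrete, below is a minimal sketch of how an LLM might be prompted for Chinese Spelling Check (CSC) and scored with a sentence-level exact-match metric. It is illustrative only: the prompt wording, the `chat` callable wrapping an LLM API, and the exact-match scoring are assumptions for this sketch, not the paper's actual protocol or code.

```python
# Minimal, illustrative sketch of a prompting-based CSC evaluation loop.
# The prompt text, the `chat` callable, and the exact-match metric are assumptions,
# not the paper's actual evaluation code.
from typing import Callable, List, Tuple

def build_csc_prompt(sentence: str) -> str:
    # Zero-shot instruction: fix misspelled characters only, keep the sentence
    # length unchanged (a common constraint in CSC), and output the corrected sentence.
    return (
        "请纠正下面句子中的拼写错误，只修改错别字，不要增删汉字，"
        "直接输出纠正后的句子。\n句子：" + sentence
    )

def sentence_level_accuracy(
    chat: Callable[[str], str],          # assumed wrapper around an LLM API call
    pairs: List[Tuple[str, str]],        # (erroneous sentence, gold correction)
) -> float:
    # A prediction counts as correct only if it exactly matches the gold reference.
    correct = 0
    for src, gold in pairs:
        pred = chat(build_csc_prompt(src)).strip()
        if pred == gold:
            correct += 1
    return correct / max(len(pairs), 1)

if __name__ == "__main__":
    # Toy example with a stand-in "model" that returns the gold correction.
    data = [("我今天很高心。", "我今天很高兴。")]
    print(sentence_level_accuracy(lambda prompt: "我今天很高兴。", data))
```

Sentence-level exact match is a strict but commonly used way to score correction outputs; a length-preserving prompt helps keep LLM outputs aligned with the character-level annotations typical of CSC benchmarks.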
Related papers
- Getting More from Less: Large Language Models are Good Spontaneous Multilingual Learners [67.85635044939836]
Large Language Models (LLMs) have shown impressive language capabilities.
In this work, we investigate the spontaneous multilingual alignment improvement of LLMs.
We find that LLMs instruction-tuned on question translation data (i.e., without annotated answers) encourage alignment between English and a wide range of languages.
arXiv Detail & Related papers (2024-05-22T16:46:19Z)
- Are LLMs Effective Backbones for Fine-tuning? An Experimental Investigation of Supervised LLMs on Chinese Short Text Matching [12.213307496643376]
We conduct an experimental analysis by fine-tuning LLMs for the task of Chinese short text matching.
We explore various factors that influence performance when fine-tuning LLMs, including task modeling methods, prompt formats, and output formats.
arXiv Detail & Related papers (2024-03-29T02:36:54Z)
- Is Translation All You Need? A Study on Solving Multilingual Tasks with Large Language Models [79.46179534911019]
Large language models (LLMs) have demonstrated multilingual capabilities; yet, they are mostly English-centric due to imbalanced training corpora.
This work extends the evaluation from NLP tasks to real user queries.
For culture-related tasks that need deep language understanding, prompting in the native language tends to be more promising.
arXiv Detail & Related papers (2024-03-15T12:47:39Z)
- CIF-Bench: A Chinese Instruction-Following Benchmark for Evaluating the Generalizability of Large Language Models [53.9835961434552]
We introduce the Chinese Instruction-Following Benchmark (CIF-Bench) to evaluate the generalizability of large language models (LLMs) to the Chinese language.
CIF-Bench comprises 150 tasks and 15,000 input-output pairs, developed by native speakers to test complex reasoning and Chinese cultural nuances.
To mitigate data contamination, we release only half of the dataset publicly, with the remainder kept private, and introduce diversified instructions to minimize score variance.
arXiv Detail & Related papers (2024-02-20T16:02:12Z)
- Rethinking the Roles of Large Language Models in Chinese Grammatical Error Correction [62.409807640887834]
Chinese Grammatical Error Correction (CGEC) aims to correct all potential grammatical errors in the input sentences.
LLMs' performance as correctors on CGEC remains unsatisfactory due to the challenging nature of the task.
We rethink the roles of LLMs in the CGEC task so that they can be better utilized and explored in CGEC.
arXiv Detail & Related papers (2024-02-18T01:40:34Z)
- Are Large Language Models Good Fact Checkers: A Preliminary Study [26.023148371263012]
Large Language Models (LLMs) have drawn significant attention due to their outstanding reasoning capabilities and extensive knowledge repository.
This study aims to comprehensively evaluate various LLMs in tackling specific fact-checking subtasks.
arXiv Detail & Related papers (2023-11-29T05:04:52Z)
- An Empirical Study of Instruction-tuning Large Language Models in Chinese [32.5288378307064]
This paper makes an in-depth empirical study of instruction-tuning LLMs in Chinese, which can serve as a cookbook.
Specifically, we systematically explore the impact of LLM bases, parameter-efficient methods, and instruction data types.
We also conduct experiments to study the impact of other factors, e.g., chain-of-thought data and human-value alignment.
arXiv Detail & Related papers (2023-10-11T09:18:09Z)
- CMMLU: Measuring massive multitask language understanding in Chinese [133.70911295934746]
This paper introduces a comprehensive Chinese benchmark that covers various subjects, including natural science, social sciences, engineering, and humanities.
CMMLU fills the gap in evaluating the knowledge and reasoning capabilities of large language models within the Chinese context.
arXiv Detail & Related papers (2023-06-15T15:49:51Z)
- Don't Trust ChatGPT when Your Question is not in English: A Study of Multilingual Abilities and Types of LLMs [16.770697902481107]
Large Language Models (LLMs) have demonstrated exceptional natural language understanding abilities.
We propose a systematic way of qualifying the performance disparities of LLMs under multilingual settings.
The results show that GPT exhibits highly translation-like behaviour in multilingual settings.
arXiv Detail & Related papers (2023-05-24T02:05:03Z)