Learning from the Dictionary: Heterogeneous Knowledge Guided Fine-tuning
for Chinese Spell Checking
- URL: http://arxiv.org/abs/2210.10320v1
- Date: Wed, 19 Oct 2022 06:31:34 GMT
- Title: Learning from the Dictionary: Heterogeneous Knowledge Guided Fine-tuning
for Chinese Spell Checking
- Authors: Yinghui Li, Shirong Ma, Qingyu Zhou, Zhongli Li, Yangning Li,
Shulin Huang, Ruiyang Liu, Chao Li, Yunbo Cao and Haitao Zheng
- Abstract summary: Chinese Spell Checking (CSC) aims to detect and correct Chinese spelling errors.
Recent research starts from the pretrained knowledge of language models and incorporates multimodal information into CSC models to improve performance.
We propose the LEAD framework, which enables the CSC model to learn heterogeneous knowledge from the dictionary in terms of phonetics, vision, and meaning.
- Score: 32.16787396943434
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Chinese Spell Checking (CSC) aims to detect and correct Chinese spelling
errors. Recent research starts from the pretrained knowledge of language
models and incorporates multimodal information into CSC models to improve
performance. However, such work overlooks the rich knowledge in the dictionary,
the reference book where one can learn how a character should be pronounced,
written, and used. In this paper, we propose the LEAD framework, which enables
the CSC model to learn heterogeneous knowledge from the dictionary in terms of
phonetics, vision, and meaning. LEAD first constructs positive and negative
samples according to the knowledge of character phonetics, glyphs, and
definitions in the dictionary. Then a unified contrastive learning-based
training scheme is employed to refine the representations of the CSC models.
Extensive experiments and detailed analyses on the SIGHAN benchmark datasets
demonstrate the effectiveness of our proposed methods.
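The abstract describes, but does not show, the unified contrastive training scheme. A minimal sketch of such an objective is given below, assuming precomputed character representations from a CSC encoder and dictionary-derived positives and negatives; the sample construction and temperature are illustrative assumptions, not the paper's exact method.

```python
# Minimal sketch of a contrastive objective over dictionary-derived samples.
# Assumes precomputed representations; positives/negatives would come from
# dictionary phonetics, glyphs, and definitions as the abstract describes.
import torch
import torch.nn.functional as F

def dictionary_contrastive_loss(anchor, positives, negatives, temperature=0.07):
    """InfoNCE-style loss.
    anchor:    (B, H)    representations of the characters being checked
    positives: (B, H)    dictionary-derived positives (shared pronunciation/glyph/definition)
    negatives: (B, K, H) dictionary-derived negatives (confusable characters)
    """
    anchor = F.normalize(anchor, dim=-1)
    positives = F.normalize(positives, dim=-1)
    negatives = F.normalize(negatives, dim=-1)
    pos_sim = (anchor * positives).sum(-1, keepdim=True)          # (B, 1)
    neg_sim = torch.einsum("bh,bkh->bk", anchor, negatives)       # (B, K)
    logits = torch.cat([pos_sim, neg_sim], dim=-1) / temperature  # (B, 1+K)
    labels = torch.zeros(logits.size(0), dtype=torch.long)        # positive sits at index 0
    return F.cross_entropy(logits, labels)

# Toy usage with random tensors standing in for encoder outputs.
loss = dictionary_contrastive_loss(torch.randn(4, 128), torch.randn(4, 128), torch.randn(4, 8, 128))
```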
Related papers
- Why do you cite? An investigation on citation intents and decision-making classification processes [1.7812428873698407]
This study emphasizes the importance of reliably classifying citation intents.
We present a study utilizing advanced Ensemble Strategies for Citation Intent Classification (CIC).
One of our models sets a new state-of-the-art (SOTA) with an 89.46% Macro-F1 score on the SciCite benchmark.
arXiv Detail & Related papers (2024-07-18T09:29:33Z)
- Unifying Structure and Language Semantic for Efficient Contrastive Knowledge Graph Completion with Structured Entity Anchors [0.3913403111891026]
The goal of knowledge graph completion (KGC) is to predict missing links in a KG using trained facts that are already known.
We propose a novel method to effectively unify structure information and language semantics without losing the power of inductive reasoning.
arXiv Detail & Related papers (2023-11-07T11:17:55Z)
- SILC: Improving Vision Language Pretraining with Self-Distillation [113.50400246862056]
We introduce SILC, a novel framework for vision language pretraining.
SILC improves image-text contrastive learning with the simple addition of local-to-global correspondence learning by self-distillation.
We show that distilling local image features from an exponential moving average (EMA) teacher model significantly improves model performance on dense prediction tasks like detection and segmentation.
arXiv Detail & Related papers (2023-10-20T08:44:47Z)
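The self-distillation mentioned above relies on an exponential moving average (EMA) teacher. The snippet below is a generic sketch of that update rule, not SILC's released code; the tiny linear module and the momentum value are stand-ins.

```python
# Generic EMA-teacher update used in self-distillation setups; the encoder
# stand-in (a small Linear layer) and the momentum value are assumptions.
import copy
import torch

@torch.no_grad()
def ema_update(teacher, student, momentum=0.999):
    """Move each teacher parameter a small step toward the student parameter."""
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(momentum).add_(s_param, alpha=1.0 - momentum)

student = torch.nn.Linear(16, 16)   # stands in for the trainable image encoder
teacher = copy.deepcopy(student)    # teacher starts as a frozen copy of the student
for p in teacher.parameters():
    p.requires_grad_(False)

# Call after every optimizer step on the student to refresh the teacher targets.
ema_update(teacher, student)
```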
- Chinese Spelling Correction as Rephrasing Language Model [63.65217759957206]
We study Chinese Spelling Correction (CSC), which aims to detect and correct potential spelling errors in a given sentence.
Current state-of-the-art methods regard CSC as a sequence tagging task and fine-tune BERT-based models on sentence pairs.
We propose Rephrasing Language Model (ReLM), where the model is trained to rephrase the entire sentence by infilling additional slots, instead of character-to-character tagging.
arXiv Detail & Related papers (2023-08-17T06:04:28Z)
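To make the rephrasing idea concrete, the sketch below lays out one possible training example: the source sentence is followed by one mask slot per target character, and supervision is applied only to those slots. The separator, ignore index, and equal-length assumption are illustrative choices, not details taken from the paper.

```python
# Hypothetical layout of a rephrasing-style training example: the model sees the
# source sentence plus empty slots, and is trained to infill the corrected text.
MASK = "[MASK]"
SEP = "[SEP]"
IGNORE = -100  # label value typically skipped by the loss (shown here next to raw characters)

def build_rephrasing_example(source_chars, target_chars):
    """source_chars / target_chars: equal-length lists of characters (token ids in practice)."""
    assert len(source_chars) == len(target_chars)
    inputs = list(source_chars) + [SEP] + [MASK] * len(target_chars)
    # Only the infilled slots carry supervision; the source span is ignored.
    labels = [IGNORE] * (len(source_chars) + 1) + list(target_chars)
    return inputs, labels

inputs, labels = build_rephrasing_example(list("我很高心"), list("我很高兴"))
```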
- Commonsense Knowledge Transfer for Pre-trained Language Models [83.01121484432801]
We introduce commonsense knowledge transfer, a framework to transfer the commonsense knowledge stored in a neural commonsense knowledge model to a general-purpose pre-trained language model.
It first exploits general texts to form queries for extracting commonsense knowledge from the neural commonsense knowledge model.
It then refines the language model with two self-supervised objectives: commonsense mask infilling and commonsense relation prediction.
arXiv Detail & Related papers (2023-06-04T15:44:51Z)
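A minimal sketch of how the two objectives named above could be combined into a single refinement loss follows; the weighted sum and the specific loss forms are assumptions rather than the paper's exact formulation.

```python
# Sketch: combine commonsense mask infilling and commonsense relation prediction
# into one refinement loss. Shapes and the weighting are illustrative.
import torch
import torch.nn.functional as F

def refinement_loss(infill_logits, infill_labels, relation_logits, relation_labels,
                    relation_weight=1.0):
    """infill_*:   masked-token predictions over the vocabulary (mask infilling)
    relation_*: classification over a fixed set of commonsense relations"""
    infill_loss = F.cross_entropy(infill_logits.view(-1, infill_logits.size(-1)),
                                  infill_labels.view(-1), ignore_index=-100)
    relation_loss = F.cross_entropy(relation_logits, relation_labels)
    return infill_loss + relation_weight * relation_loss

# Toy shapes: batch of 2, sequence length 5, vocabulary 100, 10 relation types.
loss = refinement_loss(torch.randn(2, 5, 100), torch.randint(0, 100, (2, 5)),
                       torch.randn(2, 10), torch.randint(0, 10, (2,)))
```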
- Investigating Glyph Phonetic Information for Chinese Spell Checking: What Works and What's Next [48.12125502456953]
We discuss the role of glyph-phonetic information in the Chinese Spell Checking (CSC) task.
We propose a new, more challenging, and practical setting for testing the generalizability of CSC models.
arXiv Detail & Related papers (2022-12-08T04:37:29Z)
- Contextual Similarity is More Valuable than Character Similarity: Curriculum Learning for Chinese Spell Checking [26.93594761258908]
The Chinese Spell Checking (CSC) task aims to detect and correct Chinese spelling errors.
To make better use of contextual similarity, we propose a simple yet effective curriculum learning framework for the CSC task.
With the help of our model-agnostic framework, existing CSC models can be trained from easy to difficult, in the way humans learn Chinese characters.
arXiv Detail & Related papers (2022-07-17T03:12:27Z)
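The easy-to-difficult schedule can be sketched in a model-agnostic way, as the summary suggests; the difficulty measure below (a simple edit count) is only a stand-in for the contextual-similarity criterion the paper actually proposes.

```python
# Model-agnostic curriculum sketch: rank training pairs by a difficulty score and
# release harder examples in later stages. The difficulty function is a placeholder.
def curriculum_stages(examples, difficulty_fn, num_stages=3):
    ranked = sorted(examples, key=difficulty_fn)
    stage_size = max(1, len(ranked) // num_stages)
    for stage in range(num_stages):
        # Each stage trains on everything seen so far plus the next, harder slice.
        yield ranked[: (stage + 1) * stage_size]

examples = [("我很高心", "我很高兴"), ("今天天器很好", "今天天气很好")]
edit_count = lambda pair: sum(a != b for a, b in zip(*pair))  # crude difficulty proxy
for stage_data in curriculum_stages(examples, edit_count):
    pass  # train the CSC model on stage_data here
```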
- Improving Pre-trained Language Models with Syntactic Dependency Prediction Task for Chinese Semantic Error Recognition [52.55136323341319]
Existing Chinese text error detection mainly focuses on spelling and simple grammatical errors.
Chinese semantic errors are understudied and so complex that even humans cannot easily recognize them.
arXiv Detail & Related papers (2022-04-15T13:55:32Z)
- Exploration and Exploitation: Two Ways to Improve Chinese Spelling Correction Models [51.744357472072416]
We propose a method that continually identifies the weak spots of a model to generate more valuable training instances.
Experimental results show that such an adversarial training method combined with the pretraining strategy can improve both the generalization and robustness of multiple CSC models.
arXiv Detail & Related papers (2021-05-31T09:17:33Z)
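At a high level, the identify-and-retrain loop described above looks like the sketch below; the callables (train_fn, predict_fn, generate_variants) are placeholders for the paper's actual training routine, decoder, and instance generator.

```python
# High-level sketch of adversarially mining a model's weak spots and retraining.
# All callables are placeholders; only the overall loop structure is illustrated.
def adversarial_rounds(model, train_set, dev_set, train_fn, predict_fn,
                       generate_variants, num_rounds=3):
    for _ in range(num_rounds):
        model = train_fn(model, train_set)
        # Weak spots: dev sentences the current model still corrects wrongly.
        mistakes = [(src, tgt) for src, tgt in dev_set if predict_fn(model, src) != tgt]
        # Turn each weak spot into additional, more valuable training instances.
        for src, tgt in mistakes:
            train_set.extend(generate_variants(src, tgt))
    return model
```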
- A Closer Look at Linguistic Knowledge in Masked Language Models: The Case of Relative Clauses in American English [17.993417004424078]
Transformer-based language models achieve high performance on various tasks, but we still lack understanding of the kind of linguistic knowledge they learn and rely on.
We evaluate three models (BERT, RoBERTa, and ALBERT), testing their grammatical and semantic knowledge with sentence-level probing, diagnostic cases, and masked prediction tasks.
arXiv Detail & Related papers (2020-11-02T13:25:39Z)
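Masked-prediction probing of the kind used above can be reproduced in a few lines with the Hugging Face fill-mask pipeline; the probe sentence here is an illustrative relative-clause agreement example, not one of the paper's stimuli.

```python
# Minimal masked-prediction probe: does the model prefer the verb form that
# agrees with the subject inside the relative clause? Sentence is illustrative.
from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-uncased")
for candidate in unmasker("The books that the student [MASK] reading are overdue."):
    print(f"{candidate['token_str']:>10}  {candidate['score']:.3f}")
```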
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.