STAR: Zero-Shot Chinese Character Recognition with Stroke- and
Radical-Level Decompositions
- URL: http://arxiv.org/abs/2210.08490v1
- Date: Sun, 16 Oct 2022 08:57:46 GMT
- Title: STAR: Zero-Shot Chinese Character Recognition with Stroke- and
Radical-Level Decompositions
- Authors: Jinshan Zeng, Ruiying Xu, Yu Wu, Hongwei Li, Jiaxing Lu
- Abstract summary: We propose an effective zero-shot Chinese character recognition method by combining stroke- and radical-level decompositions.
Numerical results show that the proposed method outperforms the state-of-the-art methods in both character and radical zero-shot settings.
- Score: 14.770409889132539
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Zero-shot Chinese character recognition has attracted increasing attention in
recent years. Existing methods for this problem are mainly based on either
certain low-level stroke-based decomposition or medium-level radical-based
decomposition. Considering that the stroke- and radical-level decompositions
can provide different levels of information, we propose an effective zero-shot
Chinese character recognition method by combining them. The proposed method
consists of a training stage and an inference stage. In the training stage, we
adopt two similar encoder-decoder models to yield the estimates of stroke and
radical encodings, which together with the true encodings are then used to
formulate the associated stroke and radical losses for training. A similarity
loss is introduced to regularize the stroke and radical encoders so that they
yield highly correlated features for the same character. In the inference stage, two key
modules, i.e., the stroke screening module (SSM) and feature matching module
(FMM) are introduced to tackle the deterministic and confusing cases
respectively. In particular, we introduce an effective stroke rectification
scheme in FMM to enlarge the candidate set of characters for final inference.
Extensive experiments on three benchmark datasets covering the handwritten,
printed-artistic, and street-view scenarios are conducted to demonstrate the
effectiveness of the proposed method. Numerical results show that the proposed
method outperforms the state-of-the-art methods in both character and radical
zero-shot settings, and maintains competitive performance in the traditional
seen character setting.
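The abstract's inference stage (a stroke screening module for deterministic cases, and a feature matching module with stroke rectification for confusing cases) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the stroke dictionary, stroke codes, and the use of sequence-similarity ranking as a stand-in for stroke rectification are all assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical stroke dictionary: character name -> stroke sequence.
# The stroke codes are illustrative placeholders, not the paper's encoding.
STROKE_DICT = {
    "yi": ["heng"],
    "shi": ["heng", "shu"],
    "gan": ["heng", "heng", "shu"],
    "tu": ["heng", "shu", "heng"],
}

def recognize(pred_strokes, select_by_features=None):
    """Sketch of the two-module inference flow: if exactly one character
    matches the predicted stroke sequence, the screening step returns it
    (the deterministic case); otherwise a rectification-like step enlarges
    the candidate set to the most similar stroke sequences, and a
    feature-matching step (passed in as a callback here) picks the final
    character."""
    exact = [c for c, s in STROKE_DICT.items() if s == pred_strokes]
    if len(exact) == 1:  # deterministic case (SSM)
        return exact[0]
    # Confusing case (FMM): rank characters by stroke-sequence similarity
    # and keep the closest ones as the enlarged candidate set.
    scored = sorted(
        STROKE_DICT,
        key=lambda c: SequenceMatcher(None, STROKE_DICT[c], pred_strokes).ratio(),
        reverse=True)
    candidates = scored[:3]
    if select_by_features is not None:
        return select_by_features(candidates)
    return candidates[0]
```

For example, a correctly predicted sequence is resolved by the screening step alone, while a sequence with a spurious extra stroke falls through to candidate ranking.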
Related papers
- Improving Sampling Methods for Fine-tuning SentenceBERT in Text Streams [49.3179290313959]
This study explores the efficacy of seven text sampling methods designed to selectively fine-tune language models.
We precisely assess the impact of these methods on fine-tuning the SBERT model using four different loss functions.
Our findings indicate that Softmax loss and Batch All Triplets loss are particularly effective for text stream classification.
arXiv Detail & Related papers (2024-03-18T23:41:52Z)
- Linguistic-Based Mild Cognitive Impairment Detection Using Informative Loss [2.8893654860442872]
We propose a framework that analyzes transcripts generated from video interviews collected within the I-CONECT study project.
Our framework can distinguish between MCI and NC with an average area under the curve of 84.75%.
arXiv Detail & Related papers (2024-01-23T16:30:22Z)
- Chinese Text Recognition with A Pre-Trained CLIP-Like Model Through Image-IDS Aligning [61.34060587461462]
We propose a two-stage framework for Chinese Text Recognition (CTR).
We pre-train a CLIP-like model by aligning printed character images with Ideographic Description Sequences (IDS).
This pre-training stage simulates how humans recognize Chinese characters and obtains the canonical representation of each character.
The learned representations are employed to supervise the CTR model, such that traditional single-character recognition can be improved to text-line recognition.
arXiv Detail & Related papers (2023-09-03T05:33:16Z)
- Toward Zero-shot Character Recognition: A Gold Standard Dataset with Radical-level Annotations [5.761679637905164]
In this paper, we construct an ancient Chinese character image dataset that contains both radical-level and character-level annotations.
To increase the adaptability of ACCID, we propose a splicing-based synthetic character algorithm to augment the training samples and apply an image denoising method to improve the image quality.
arXiv Detail & Related papers (2023-08-01T16:41:30Z)
- A Novel Plagiarism Detection Approach Combining BERT-based Word Embedding, Attention-based LSTMs and an Improved Differential Evolution Algorithm [11.142354615369273]
We propose a novel method for detecting plagiarism based on attention mechanism-based long short-term memory (LSTM) and bidirectional encoder representations from transformers (BERT) word embedding.
BERT could be included in a downstream task and fine-tuned as a task-specific structure, while the trained BERT model is capable of detecting various linguistic characteristics.
arXiv Detail & Related papers (2023-05-03T18:26:47Z)
- Chinese Character Recognition with Radical-Structured Stroke Trees [51.8541677234175]
We represent each Chinese character as a stroke tree, which is organized according to its radical structures.
We propose a two-stage decomposition framework, where a Feature-to-Radical Decoder perceives radical structures and radical regions.
A Radical-to-Stroke Decoder further predicts the stroke sequences according to the features of radical regions.
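The stroke-tree representation this entry describes can be sketched as a small data structure. This is an illustrative sketch only: the structure tags, stroke codes, and traversal order are assumptions, not taken from the paper.

```python
from dataclasses import dataclass, field

@dataclass
class StrokeTreeNode:
    """Sketch of a radical-structured stroke tree: internal nodes carry a
    layout tag describing how sub-radicals are arranged, while leaf nodes
    carry a radical's stroke sequence."""
    structure: str = ""                           # e.g. "left-right", "top-bottom"
    strokes: list = field(default_factory=list)   # stroke codes at a leaf
    children: list = field(default_factory=list)  # sub-trees at an internal node

    def stroke_sequence(self):
        # A depth-first traversal flattens the tree back into the
        # character's full stroke order.
        if not self.children:
            return list(self.strokes)
        seq = []
        for child in self.children:
            seq.extend(child.stroke_sequence())
        return seq
```

Under this sketch, a Feature-to-Radical step would predict the tree's structure tags and regions, and a Radical-to-Stroke step would fill in each leaf's stroke sequence.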
arXiv Detail & Related papers (2022-11-24T10:28:55Z)
- Asymmetric Modality Translation For Face Presentation Attack Detection [55.09300842243827]
Face presentation attack detection (PAD) is an essential measure to protect face recognition systems from being spoofed by malicious users.
We propose a novel framework based on asymmetric modality translation for PAD in bi-modality scenarios.
Our method achieves state-of-the-art performance under different evaluation protocols.
arXiv Detail & Related papers (2021-10-18T08:59:09Z)
- Partner-Assisted Learning for Few-Shot Image Classification [54.66864961784989]
Few-shot learning has been studied to mimic human visual capabilities and learn effective models without the need for exhaustive human annotation.
In this paper, we focus on the design of training strategy to obtain an elemental representation such that the prototype of each novel class can be estimated from a few labeled samples.
We propose a two-stage training scheme, which first trains a partner encoder to model pair-wise similarities and extract features serving as soft-anchors, and then trains a main encoder by aligning its outputs with soft-anchors while attempting to maximize classification performance.
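The second-stage alignment this entry describes (pulling the main encoder's outputs toward the partner encoder's soft-anchor features) can be sketched as a simple cosine-based objective. The function names and the exact loss form (1 − cosine similarity, averaged over the batch) are illustrative assumptions, not the paper's formulation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def alignment_loss(main_feats, soft_anchors):
    """Sketch of the alignment term: each main-encoder feature is pulled
    toward its partner-encoder soft anchor; the loss is zero when the two
    sets of features already point in the same directions."""
    return sum(1.0 - cosine(m, a)
               for m, a in zip(main_feats, soft_anchors)) / len(main_feats)
```

In the full scheme this term would be combined with a standard classification loss on the main encoder's outputs.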
arXiv Detail & Related papers (2021-09-15T22:46:19Z)
- Zero-Shot Chinese Character Recognition with Stroke-Level Decomposition [37.808021793372504]
We propose a stroke-based method by decomposing each character into a sequence of strokes.
We employ a matching-based strategy to transform the predicted stroke sequence to a specific character.
The proposed method can be easily generalized to other languages whose characters can be decomposed into strokes.
arXiv Detail & Related papers (2021-06-22T08:49:03Z)
- Pseudo-Convolutional Policy Gradient for Sequence-to-Sequence Lip-Reading [96.48553941812366]
Lip-reading aims to infer the speech content from the lip movement sequence.
The traditional learning process of seq2seq models suffers from two problems.
We propose a novel pseudo-convolutional policy gradient (PCPG) based method to address these two problems.
arXiv Detail & Related papers (2020-03-09T09:12:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.