Related papers: Chinese Character Recognition with Radical-Structured Stroke Trees

URL: http://arxiv.org/abs/2211.13518v1
Date: Thu, 24 Nov 2022 10:28:55 GMT
Title: Chinese Character Recognition with Radical-Structured Stroke Trees
Authors: Haiyang Yu, Jingye Chen, Bin Li, Xiangyang Xue
Abstract summary: We represent each Chinese character as a stroke tree, which is organized according to its radical structures. We propose a two-stage decomposition framework, where a Feature-to-Radical Decoder perceives radical structures and radical regions. A Radical-to-Stroke Decoder further predicts the stroke sequences according to the features of radical regions.
Score: 51.8541677234175
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The flourishing blossom of deep learning has witnessed the rapid development of Chinese character recognition. However, it remains a great challenge that the characters for testing may have different distributions from those of the training dataset. Existing methods based on a single-level representation (character-level, radical-level, or stroke-level) may be either too sensitive to distribution changes (e.g., induced by blurring, occlusion, and zero-shot problems) or too tolerant to one-to-many ambiguities. In this paper, we represent each Chinese character as a stroke tree, which is organized according to its radical structures, to fully exploit the merits of both radical and stroke levels in a decent way. We propose a two-stage decomposition framework, where a Feature-to-Radical Decoder perceives radical structures and radical regions, and a Radical-to-Stroke Decoder further predicts the stroke sequences according to the features of radical regions. The generated radical structures and stroke sequences are encoded as a Radical-Structured Stroke Tree (RSST), which is fed to a Tree-to-Character Translator based on the proposed Weighted Edit Distance to match the closest candidate character in the RSST lexicon. Our extensive experimental results demonstrate that the proposed method outperforms the state-of-the-art single-level methods by increasing margins as the distribution difference becomes more severe in the blurring, occlusion, and zero-shot scenarios, which indeed validates the robustness of the proposed method.
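The matching step described in the abstract (comparing a predicted tree against every lexicon entry and picking the closest one) can be illustrated with a generic weighted edit distance. This is only a minimal sketch under stated assumptions, not the authors' Weighted Edit Distance: the flat token sequences, the token names (`LR`, `TB`, `s1`, ...), the cost values, and the helpers `token_weight`, `weighted_edit_distance`, and `closest_character` are all hypothetical and do not come from the paper.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical cost function: structure tokens are assumed to matter more
# than stroke tokens. The paper's actual weighting is not given here.
def token_weight(token: str) -> float:
    return 2.0 if token in {"LR", "TB"} else 1.0

def weighted_edit_distance(
    a: Sequence[str],
    b: Sequence[str],
    weight: Callable[[str], float] = token_weight,
) -> float:
    """Levenshtein distance where each edit costs the weight of the token involved."""
    n, m = len(a), len(b)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + weight(a[i - 1])  # delete everything in a
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + weight(b[j - 1])  # insert everything in b
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                substitution = 0.0
            else:
                # A substitution costs as much as the heavier of the two tokens.
                substitution = max(weight(a[i - 1]), weight(b[j - 1]))
            dp[i][j] = min(
                dp[i - 1][j] + weight(a[i - 1]),   # deletion
                dp[i][j - 1] + weight(b[j - 1]),   # insertion
                dp[i - 1][j - 1] + substitution,   # match or substitution
            )
    return dp[n][m]

def closest_character(predicted: Sequence[str], lexicon: Dict[str, List[str]]) -> str:
    """Return the lexicon key whose token sequence is nearest to the prediction."""
    return min(lexicon, key=lambda ch: weighted_edit_distance(predicted, lexicon[ch]))

# Toy lexicon with made-up entries: a structure token followed by stroke tokens.
toy_lexicon = {
    "A": ["LR", "s1", "s2", "s3"],
    "B": ["LR", "s1", "s4", "s3"],
    "C": ["TB", "s1", "s2"],
}
prediction = ["LR", "s1", "s2", "s5"]  # one stroke is mispredicted
best = closest_character(prediction, toy_lexicon)
```

In this toy setup a single wrong stroke still maps to the nearest entry, which is the kind of tolerance a lexicon-matching step is meant to provide; a real system would compare tree-structured encodings rather than flat lists.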

Related papers

Zero-Shot Chinese Character Recognition with Hierarchical Multi-Granularity Image-Text Aligning [52.92837273570818]
Chinese characters exhibit unique structures and compositional rules, allowing for the use of fine-grained semantic information in representation. We propose a Hierarchical Multi-Granularity Image-Text Aligning (Hi-GITA) framework based on a contrastive paradigm. Our proposed Hi-GITA outperforms existing zero-shot CCR methods.
arXiv Detail & Related papers (2025-05-30T17:39:14Z)
Character-Level Chinese Dependency Parsing via Modeling Latent Intra-Word Structure [11.184330703168893]
This paper proposes modeling latent internal structures within words in Chinese. A constrained Eisner algorithm is implemented to ensure the compatibility of character-level trees. A detailed analysis reveals that a coarse-to-fine parsing strategy empowers the model to predict more linguistically plausible intra-word structures.
arXiv Detail & Related papers (2024-06-06T06:23:02Z)
Semi-Supervised Unconstrained Head Pose Estimation in the Wild [60.08319512840091]
We propose the first semi-supervised unconstrained head pose estimation method SemiUHPE. Our method is based on the observation that the aspect-ratio invariant cropping of wild heads is superior to the previous landmark-based affine alignment. Experiments and ablation studies show that SemiUHPE outperforms existing methods greatly on public benchmarks.
arXiv Detail & Related papers (2024-04-03T08:01:00Z)
Provably Secure Disambiguating Neural Linguistic Steganography [66.30965740387047]
The segmentation ambiguity problem, which arises when using language models based on subwords, leads to occasional decoding failures. We propose a novel secure disambiguation method named SyncPool, which effectively addresses the segmentation ambiguity problem. SyncPool does not change the size of the candidate pool or the distribution of tokens and thus is applicable to provably secure language steganography methods.
arXiv Detail & Related papers (2024-03-26T09:25:57Z)
Enhancing Systematic Decompositional Natural Language Inference Using Informal Logic [51.967603572656266]
We introduce a consistent and theoretically grounded approach to annotating decompositional entailment. We find that our new dataset, RDTE, has a substantially higher internal consistency (+9%) than prior decompositional entailment datasets. We also find that training an RDTE-oriented entailment classifier via knowledge distillation and employing it in an entailment tree reasoning engine significantly improves both accuracy and proof quality.
arXiv Detail & Related papers (2024-02-22T18:55:17Z)
Toward Zero-shot Character Recognition: A Gold Standard Dataset with Radical-level Annotations [5.761679637905164]
In this paper, we construct an ancient Chinese character image dataset (ACCID) that contains both radical-level and character-level annotations. To increase the adaptability of ACCID, we propose a splicing-based synthetic character algorithm to augment the training samples and apply an image denoising method to improve the image quality.
arXiv Detail & Related papers (2023-08-01T16:41:30Z)
Text-CRS: A Generalized Certified Robustness Framework against Textual Adversarial Attacks [39.51297217854375]
We propose Text-CRS, a certified robustness framework for natural language processing (NLP) based on randomized smoothing. We show that Text-CRS can address all four different word-level adversarial operations and achieve a significant accuracy improvement. We also provide the first benchmark on certified accuracy and radius of four word-level operations, besides outperforming the state-of-the-art certification against synonym substitution attacks.
arXiv Detail & Related papers (2023-07-31T13:08:16Z)
Learning Generative Structure Prior for Blind Text Image Super-resolution [153.05759524358467]
We present a novel prior that focuses more on the character structure. To restrict the generative space of StyleGAN, we store the discrete features for each character in a codebook. The proposed structure prior exerts stronger character-specific guidance to restore faithful and precise strokes of a designated character.
arXiv Detail & Related papers (2023-03-26T13:54:28Z)
STAR: Zero-Shot Chinese Character Recognition with Stroke- and Radical-Level Decompositions [14.770409889132539]
We propose an effective zero-shot Chinese character recognition method by combining stroke- and radical-level decompositions. Numerical results show that the proposed method outperforms the state-of-the-art methods in both character and radical zero-shot settings.
arXiv Detail & Related papers (2022-10-16T08:57:46Z)
A Character-level Span-based Model for Mandarin Prosodic Structure Prediction [36.90699361223442]
We propose a span-based Mandarin prosodic structure prediction model to obtain an optimal prosodic structure tree. Rich linguistic features are provided by a Chinese character-level BERT and sent to an encoder with a self-attention architecture. The proposed method can predict prosodic labels of different levels at the same time and accomplish the process directly from Chinese characters.
arXiv Detail & Related papers (2022-03-31T09:47:08Z)
Contextualized Semantic Distance between Highly Overlapped Texts [85.1541170468617]
Overlapping frequently occurs in paired texts in natural language processing tasks like text editing and semantic similarity evaluation. This paper aims to address the issue with a mask-and-predict strategy. We take the words in the longest common sequence as neighboring words and use masked language modeling (MLM) to predict the distributions on their positions. Experiments on Semantic Textual Similarity show the resulting metric, NDD, to be more sensitive to various semantic differences, especially on highly overlapped paired texts.
arXiv Detail & Related papers (2021-10-04T03:59:15Z)
MGD-GAN: Text-to-Pedestrian generation through Multi-Grained Discrimination [96.91091607251526]
We propose the Multi-Grained Discrimination enhanced Generative Adversarial Network, which capitalizes on a human-part-based Discriminator (HPD) and a self-cross-attended Discriminator. A fine-grained word-level attention mechanism is employed in the HPD module to enforce diversified appearance and vivid details. The substantial improvement over the various metrics demonstrates the efficacy of MGD-GAN on the text-to-pedestrian synthesis scenario.
arXiv Detail & Related papers (2020-10-02T12:24:48Z)

This list is automatically generated from the titles and abstracts of the papers on this site.