Sentence-level Online Handwritten Chinese Character Recognition
- URL: http://arxiv.org/abs/2108.02561v1
- Date: Sun, 4 Jul 2021 14:26:06 GMT
- Title: Sentence-level Online Handwritten Chinese Character Recognition
- Authors: Yunxin Li, Qian Yang, Qingcai Chen, Lin Ma, Baotian Hu, Xiaolong Wang,
Yuxin Ding
- Abstract summary: Single online handwritten Chinese character recognition (single OLHCCR) has achieved prominent performance.
In real application scenarios, users always write multiple Chinese characters to form one complete sentence.
We propose a simple and straightforward end-to-end network, namely the vanilla compositional network (VCN), to tackle sentence-level OLHCCR.
We also propose a novel deep spatial-temporal fusion network (DSTFN) to improve the robustness of sentence-level OLHCCR.
- Score: 36.57575120082676
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Single online handwritten Chinese character recognition (single OLHCCR) has
achieved prominent performance. However, in real application scenarios, users
always write multiple Chinese characters to form one complete sentence, and the
contextual information within these characters holds significant potential
to improve the accuracy, robustness and efficiency of sentence-level OLHCCR. In
this work, we first propose a simple and straightforward end-to-end network,
namely the vanilla compositional network (VCN), to tackle sentence-level OLHCCR.
It couples a convolutional neural network with a sequence modeling architecture to
exploit the contextual information preceding each handwritten character. Although
VCN performs much better than the state-of-the-art single OLHCCR model, it
is highly fragile when confronted with poorly written characters, such
as sloppy writing or missing or broken strokes. To improve the robustness of
sentence-level OLHCCR, we further propose a novel deep spatial-temporal fusion
network (DSTFN). It utilizes a pre-trained autoregressive framework as the
backbone component, which projects each Chinese character into word embeddings,
and integrates the spatial glyph features of handwritten characters and their
contextual information multiple times in a multi-layer fusion module. We also
construct a large-scale sentence-level handwriting dataset, named CSOHD, to
evaluate models. Extensive experimental results demonstrate that DSTFN achieves
state-of-the-art performance and is considerably more robust than VCN and
existing single OLHCCR models. In-depth empirical analysis and
case studies indicate that DSTFN can significantly improve the efficiency of
handwriting input, precisely recognizing handwritten Chinese characters even
when their strokes are incomplete.
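The abstract's central idea, that context from earlier characters can correct an ambiguous glyph, can be illustrated with a toy sketch. This is not the VCN or DSTFN architecture: it replaces the neural sequence model with a hand-written bigram table, and the function name `decode_with_context`, the `alpha` weight, and all probabilities below are hypothetical values chosen only for illustration.

```python
import math


def decode_with_context(glyph_probs, bigram, alpha=1.0, floor=1e-6):
    """Greedy left-to-right decoding that mixes glyph and context scores.

    glyph_probs: list of dicts, one per handwritten character, mapping a
                 candidate character to a (made-up) glyph-classifier probability.
    bigram:      dict mapping (previous_char, char) to a conditional probability;
                 a stand-in for a learned sequence model.
    alpha:       weight of the contextual term (hypothetical parameter).
    floor:       smallest probability used, to avoid log(0).
    """
    decoded = []
    prev = None
    for candidates in glyph_probs:
        best_char, best_score = None, -math.inf
        for char, p_glyph in candidates.items():
            score = math.log(max(p_glyph, floor))
            if prev is not None:
                # Unseen bigrams fall back to the floor probability.
                p_ctx = bigram.get((prev, char), floor)
                score += alpha * math.log(max(p_ctx, floor))
            if score > best_score:
                best_char, best_score = char, score
        decoded.append(best_char)
        prev = best_char
    return "".join(decoded)


# Toy input: the second character is written sloppily, so the glyph
# classifier slightly prefers the wrong candidate.
glyph_probs = [
    {"你": 0.9, "休": 0.1},
    {"奸": 0.55, "好": 0.45},
]
bigram = {("你", "好"): 0.5, ("你", "奸"): 0.001}

# Picking each character independently keeps the glyph classifier's mistake.
glyph_only = "".join(max(c, key=c.get) for c in glyph_probs)
with_context = decode_with_context(glyph_probs, bigram)

print(glyph_only)    # 你奸
print(with_context)  # 你好
```

In this toy case the single-character decision keeps the more likely glyph in isolation, while the context-aware decision recovers the intended word. The paper's models learn such context with neural sequence models rather than a fixed table.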
Related papers
- Detecting Document-level Paraphrased Machine Generated Content: Mimicking Human Writing Style and Involving Discourse Features [57.34477506004105]
Machine-generated content poses challenges such as academic plagiarism and the spread of misinformation.
We introduce novel methodologies and datasets to overcome these challenges.
We propose MhBART, an encoder-decoder model designed to emulate human writing style.
We also propose DTransformer, a model that integrates discourse analysis through PDTB preprocessing to encode structural features.
arXiv Detail & Related papers (2024-12-17T08:47:41Z)
- Online Writer Retrieval with Chinese Handwritten Phrases: A Synergistic Temporal-Frequency Representation Learning Approach [35.50318959678818]
We propose DOLPHIN, a novel retrieval model designed to enhance handwriting representations through synergistic temporal-frequency analysis.
We introduce OLIWER, a large-scale online writer retrieval dataset encompassing over 670,000 Chinese handwritten phrases from 1,731 individuals.
Our findings emphasize the significance of point sampling frequency and pressure features in improving handwriting representation quality.
arXiv Detail & Related papers (2024-12-16T11:19:22Z)
- Classification of Non-native Handwritten Characters Using Convolutional Neural Network [0.0]
The classification of English characters written by non-native users is performed by proposing a custom-tailored CNN model.
We train this CNN with a new dataset called the handwritten isolated English character dataset.
The proposed model with five convolutional layers and one hidden layer outperforms state-of-the-art models in terms of character recognition accuracy.
arXiv Detail & Related papers (2024-06-06T21:08:07Z)
- MetaScript: Few-Shot Handwritten Chinese Content Generation via Generative Adversarial Networks [15.037121719502606]
We propose MetaScript, a novel content generation system designed to address the diminishing presence of personal handwriting styles in the digital representation of Chinese characters.
Our approach harnesses the power of few-shot learning to generate Chinese characters that retain the individual's unique handwriting style and maintain the efficiency of digital typing.
arXiv Detail & Related papers (2023-12-25T17:31:19Z)
- Chinese Text Recognition with A Pre-Trained CLIP-Like Model Through Image-IDS Aligning [61.34060587461462]
We propose a two-stage framework for Chinese Text Recognition (CTR).
We pre-train a CLIP-like model through aligning printed character images and Ideographic Description Sequences (IDS).
This pre-training stage simulates humans recognizing Chinese characters and obtains the canonical representation of each character.
The learned representations are employed to supervise the CTR model, such that traditional single-character recognition can be improved to text-line recognition.
arXiv Detail & Related papers (2023-09-03T05:33:16Z)
- Chinese Financial Text Emotion Mining: GCGTS -- A Character Relationship-based Approach for Simultaneous Aspect-Opinion Pair Extraction [7.484918031250864]
Aspect-Opinion Pair Extraction (AOPE) from Chinese financial texts is a specialized task in fine-grained text sentiment analysis.
Previous studies have mainly focused on developing grid annotation schemes within grid-based models to facilitate this extraction process.
We propose a novel method called Graph-based Character-level Grid Tagging Scheme (GCGTS).
The GCGTS method explicitly incorporates syntactic structure using Graph Convolutional Networks (GCN) and unifies the encoding of characters within the same semantic unit (Chinese word level).
arXiv Detail & Related papers (2023-08-04T02:20:56Z)
- Cross-modality Data Augmentation for End-to-End Sign Language Translation [66.46877279084083]
End-to-end sign language translation (SLT) aims to convert sign language videos into spoken language texts directly without intermediate representations.
It has been a challenging task due to the modality gap between sign videos and texts and the data scarcity of labeled data.
We propose a novel Cross-modality Data Augmentation (XmDA) framework to transfer the powerful gloss-to-text translation capabilities to end-to-end sign language translation.
arXiv Detail & Related papers (2023-05-18T16:34:18Z)
- Letter-level Online Writer Identification [86.13203975836556]
We focus on a novel problem, letter-level online writer-id, which requires only a few trajectories of written letters as identification cues.
A main challenge is that a person often writes a letter in different styles from time to time.
We refer to this problem as the variance of online writing styles (Var-O-Styles).
arXiv Detail & Related papers (2021-12-06T07:21:53Z)
- PGNet: Real-time Arbitrarily-Shaped Text Spotting with Point Gathering Network [54.03560668182197]
We propose a novel fully convolutional Point Gathering Network (PGNet) for reading arbitrarily-shaped text in real-time.
With a PG-CTC decoder, we gather high-level character classification vectors from two-dimensional space and decode them into text symbols without NMS and RoI operations.
Experiments prove that the proposed method achieves competitive accuracy, meanwhile significantly improving the running speed.
arXiv Detail & Related papers (2021-04-12T13:27:34Z)
- Offline Handwritten Chinese Text Recognition with Convolutional Neural Networks [5.984124397831814]
In this paper, we build the models using only the convolutional neural networks and use CTC as the loss function.
We achieve 6.81% character error rate (CER) on the ICDAR 2013 competition set, which is the best published result without language model correction.
arXiv Detail & Related papers (2020-06-28T14:34:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.