Robust End-to-End Offline Chinese Handwriting Text Page Spotter with
Text Kernel
- URL: http://arxiv.org/abs/2107.01547v1
- Date: Sun, 4 Jul 2021 05:42:04 GMT
- Title: Robust End-to-End Offline Chinese Handwriting Text Page Spotter with
Text Kernel
- Authors: Zhihao Wang, Yanwei Yu, Yibo Wang, Haixu Long, and Fazheng Wang
- Abstract summary: We propose a robust end-to-end Chinese text page spotter framework.
It unifies text detection and text recognition with text kernel.
Our method achieves state-of-the-art results on the CASIA-HWDB2.0-2.2 dataset and ICDAR-2013 competition dataset.
- Score: 4.028854207195064
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Offline Chinese handwriting text recognition is a long-standing research
topic in the field of pattern recognition. In previous studies, text detection
and recognition are separated, which leads to the fact that text recognition is
highly dependent on the detection results. In this paper, we propose a robust
end-to-end Chinese text page spotter framework. It unifies text detection and
text recognition with text kernel that integrates global text feature
information to optimize the recognition from multiple scales, which reduces the
dependence of detection and improves the robustness of the system. Our method
achieves state-of-the-art results on the CASIA-HWDB2.0-2.2 dataset and
ICDAR-2013 competition dataset. Without any language model, the correct rates
are 99.12% and 94.27% for line-level recognition, and 99.03% and 94.20% for
page-level recognition, respectively.
Related papers
- GatedLexiconNet: A Comprehensive End-to-End Handwritten Paragraph Text Recognition System [3.9527064697847005]
We present an end-to-end paragraph recognition system that incorporates internal line segmentation and convolutional layers based encoder.
This study reported character error rates of 2.27% on IAM, 0.9% on RIMES, and 2.13% on READ-16, and word error rates of 5.73% on READ-2016 datasets.
arXiv Detail & Related papers (2024-04-22T10:19:16Z) - Efficiently Leveraging Linguistic Priors for Scene Text Spotting [63.22351047545888]
This paper proposes a method that leverages linguistic knowledge from a large text corpus to replace the traditional one-hot encoding used in auto-regressive scene text spotting and recognition models.
We generate text distributions that align well with scene text datasets, removing the need for in-domain fine-tuning.
Experimental results show that our method not only improves recognition accuracy but also enables more accurate localization of words.
arXiv Detail & Related papers (2024-02-27T01:57:09Z) - SwinTextSpotter v2: Towards Better Synergy for Scene Text Spotting [126.01629300244001]
We propose a new end-to-end scene text spotting framework termed SwinTextSpotter v2.
We enhance the relationship between two tasks using novel Recognition Conversion and Recognition Alignment modules.
SwinTextSpotter v2 achieved state-of-the-art performance on various multilingual (English, Chinese, and Vietnamese) benchmarks.
arXiv Detail & Related papers (2024-01-15T12:33:00Z) - Orientation-Independent Chinese Text Recognition in Scene Images [61.34060587461462]
We take the first attempt to extract orientation-independent visual features by disentangling content and orientation information of text images.
Specifically, we introduce a Character Image Reconstruction Network (CIRN) to recover corresponding printed character images with disentangled content and orientation information.
arXiv Detail & Related papers (2023-09-03T05:30:21Z) - TextFormer: A Query-based End-to-End Text Spotter with Mixed Supervision [61.186488081379]
We propose TextFormer, a query-based end-to-end text spotter with Transformer architecture.
TextFormer builds upon an image encoder and a text decoder to learn a joint semantic understanding for multi-task modeling.
It allows for mutual training and optimization of classification, segmentation, and recognition branches, resulting in deeper feature sharing.
arXiv Detail & Related papers (2023-06-06T03:37:41Z) - Optimal Boxes: Boosting End-to-End Scene Text Recognition by Adjusting
Annotated Bounding Boxes via Reinforcement Learning [41.56134008044702]
Box is a reinforcement learning-based method for adjusting the shape of each text bounding box to make it more compatible with text recognition models.
Experiments demonstrate that the performance of end-to-end text recognition systems can be improved when using the adjusted bounding boxes as the ground truths for training.
arXiv Detail & Related papers (2022-07-25T06:58:45Z) - SwinTextSpotter: Scene Text Spotting via Better Synergy between Text
Detection and Text Recognition [73.61592015908353]
We propose a new end-to-end scene text spotting framework termed SwinTextSpotter.
Using a transformer with dynamic head as the detector, we unify the two tasks with a novel Recognition Conversion mechanism.
The design results in a concise framework that requires neither additional rectification module nor character-level annotation.
arXiv Detail & Related papers (2022-03-19T01:14:42Z) - Benchmarking Chinese Text Recognition: Datasets, Baselines, and an
Empirical Study [25.609450020149637]
Existing text recognition methods are mainly for English texts, whereas ignoring the pivotal role of Chinese texts.
We manually collect Chinese text datasets from publicly available competitions, projects, and papers, then divide them into four categories including scene, web, document, and handwriting datasets.
By analyzing the experimental results, we surprisingly observe that state-of-the-art baselines for recognizing English texts cannot perform well on Chinese scenarios.
arXiv Detail & Related papers (2021-12-30T15:30:52Z) - ARTS: Eliminating Inconsistency between Text Detection and Recognition
with Auto-Rectification Text Spotter [37.86206423441885]
We present a simple yet robust end-to-end text spotting framework, termed Auto-Rectification Text Spotter (ARTS)
Our method achieves 77.1% end-to-end text spotting F-measure on Total-Text at a competitive speed of 10.5 FPS.
arXiv Detail & Related papers (2021-10-20T06:53:44Z) - Implicit Feature Alignment: Learn to Convert Text Recognizer to Text
Spotter [38.4211220941874]
We propose a simple, elegant and effective paradigm called Implicit Feature Alignment (IFA)
IFA can be easily integrated into current text recognizers, resulting in a novel inference mechanism called IFAinference.
We experimentally demonstrate that IFA achieves state-of-the-art performance on end-to-end document recognition tasks.
arXiv Detail & Related papers (2021-06-10T17:06:28Z) - Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting [49.768327669098674]
We propose an end-to-end trainable text spotting approach named Text Perceptron.
It first employs an efficient segmentation-based text detector that learns the latent text reading order and boundary information.
Then a novel Shape Transform Module (abbr. STM) is designed to transform the detected feature regions into regular morphologies.
arXiv Detail & Related papers (2020-02-17T08:07:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.