PGNet: Real-time Arbitrarily-Shaped Text Spotting with Point Gathering
Network
- URL: http://arxiv.org/abs/2104.05458v1
- Date: Mon, 12 Apr 2021 13:27:34 GMT
- Title: PGNet: Real-time Arbitrarily-Shaped Text Spotting with Point Gathering
Network
- Authors: Pengfei Wang, Chengquan Zhang, Fei Qi, Shanshan Liu, Xiaoqiang Zhang,
Pengyuan Lyu, Junyu Han, Jingtuo Liu, Errui Ding, Guangming Shi
- Abstract summary: We propose a novel fully convolutional Point Gathering Network (PGNet) for reading arbitrarily-shaped text in real-time.
With a PG-CTC decoder, we gather high-level character classification vectors from two-dimensional space and decode them into text symbols without NMS and RoI operations.
Experiments prove that the proposed method achieves competitive accuracy, meanwhile significantly improving the running speed.
- Score: 54.03560668182197
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The reading of arbitrarily-shaped text has received increasing research
attention. However, existing text spotters are mostly built on two-stage
frameworks or character-based methods, which suffer from either Non-Maximum
Suppression (NMS), Region-of-Interest (RoI) operations, or character-level
annotations. In this paper, to address the above problems, we propose a novel
fully convolutional Point Gathering Network (PGNet) for reading
arbitrarily-shaped text in real-time. The PGNet is a single-shot text spotter,
where the pixel-level character classification map is learned with proposed
PG-CTC loss avoiding the usage of character-level annotations. With a PG-CTC
decoder, we gather high-level character classification vectors from
two-dimensional space and decode them into text symbols without NMS and RoI
operations involved, which guarantees high efficiency. Additionally, reasoning
the relations between each character and its neighbors, a graph refinement
module (GRM) is proposed to optimize the coarse recognition and improve the
end-to-end performance. Experiments prove that the proposed method achieves
competitive accuracy, meanwhile significantly improving the running speed. In
particular, in Total-Text, it runs at 46.7 FPS, surpassing the previous
spotters with a large margin.
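The gather-then-decode idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a per-pixel character classification map stored as nested lists, a precomputed list of (row, col) points along a text center line, class index 0 as the CTC blank, and plain greedy CTC decoding. The names `gather_points`, `ctc_greedy_decode`, `blank_map`, and `ALPHABET` are hypothetical.

```python
from typing import List, Sequence, Tuple

BLANK = 0  # assumption: class index 0 is the CTC blank
ALPHABET = ["", "h", "i"]  # hypothetical toy alphabet; index 0 is the blank


def gather_points(
    char_map: Sequence[Sequence[Sequence[float]]],
    points: Sequence[Tuple[int, int]],
) -> List[Sequence[float]]:
    """Pick the per-pixel class-score vector at each (row, col) point."""
    return [char_map[r][c] for r, c in points]


def ctc_greedy_decode(
    vectors: Sequence[Sequence[float]],
    alphabet: Sequence[str],
    blank: int = BLANK,
) -> str:
    """Greedy CTC decoding: argmax per step, merge repeats, drop blanks."""
    out = []
    prev = blank
    for vec in vectors:
        idx = max(range(len(vec)), key=vec.__getitem__)
        if idx != blank and idx != prev:
            out.append(alphabet[idx])
        prev = idx
    return "".join(out)


def blank_map(rows: int, cols: int, n_classes: int) -> List[List[List[float]]]:
    """Build a toy map where every pixel predicts the blank class."""
    return [
        [[1.0 if k == BLANK else 0.0 for k in range(n_classes)] for _ in range(cols)]
        for _ in range(rows)
    ]


# Toy 2x5 map with the word "hi" laid out along a bent path.
char_map = blank_map(2, 5, len(ALPHABET))
char_map[1][0] = [0.1, 0.8, 0.1]  # "h"
char_map[0][1] = [0.2, 0.7, 0.1]  # "h" again (repeat, merged by CTC)
char_map[1][3] = [0.1, 0.1, 0.8]  # "i"
char_map[1][4] = [0.2, 0.1, 0.7]  # "i" again (repeat, merged by CTC)

# Points follow the text center line; (0, 2) stays blank and separates the letters.
points = [(1, 0), (0, 1), (0, 2), (1, 3), (1, 4)]
text = ctc_greedy_decode(gather_points(char_map, points), ALPHABET)
print(text)  # hi
```

Because the vectors are read directly at the sampled points, the sketch needs no region cropping or box suppression step, which is the property the abstract attributes to the PG-CTC decoder.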
Related papers
- Adaptive Segmentation Network for Scene Text Detection [0.0]
We propose to automatically learn the discriminate segmentation threshold, which distinguishes text pixels from background pixels for segmentation-based scene text detectors.
Besides, we design a Global-information Enhanced Feature Pyramid Network (GE-FPN) for capturing text instances with macro size and extreme aspect ratios.
Finally, together with the proposed threshold learning strategy and text detection structure, we design an Adaptive Network (ASNet) for scene text detection.
arXiv Detail & Related papers (2023-07-27T17:37:56Z)
- LRANet: Towards Accurate and Efficient Scene Text Detection with
Low-Rank Approximation Network [63.554061288184165]
We propose a novel parameterized text shape method based on low-rank approximation.
By exploring the shape correlation among different text contours, our method achieves consistency, compactness, simplicity, and robustness in shape representation.
We implement an accurate and efficient arbitrary-shaped text detector named LRANet.
arXiv Detail & Related papers (2023-06-27T02:03:46Z)
- TextFormer: A Query-based End-to-End Text Spotter with Mixed Supervision [61.186488081379]
We propose TextFormer, a query-based end-to-end text spotter with Transformer architecture.
TextFormer builds upon an image encoder and a text decoder to learn a joint semantic understanding for multi-task modeling.
It allows for mutual training and optimization of classification, segmentation, and recognition branches, resulting in deeper feature sharing.
arXiv Detail & Related papers (2023-06-06T03:37:41Z)
- ABCNet v2: Adaptive Bezier-Curve Network for Real-time End-to-end Text
Spotting [108.93803186429017]
End-to-end text-spotting aims to integrate detection and recognition in a unified framework.
Here, we tackle end-to-end text spotting by presenting Adaptive Bezier Curve Network v2 (ABCNet v2).
Our main contributions are four-fold: 1) For the first time, we adaptively fit arbitrarily-shaped text by a parameterized Bezier curve, which, compared with segmentation-based methods, can not only provide structured output but also controllable representation.
Comprehensive experiments conducted on various bilingual (English and Chinese) benchmark datasets demonstrate that ABCNet v2 can achieve state-of-the
arXiv Detail & Related papers (2021-05-08T07:46:55Z)
- MANGO: A Mask Attention Guided One-Stage Scene Text Spotter [41.66707532607276]
We propose a novel Mask AttentioN Guided One-stage text spotting framework named MANGO.
The proposed method achieves competitive and even new state-of-the-art performance on both regular and irregular text spotting benchmarks.
arXiv Detail & Related papers (2020-12-08T10:47:49Z)
- ContourNet: Taking a Further Step toward Accurate Arbitrary-shaped Scene
Text Detection [147.10751375922035]
We propose the ContourNet, which effectively handles false positives and large scale variance of scene texts.
Our method effectively suppresses these false positives by only outputting predictions with high response value in both directions.
arXiv Detail & Related papers (2020-04-10T08:15:23Z)
- Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting [49.768327669098674]
We propose an end-to-end trainable text spotting approach named Text Perceptron.
It first employs an efficient segmentation-based text detector that learns the latent text reading order and boundary information.
Then a novel Shape Transform Module (abbr. STM) is designed to transform the detected feature regions into regular morphologies.
arXiv Detail & Related papers (2020-02-17T08:07:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.