RayNet: Real-time Scene Arbitrary-shape Text Detection with Multiple Rays
- URL: http://arxiv.org/abs/2104.04903v1
- Date: Sun, 11 Apr 2021 03:03:23 GMT
- Title: RayNet: Real-time Scene Arbitrary-shape Text Detection with Multiple Rays
- Authors: Chuang Yang, Mulin Chen, Qi Wang, and Xuelong Li
- Abstract summary: We propose a novel detection framework for arbitrary-shape text detection, termed RayNet.
RayNet uses a Center Point Set (CPS) and Ray Distance (RD) to fit text: the CPS determines the general position of the text, and the RD is combined with the CPS to compute Ray Points (RP) that localize the accurate shape of the text.
RayNet achieves impressive performance on an existing curved text dataset (CTW1500) and a quadrangle text dataset (ICDAR2015).
- Score: 84.15123599963239
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing object detection-based text detectors mainly concentrate on
detecting horizontal and multi-oriented text. However, they do not pay enough
attention to complex-shape text (curved or otherwise irregularly shaped text).
Recently, segmentation-based text detection methods have been introduced to
deal with complex-shape text; however, the pixel-level processing increases
the computational cost significantly. To further improve accuracy and
efficiency, we propose a novel detection framework for arbitrary-shape text
detection, termed RayNet. RayNet uses a Center Point Set (CPS) and Ray
Distance (RD) to fit text: the CPS determines the general position of the
text, and the RD is combined with the CPS to compute Ray Points (RP) that
localize the accurate shape of the text. Since the RP are disordered, we
develop the Ray Points Connection (RPC) algorithm to reorder them, which
significantly improves the detection performance on complex-shape text.
RayNet achieves impressive performance on an existing curved text dataset
(CTW1500) and a quadrangle text dataset (ICDAR2015), which demonstrates its
superiority over several state-of-the-art methods.
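The abstract does not give the exact ray parameterization or the RPC procedure, but the idea can be sketched. Below is a minimal, illustrative Python example that assumes each center point casts rays at evenly spaced angles with predicted distances, and that the disordered Ray Points can be reconnected by sorting them by polar angle around their centroid (a stand-in for RPC, not the authors' algorithm).
```python
import numpy as np

def rays_to_points(center, ray_distances):
    """Turn one center point and its predicted ray distances into boundary
    points, assuming rays are cast at evenly spaced angles (the abstract
    does not give RayNet's exact ray parameterization)."""
    angles = np.linspace(0.0, 2.0 * np.pi, len(ray_distances), endpoint=False)
    return np.stack([center[0] + np.cos(angles) * ray_distances,
                     center[1] + np.sin(angles) * ray_distances], axis=1)

def connect_ray_points(points):
    """Toy stand-in for the Ray Points Connection (RPC) step: order the
    disordered boundary points by polar angle around their centroid."""
    centroid = points.mean(axis=0)
    angles = np.arctan2(points[:, 1] - centroid[1], points[:, 0] - centroid[0])
    return points[np.argsort(angles)]

# One center point from the CPS with eight predicted ray distances (RD).
center = np.array([120.0, 40.0])
rd = np.array([18, 20, 22, 25, 22, 20, 18, 16], dtype=float)
polygon = connect_ray_points(rays_to_points(center, rd))  # ordered Ray Points
```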
Related papers
- Text Region Multiple Information Perception Network for Scene Text Detection [19.574306663095243]
This paper proposes a plug-and-play module called the Region Multiple Information Perception Module (RMIPM) to enhance the detection performance of segmentation-based algorithms.
Specifically, we design an improved module that can perceive various types of information about scene text regions, such as text foreground classification maps, distance maps, direction maps, etc.
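A rough PyTorch sketch of such a multi-branch perception head is shown below; the layer sizes and the name `RegionInfoHead` are assumptions for illustration, not RMIPM's actual design.
```python
import torch
import torch.nn as nn

class RegionInfoHead(nn.Module):
    """Illustrative multi-branch head in the spirit of RMIPM: from a shared
    feature map, predict a text foreground map, a distance map, and a
    direction map. All layer choices are assumptions, not the paper's."""
    def __init__(self, in_channels=256):
        super().__init__()
        def branch(out_channels):
            return nn.Sequential(
                nn.Conv2d(in_channels, 64, 3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(64, out_channels, 1),
            )
        self.foreground = branch(1)   # text / non-text probability
        self.distance = branch(1)     # distance to the text border
        self.direction = branch(2)    # unit vector toward the text center

    def forward(self, feats):
        return {
            "foreground": torch.sigmoid(self.foreground(feats)),
            "distance": self.distance(feats),
            "direction": self.direction(feats),
        }
```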
arXiv Detail & Related papers (2024-01-18T14:36:51Z)
- Adaptive Segmentation Network for Scene Text Detection [0.0]
We propose to automatically learn the discriminative segmentation threshold, which distinguishes text pixels from background pixels, for segmentation-based scene text detectors.
Besides, we design a Global-information Enhanced Feature Pyramid Network (GE-FPN) for capturing text instances with macro size and extreme aspect ratios.
Finally, together with the proposed threshold learning strategy and text detection structure, we design an Adaptive Segmentation Network (ASNet) for scene text detection.
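As an illustration of a learned, spatially varying threshold (the summary does not give ASNet's exact formulation), a common soft-binarization form compares the probability map against a predicted threshold map:
```python
import torch

def soft_binarize(prob_map, threshold_map, k=50.0):
    """Learned-threshold binarization sketch: compare each pixel's text
    probability with a predicted threshold and squash the difference with
    a steep sigmoid so the step stays differentiable during training.
    This is an assumed formulation, not necessarily ASNet's."""
    return torch.sigmoid(k * (prob_map - threshold_map))

prob = torch.rand(1, 1, 160, 160)        # per-pixel text probability
thresh = torch.full_like(prob, 0.3)      # would be a predicted map in practice
binary = soft_binarize(prob, thresh)     # approximate binary text map
```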
arXiv Detail & Related papers (2023-07-27T17:37:56Z)
- LRANet: Towards Accurate and Efficient Scene Text Detection with Low-Rank Approximation Network [63.554061288184165]
We propose a novel parameterized text shape method based on low-rank approximation.
By exploring the shape correlation among different text contours, our method achieves consistency, compactness, simplicity, and robustness in shape representation.
We implement an accurate and efficient arbitrary-shaped text detector named LRANet.
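One plausible reading of "low-rank approximation" for contour shapes, sketched below with NumPy: resample all training contours to the same number of points, compute a truncated SVD basis, and represent each contour by a handful of coefficients. The rank and normalization choices are assumptions, not LRANet's exact recipe.
```python
import numpy as np

def fit_contour_basis(contours, rank=8):
    """Low-rank shape representation sketch: stack flattened contours
    (each resampled to the same number of points), subtract the mean,
    and keep the top singular vectors as a compact shape basis."""
    X = np.stack([c.reshape(-1) for c in contours])      # (M, 2N)
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:rank]                               # basis: (rank, 2N)

def encode(contour, mean, basis):
    return basis @ (contour.reshape(-1) - mean)          # (rank,) coefficients

def decode(coeffs, mean, basis, num_points):
    return (mean + coeffs @ basis).reshape(num_points, 2)
```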
arXiv Detail & Related papers (2023-06-27T02:03:46Z)
- Video text tracking for dense and small text based on pp-yoloe-r and sort algorithm [0.9137554315375919]
The frame resolution of DSText is 1080 * 1920, and slicing a video frame into several areas would destroy the spatial correlation of the text.
For text detection, we adopt PP-YOLOE-R, which is proven effective in small object detection.
For text tracking, we use the SORT algorithm for its high inference speed.
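For reference, the core of SORT-style association is an IoU cost matrix solved with the Hungarian algorithm; the sketch below uses axis-aligned boxes and omits the Kalman motion model, so it is only an approximation of the pipeline described above.
```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(a, b):
    """Axis-aligned IoU for boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-6)

def associate(tracks, detections, iou_thresh=0.3):
    """SORT-style data association: build an IoU cost matrix between
    existing track boxes and new detection boxes (both lists of tuples),
    solve it with the Hungarian algorithm, and keep matches above the
    IoU threshold. Full SORT also predicts motion with a Kalman filter."""
    if not tracks or not detections:
        return []
    cost = np.array([[1.0 - iou(t, d) for d in detections] for t in tracks])
    rows, cols = linear_sum_assignment(cost)
    return [(r, c) for r, c in zip(rows, cols) if 1.0 - cost[r, c] >= iou_thresh]
```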
arXiv Detail & Related papers (2023-03-31T05:40:39Z)
- DPText-DETR: Towards Better Scene Text Detection with Dynamic Points in Transformer [94.35116535588332]
Transformer-based methods, which predict polygon points or Bezier curve control points to localize texts, are quite popular in scene text detection.
However, the commonly used point label form implies the human reading order, which affects the robustness of the Transformer model.
We propose DPText-DETR, which directly uses point coordinates as queries and dynamically updates them between decoder layers.
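A simplified sketch of the dynamic-point idea: tie each query to explicit (x, y) coordinates and let every decoder layer predict an offset that updates them. The module below is illustrative only and does not reproduce DPText-DETR's exact attention design.
```python
import torch
import torch.nn as nn

class PointRefineLayer(nn.Module):
    """Each query carries an explicit (x, y) coordinate; the layer attends
    to encoder memory and predicts a small offset that updates the point
    before the next decoder layer. A simplified illustration only."""
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.coord_embed = nn.Linear(2, dim)
        self.offset_head = nn.Linear(dim, 2)

    def forward(self, queries, memory, points):
        # queries: (B, N, dim), memory: (B, M, dim), points: (B, N, 2) in [0, 1]
        q = queries + self.coord_embed(points)
        out, _ = self.attn(q, memory, memory)
        points = (points + self.offset_head(out)).clamp(0.0, 1.0)
        return out, points
```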
arXiv Detail & Related papers (2022-07-10T15:45:16Z)
- Arbitrary Shape Text Detection using Transformers [2.294014185517203]
We propose an end-to-end trainable architecture for arbitrary-shaped text detection using Transformers (DETR).
At its core, our proposed method leverages a bounding box loss function that accurately measures changes in the scale and aspect ratio of arbitrarily shaped detected text regions.
We evaluate our proposed model using Total-Text and CTW-1500 datasets for curved text, and MSRA-TD500 and ICDAR15 datasets for multi-oriented text.
arXiv Detail & Related papers (2022-02-22T22:36:29Z)
- RSCA: Real-time Segmentation-based Context-Aware Scene Text Detection [14.125634725954848]
We propose RSCA: a Real-time Segmentation-based Context-Aware model for arbitrary-shaped scene text detection.
RSCA achieves state-of-the-art performance in both speed and accuracy, without complex label assignments or repeated feature aggregations.
arXiv Detail & Related papers (2021-05-26T18:43:17Z)
- PAN++: Towards Efficient and Accurate End-to-End Spotting of Arbitrarily-Shaped Text [85.7020597476857]
We propose an end-to-end text spotting framework, termed PAN++, which can efficiently detect and recognize text of arbitrary shapes in natural scenes.
PAN++ is based on the kernel representation that reformulates a text line as a text kernel (central region) surrounded by peripheral pixels.
As a pixel-based representation, the kernel representation can be predicted by a single fully convolutional network, which is very friendly to real-time applications.
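A minimal sketch of recovering full text regions from predicted kernels, assuming a binary text mask and an integer kernel label map; PAN++ additionally uses a learned pixel-similarity vector to guide the aggregation, which is omitted here.
```python
import numpy as np
from collections import deque

def expand_kernels(kernel_labels, text_mask):
    """Kernel-representation sketch: each text instance is predicted as a
    shrunken central kernel (labels 1..K, 0 = background); the full region
    is recovered by growing every kernel over the text mask via 4-connected BFS."""
    labels = kernel_labels.copy()
    h, w = labels.shape
    queue = deque(zip(*np.nonzero(labels)))
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and text_mask[ny, nx] and labels[ny, nx] == 0:
                labels[ny, nx] = labels[y, x]
                queue.append((ny, nx))
    return labels
```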
arXiv Detail & Related papers (2021-05-02T07:04:30Z)
- PGNet: Real-time Arbitrarily-Shaped Text Spotting with Point Gathering Network [54.03560668182197]
We propose a novel fully convolutional Point Gathering Network (PGNet) for reading arbitrarily-shaped text in real-time.
With a PG-CTC decoder, we gather high-level character classification vectors from two-dimensional space and decode them into text symbols without NMS and RoI operations.
Experiments prove that the proposed method achieves competitive accuracy, meanwhile significantly improving the running speed.
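A toy version of the point-gathering decoding step, assuming a per-pixel character classification map and a list of predicted center-line points; PGNet's actual PG-CTC decoder may differ in details.
```python
import numpy as np

def pg_ctc_decode(char_map, center_points, blank=0):
    """Point-gathering sketch: sample character classification vectors at
    the predicted center-line points, take the argmax at each point, then
    collapse repeats and drop blanks (greedy CTC decoding). No NMS or RoI
    operations are involved, mirroring the idea described above."""
    # char_map: (num_classes, H, W); center_points: [(y, x), ...] along the text line
    ids = [int(np.argmax(char_map[:, y, x])) for y, x in center_points]
    decoded, prev = [], blank
    for i in ids:
        if i != blank and i != prev:
            decoded.append(i)
        prev = i
    return decoded
```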
arXiv Detail & Related papers (2021-04-12T13:27:34Z)
- Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting [49.768327669098674]
We propose an end-to-end trainable text spotting approach named Text Perceptron.
It first employs an efficient segmentation-based text detector that learns the latent text reading order and boundary information.
Then a novel Shape Transform Module (abbr. STM) is designed to transform the detected feature regions into regular morphologies.
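The summary does not specify how STM performs the transformation; the sketch below shows one simple way to warp a curved region into a regular rectangle by blending boundary points into a sampling grid for grid_sample. It is illustrative only, not the paper's learned module.
```python
import torch
import torch.nn.functional as F

def rectify_region(feature_map, top_pts, bottom_pts, out_h=8, out_w=32):
    """Blend matched points on the top and bottom text boundaries into a
    sampling grid, then warp the curved region into a regular rectangle.
    feature_map: (1, C, H, W); top_pts / bottom_pts: (N, 2) points as
    (x, y) in [-1, 1], ordered left to right. A fixed-interpolation sketch."""
    boundaries = torch.stack([top_pts, bottom_pts]).permute(0, 2, 1)          # (2, 2, N)
    cols = F.interpolate(boundaries, size=out_w, mode="linear",
                         align_corners=True).permute(0, 2, 1)                 # (2, out_w, 2)
    t = torch.linspace(0.0, 1.0, out_h).view(out_h, 1, 1)
    grid = (1.0 - t) * cols[0] + t * cols[1]                                  # (out_h, out_w, 2)
    return F.grid_sample(feature_map, grid.unsqueeze(0), align_corners=True)  # (1, C, out_h, out_w)
```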
arXiv Detail & Related papers (2020-02-17T08:07:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.