Detection and Rectification of Arbitrary Shaped Scene Texts by using
Text Keypoints and Links
- URL: http://arxiv.org/abs/2103.00785v1
- Date: Mon, 1 Mar 2021 06:13:51 GMT
- Title: Detection and Rectification of Arbitrary Shaped Scene Texts by using
Text Keypoints and Links
- Authors: Chuhui Xue, Shijian Lu, Steven Hoi
- Abstract summary: Mask-guided multi-task network detects and rectifies scene texts of arbitrary shapes reliably.
Three types of keypoints are detected which specify the centre line and so the shape of text instances accurately.
Scene texts can be located and rectified by linking up the associated landmark points.
- Score: 38.71967078941593
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Detection and recognition of scene texts of arbitrary shapes remain a grand
challenge due to the super-rich text shape variation in text line orientations,
lengths, curvatures, etc. This paper presents a mask-guided multi-task network
that detects and rectifies scene texts of arbitrary shapes reliably. Three
types of keypoints are detected which specify the centre line and so the shape
of text instances accurately. In addition, four types of keypoint links are
detected of which the horizontal links associate the detected keypoints of each
text instance and the vertical links predict a pair of landmark points (for
each keypoint) along the upper and lower text boundary, respectively. Scene
texts can be located and rectified by linking up the associated landmark points
(giving localization polygon boxes) and transforming the polygon boxes via thin
plate spline, respectively. Extensive experiments over several public datasets
show that the use of text keypoints is tolerant to the variation in text
orientations, lengths, and curvatures, and it achieves superior scene text
detection and rectification performance as compared with state-of-the-art
methods.
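As a rough illustration of the final step described in the abstract (linking landmark points into a localization polygon, then rectifying it with a thin plate spline), here is a minimal NumPy sketch. It is not the authors' implementation: the function names, the synthetic arc-shaped landmark points, and the 200x20 target rectangle are all assumptions made for illustration, and the keypoint and link detection network itself is not modelled.

```python
import numpy as np

def polygon_from_landmarks(upper, lower):
    """Link landmarks into a closed polygon: upper boundary left to right,
    then lower boundary right to left."""
    return np.vstack([np.asarray(upper, float), np.asarray(lower, float)[::-1]])

def tps_kernel(r2):
    """Radial basis U = r^2 * log(r^2), defined as 0 at r = 0."""
    out = np.zeros_like(r2)
    mask = r2 > 0
    out[mask] = r2[mask] * np.log(r2[mask])
    return out

def fit_tps(src, dst):
    """Solve the standard TPS linear system mapping control points src -> dst."""
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    n = len(src)
    d2 = ((src[:, None, :] - src[None, :, :]) ** 2).sum(-1)
    P = np.hstack([np.ones((n, 1)), src])      # affine part: 1, x, y
    A = np.zeros((n + 3, n + 3))
    A[:n, :n] = tps_kernel(d2)
    A[:n, n:] = P
    A[n:, :n] = P.T
    b = np.zeros((n + 3, 2))
    b[:n] = dst
    return np.linalg.solve(A, b)               # n kernel weights + 3 affine rows

def apply_tps(params, src, pts):
    """Evaluate the fitted spline at arbitrary points."""
    src, pts = np.asarray(src, float), np.asarray(pts, float)
    d2 = ((pts[:, None, :] - src[None, :, :]) ** 2).sum(-1)
    P = np.hstack([np.ones((len(pts), 1)), pts])
    return tps_kernel(d2) @ params[:-3] + P @ params[-3:]

# Synthetic curved text (assumed data): 5 keypoints on an arc, each with one
# landmark on the upper boundary and one on the lower boundary.
theta = np.deg2rad(np.linspace(150, 30, 5))
centre = np.array([200.0, 200.0])
direction = np.stack([np.cos(theta), -np.sin(theta)], axis=1)  # image y points down
upper = centre + 110.0 * direction
lower = centre + 90.0 * direction

polygon = polygon_from_landmarks(upper, lower)  # localization polygon, 10 vertices

# Target: the same landmarks laid out on a straight 200x20 rectangle.
xs = np.linspace(0.0, 200.0, 5)
src = np.vstack([upper, lower])
dst = np.vstack([np.stack([xs, np.zeros(5)], axis=1),
                 np.stack([xs, np.full(5, 20.0)], axis=1)])

params = fit_tps(src, dst)
rectified = apply_tps(params, src, src)         # control points land on dst
```

To rectify an actual image, one would typically fit the spline in the opposite direction (rectified coordinates to source coordinates) and sample the source image at the mapped locations; the sketch only maps points.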
Related papers
- Hierarchical Text Spotter for Joint Text Spotting and Layout Analysis [52.01356859448068]
HTS can recognize text in an image and identify its 4-level hierarchical structure: characters, words, lines, and paragraphs.
HTS achieves state-of-the-art results on multiple word-level text spotting benchmark datasets as well as geometric layout analysis tasks.
arXiv Detail & Related papers (2023-10-25T22:23:54Z)
- Aggregated Text Transformer for Scene Text Detection [5.387121933662753]
We present the Aggregated Text TRansformer (ATTR), which is designed to represent texts in scene images with a multi-scale self-attention mechanism.
The multi-scale image representations are robust and contain rich information on text contents of various sizes.
The proposed method detects scene texts by representing each text instance as an individual binary mask, which is tolerant of curve texts and regions with dense instances.
arXiv Detail & Related papers (2022-11-25T09:47:34Z)
- Toward Understanding WordArt: Corner-Guided Transformer for Scene Text Recognition [63.6608759501803]
We propose to recognize artistic text at three levels.
Firstly, corner points are applied to guide the extraction of local features inside characters, considering the robustness of corner structures to appearance and shape.
Secondly, we design a character contrastive loss to model the character-level feature, improving the feature representation for character classification.
Thirdly, we utilize Transformer to learn the global feature on image-level and model the global relationship of the corner points.
arXiv Detail & Related papers (2022-07-31T14:11:05Z)
- DPText-DETR: Towards Better Scene Text Detection with Dynamic Points in Transformer [94.35116535588332]
Transformer-based methods, which predict polygon points or Bezier curve control points to localize texts, are quite popular in scene text detection.
However, the point label format used implies the human reading order, which affects the robustness of the Transformer model.
We propose DPText-DETR, which directly uses point coordinates as queries and dynamically updates them between decoder layers.
arXiv Detail & Related papers (2022-07-10T15:45:16Z)
- Text Spotting Transformers [29.970268691631333]
TESTR builds upon a single encoder and dual decoders for the joint text-box control point regression and character recognition.
We show our canonical representation of control points suitable for text instances in both Bezier curve and polygon annotations.
In addition, we design a bounding-box guided detection (box-to-polygon) process.
arXiv Detail & Related papers (2022-04-05T01:05:31Z)
- SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition [73.61592015908353]
We propose a new end-to-end scene text spotting framework termed SwinTextSpotter.
Using a transformer with dynamic head as the detector, we unify the two tasks with a novel Recognition Conversion mechanism.
The design results in a concise framework that requires neither additional rectification module nor character-level annotation.
arXiv Detail & Related papers (2022-03-19T01:14:42Z)
- DEER: Detection-agnostic End-to-End Recognizer for Scene Text Spotting [11.705454066278898]
We propose DEER, a novel Detection-agnostic End-to-End Recognizer framework.
The proposed method reduces the tight dependency between detection and recognition modules.
It achieves competitive results on regular and arbitrarily-shaped text spotting benchmarks.
arXiv Detail & Related papers (2022-03-10T02:41:05Z)
- Arbitrary Shape Text Detection using Transformers [2.294014185517203]
We propose an end-to-end trainable architecture for arbitrary-shaped text detection using Transformers (DETR).
At its core, our proposed method leverages a bounding box loss function that accurately measures the arbitrary detected text regions' changes in scale and aspect ratio.
We evaluate our proposed model using Total-Text and CTW-1500 datasets for curved text, and MSRA-TD500 and ICDAR15 datasets for multi-oriented text.
arXiv Detail & Related papers (2022-02-22T22:36:29Z)
- Scene Text Detection with Scribble Lines [59.698806258671105]
We propose to annotate texts by scribble lines instead of polygons for text detection.
It is a general labeling method for texts with various shapes and requires low labeling costs.
Experiments show that the proposed method bridges the performance gap between weak labeling methods and the original polygon-based labeling methods.
arXiv Detail & Related papers (2020-12-09T13:14:53Z)