ContourNet: Taking a Further Step toward Accurate Arbitrary-shaped Scene
Text Detection
- URL: http://arxiv.org/abs/2004.04940v1
- Date: Fri, 10 Apr 2020 08:15:23 GMT
- Title: ContourNet: Taking a Further Step toward Accurate Arbitrary-shaped Scene
Text Detection
- Authors: Yuxin Wang, Hongtao Xie, Zhengjun Zha, Mengting Xing, Zilong Fu and
Yongdong Zhang
- Abstract summary: We propose the ContourNet, which effectively handles false positives and large scale variance of scene texts.
Our method effectively suppresses these false positives by only outputting predictions with high response values in both orthogonal directions.
- Score: 147.10751375922035
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Scene text detection has witnessed rapid development in recent years.
However, two main challenges remain: 1) many methods suffer from
false positives in their text representations; 2) the large scale variance of
scene texts makes it hard for the network to learn samples. In this paper, we
propose the ContourNet, which effectively handles these two problems taking a
further step toward accurate arbitrary-shaped text detection. At first, a
scale-insensitive Adaptive Region Proposal Network (Adaptive-RPN) is proposed
to generate text proposals by only focusing on the Intersection over Union
(IoU) values between predicted and ground-truth bounding boxes. Then a novel
Local Orthogonal Texture-aware Module (LOTM) models the local texture
information of proposal features in two orthogonal directions and represents
text region with a set of contour points. Considering that strong
unidirectional or weakly orthogonal activation is usually caused by the
monotonous texture characteristic of false-positive patterns (e.g., streaks),
our method effectively suppresses these false positives by only outputting
predictions with high response values in both orthogonal directions. This gives
a more accurate description of text regions. Extensive experiments on three
challenging datasets (Total-Text, CTW1500 and ICDAR2015) verify that our method
achieves the state-of-the-art performance. Code is available at
https://github.com/wangyuxin87/ContourNet.
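The abstract's suppression rule can be illustrated with a small toy sketch. This is not the authors' code (see the repository above for that); it is a hypothetical NumPy illustration of the stated idea: given two response maps modeling texture along orthogonal directions, a contour point is kept only where both responses are high, so streak-like patterns that activate strongly in one direction only are discarded. The function name, threshold, and geometric-mean scoring are illustrative assumptions.

```python
import numpy as np

def suppress_false_positives(h_map, v_map, thresh=0.5):
    """Keep only points with high response in BOTH orthogonal directions.

    h_map, v_map : response maps for the two orthogonal texture directions.
    Returns a combined score map; points failing either threshold are zeroed.
    """
    # Point-wise AND: both directional responses must exceed the threshold.
    keep = (h_map >= thresh) & (v_map >= thresh)
    # Geometric mean rewards bidirectionally strong activations.
    return np.sqrt(h_map * v_map) * keep

# A streak-like false positive responds strongly in one direction only
# (e.g., h=0.9, v=0.1) and is suppressed; true text responds in both.
h = np.array([[0.9, 0.9],
              [0.9, 0.2]])
v = np.array([[0.8, 0.1],
              [0.7, 0.2]])
print(suppress_false_positives(h, v))
```

In this toy grid, only the two points with strong responses in both maps survive, which mirrors the abstract's claim that unidirectional activations are filtered out.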
Related papers
- LRANet: Towards Accurate and Efficient Scene Text Detection with
Low-Rank Approximation Network [63.554061288184165]
We propose a novel parameterized text shape method based on low-rank approximation.
By exploring the shape correlation among different text contours, our method achieves consistency, compactness, simplicity, and robustness in shape representation.
We implement an accurate and efficient arbitrary-shaped text detector named LRANet.
arXiv Detail & Related papers (2023-06-27T02:03:46Z) - SpaText: Spatio-Textual Representation for Controllable Image Generation [61.89548017729586]
SpaText is a new method for text-to-image generation using open-vocabulary scene control.
In addition to a global text prompt that describes the entire scene, the user provides a segmentation map.
We show its effectiveness on two state-of-the-art diffusion models: pixel-based and latent-based.
arXiv Detail & Related papers (2022-11-25T18:59:10Z) - Which and Where to Focus: A Simple yet Accurate Framework for
Arbitrary-Shaped Nearby Text Detection in Scene Images [8.180563824325086]
We propose a simple yet effective method for accurate arbitrary-shaped nearby scene text detection.
A One-to-Many Training Scheme (OMTS) is designed to eliminate confusion and enable the proposals to learn more appropriate ground truths.
We also propose a Proposal Feature Attention Module (PFAM) to exploit more effective features for each proposal.
arXiv Detail & Related papers (2021-09-08T06:25:37Z) - CentripetalText: An Efficient Text Instance Representation for Scene
Text Detection [19.69057252363207]
We propose an efficient text instance representation named CentripetalText (CT).
CT decomposes text instances into the combination of text kernels and centripetal shifts.
For the task of scene text detection, our approach achieves superior or competitive performance compared to other existing methods.
arXiv Detail & Related papers (2021-07-13T09:34:18Z) - AE TextSpotter: Learning Visual and Linguistic Representation for
Ambiguous Text Spotting [98.08853679310603]
This work proposes a novel text spotter, named Ambiguity Eliminating Text Spotter (AE TextSpotter).
AE TextSpotter learns both visual and linguistic features to significantly reduce ambiguity in text detection.
To our knowledge, this is the first work to improve text detection using a language model.
arXiv Detail & Related papers (2020-08-03T08:40:01Z) - Mask TextSpotter v3: Segmentation Proposal Network for Robust Scene Text
Spotting [71.6244869235243]
Most arbitrary-shape scene text spotters use region proposal networks (RPN) to produce proposals.
Our Mask TextSpotter v3 can handle text instances of extreme aspect ratios or irregular shapes, and its recognition accuracy is not affected by nearby text or background noise.
arXiv Detail & Related papers (2020-07-18T17:25:50Z) - FC2RN: A Fully Convolutional Corner Refinement Network for Accurate
Multi-Oriented Scene Text Detection [16.722639253025996]
A fully convolutional corner refinement network (FC2RN) is proposed for accurate multi-oriented text detection.
With a novel quadrilateral RoI convolution operation tailored for multi-oriented scene text, the initial quadrilateral prediction is encoded into the feature maps.
arXiv Detail & Related papers (2020-07-10T00:04:24Z) - Text Recognition -- Real World Data and Where to Find Them [36.10220484561196]
We present a method for exploiting weakly annotated images to improve text extraction pipelines.
The approach uses an arbitrary end-to-end text recognition system to obtain text region proposals and their possibly erroneous transcriptions.
It produces nearly error-free, localised instances of scene text, which we treat as "pseudo ground truth" (PGT).
arXiv Detail & Related papers (2020-07-06T22:23:27Z) - Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting [49.768327669098674]
We propose an end-to-end trainable text spotting approach named Text Perceptron.
It first employs an efficient segmentation-based text detector that learns the latent text reading order and boundary information.
Then a novel Shape Transform Module (abbr. STM) is designed to transform the detected feature regions into regular morphologies.
arXiv Detail & Related papers (2020-02-17T08:07:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.