Bidirectional Regression for Arbitrary-Shaped Text Detection
- URL: http://arxiv.org/abs/2107.06129v1
- Date: Tue, 13 Jul 2021 14:29:09 GMT
- Title: Bidirectional Regression for Arbitrary-Shaped Text Detection
- Authors: Tao Sheng, Zhouhui Lian
- Abstract summary: This paper presents a novel text instance expression which integrates both foreground and background information into the pipeline.
A corresponding post-processing algorithm is also designed to sequentially combine the four prediction results and reconstruct the text instance accurately.
We evaluate our method on several challenging scene text benchmarks, including both curved and multi-oriented text datasets.
- Score: 16.30976392505236
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Arbitrary-shaped text detection has recently attracted increasing interest
and witnessed rapid development with the popularity of deep learning
algorithms. Nevertheless, existing approaches often obtain inaccurate detection
results, mainly due to the relatively weak ability to utilize context
information and the inappropriate choice of offset references. This paper
presents a novel text instance expression which integrates both foreground and
background information into the pipeline, and naturally uses the pixels near
text boundaries as the offset starts. Besides, a corresponding post-processing
algorithm is also designed to sequentially combine the four prediction results
and reconstruct the text instance accurately. We evaluate our method on several
challenging scene text benchmarks, including both curved and multi-oriented
text datasets. Experimental results demonstrate that the proposed approach
obtains superior or competitive performance compared to other state-of-the-art
methods, e.g., 83.4% F-score for Total-Text, 82.4% F-score for MSRA-TD500, etc.
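The F-scores quoted in the abstract are, by the usual convention in scene text detection benchmarks, the harmonic mean of precision and recall. A minimal sketch of that formula follows; the function name `f_score` and the precision and recall values are hypothetical illustrations, not numbers or code from the paper.

```python
def f_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (the F-score, or F1)."""
    if precision + recall == 0:
        # Both are zero, so the harmonic mean is defined as zero here.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical precision/recall values for illustration only;
# they are not taken from the paper's result tables.
example = f_score(0.85, 0.80)
print(f"F-score: {example:.3f}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a detector cannot reach a high F-score by maximizing precision or recall alone.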
Related papers
- Efficiently Leveraging Linguistic Priors for Scene Text Spotting [63.22351047545888]
This paper proposes a method that leverages linguistic knowledge from a large text corpus to replace the traditional one-hot encoding used in auto-regressive scene text spotting and recognition models.
We generate text distributions that align well with scene text datasets, removing the need for in-domain fine-tuning.
Experimental results show that our method not only improves recognition accuracy but also enables more accurate localization of words.
arXiv Detail & Related papers (2024-02-27T01:57:09Z)
- LRANet: Towards Accurate and Efficient Scene Text Detection with Low-Rank Approximation Network [63.554061288184165]
We propose a novel parameterized text shape method based on low-rank approximation.
By exploring the shape correlation among different text contours, our method achieves consistency, compactness, simplicity, and robustness in shape representation.
We implement an accurate and efficient arbitrary-shaped text detector named LRANet.
arXiv Detail & Related papers (2023-06-27T02:03:46Z)
- TextFormer: A Query-based End-to-End Text Spotter with Mixed Supervision [61.186488081379]
We propose TextFormer, a query-based end-to-end text spotter with Transformer architecture.
TextFormer builds upon an image encoder and a text decoder to learn a joint semantic understanding for multi-task modeling.
It allows for mutual training and optimization of classification, segmentation, and recognition branches, resulting in deeper feature sharing.
arXiv Detail & Related papers (2023-06-06T03:37:41Z)
- Arbitrary Shape Text Detection using Transformers [2.294014185517203]
We propose an end-to-end trainable architecture for arbitrary-shaped text detection using Transformers (DETR).
At its core, our proposed method leverages a bounding box loss function that accurately measures the arbitrary detected text regions' changes in scale and aspect ratio.
We evaluate our proposed model using Total-Text and CTW-1500 datasets for curved text, and MSRA-TD500 and ICDAR15 datasets for multi-oriented text.
arXiv Detail & Related papers (2022-02-22T22:36:29Z)
- Comprehensive Studies for Arbitrary-shape Scene Text Detection [78.50639779134944]
We propose a unified framework for the bottom-up based scene text detection methods.
Under the unified framework, we ensure the consistent settings for non-core modules.
Comprehensive investigations and detailed analyses reveal the advantages and disadvantages of previous models.
arXiv Detail & Related papers (2021-07-25T13:18:55Z)
- RSCA: Real-time Segmentation-based Context-Aware Scene Text Detection [14.125634725954848]
We propose RSCA: a Real-time Segmentation-based Context-Aware model for arbitrary-shaped scene text detection.
Based on these strategies, RSCA achieves state-of-the-art performance in both speed and accuracy, without complex label assignments or repeated feature aggregations.
arXiv Detail & Related papers (2021-05-26T18:43:17Z)
- MOST: A Multi-Oriented Scene Text Detector with Localization Refinement [67.35280008722255]
We propose a new algorithm for scene text detection, which puts forward a set of strategies to significantly improve the quality of text localization.
Specifically, a Text Feature Alignment Module (TFAM) is proposed to dynamically adjust the receptive fields of features.
A Position-Aware Non-Maximum Suppression (PA-NMS) module is devised to exclude unreliable detections.
arXiv Detail & Related papers (2021-04-02T14:34:41Z)
- FC2RN: A Fully Convolutional Corner Refinement Network for Accurate Multi-Oriented Scene Text Detection [16.722639253025996]
A fully convolutional corner refinement network (FC2RN) is proposed for accurate multi-oriented text detection.
With a novel quadrilateral RoI convolution operation tailored for multi-oriented scene text, the initial quadrilateral prediction is encoded into the feature maps.
arXiv Detail & Related papers (2020-07-10T00:04:24Z)
- Text Recognition -- Real World Data and Where to Find Them [36.10220484561196]
We present a method for exploiting weakly annotated images to improve text extraction pipelines.
The approach uses an arbitrary end-to-end text recognition system to obtain text region proposals and their, possibly erroneous, transcriptions.
It produces nearly error-free, localised instances of scene text, which we treat as "pseudo ground truth" (PGT).
arXiv Detail & Related papers (2020-07-06T22:23:27Z)
- Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting [49.768327669098674]
We propose an end-to-end trainable text spotting approach named Text Perceptron.
It first employs an efficient segmentation-based text detector that learns the latent text reading order and boundary information.
Then a novel Shape Transform Module (abbr. STM) is designed to transform the detected feature regions into regular morphologies.
arXiv Detail & Related papers (2020-02-17T08:07:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.