Arbitrary Shape Text Detection via Boundary Transformer
- URL: http://arxiv.org/abs/2205.05320v4
- Date: Tue, 20 Jun 2023 03:00:29 GMT
- Title: Arbitrary Shape Text Detection via Boundary Transformer
- Authors: Shi-Xue Zhang, Chun Yang, Xiaobin Zhu, Xu-Cheng Yin
- Abstract summary: We present a unified coarse-to-fine framework via boundary learning for arbitrary shape text detection.
We explicitly model the text boundary via an innovative iterative boundary transformer in a coarse-to-fine manner.
Our method can directly gain accurate text boundaries and abandon complex post-processing to improve efficiency.
- Score: 18.229219867056347
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In arbitrary shape text detection, locating accurate text boundaries is
challenging and non-trivial. Existing methods often suffer from indirect text
boundary modeling or complex post-processing. In this paper, we systematically
present a unified coarse-to-fine framework via boundary learning for arbitrary
shape text detection, which can accurately and efficiently locate text
boundaries without post-processing. In our method, we explicitly model the text
boundary via an innovative iterative boundary transformer in a coarse-to-fine
manner. In this way, our method can directly gain accurate text boundaries and
abandon complex post-processing to improve efficiency. Specifically, our method
mainly consists of a feature extraction backbone, a boundary proposal module,
and an iteratively optimized boundary transformer module. The boundary proposal
module consisting of multi-layer dilated convolutions will compute important
prior information (including classification map, distance field, and direction
field) for generating coarse boundary proposals while guiding the boundary
transformer's optimization. The boundary transformer module adopts an
encoder-decoder structure, in which the encoder is constructed by multi-layer
transformer blocks with residual connection while the decoder is a simple
multi-layer perceptron network (MLP). Under the guidance of prior information,
the boundary transformer module will gradually refine the coarse boundary
proposals via iterative boundary deformation. Furthermore, we propose a novel
boundary energy loss (BEL) which introduces an energy minimization constraint
and an energy monotonically decreasing constraint to further optimize and
stabilize the learning of boundary refinement. Extensive experiments on
publicly available and challenging datasets demonstrate the state-of-the-art
performance and promising efficiency of our method.
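
The coarse-to-fine idea in the abstract can be illustrated with a small toy sketch. It is not the authors' implementation: the learned transformer encoder and MLP decoder are replaced by a hand-written stand-in (`toy_offset_predictor`), the prior information is replaced by a toy target circle, and `energy_penalty` is only one assumed way to combine an energy minimization term with a monotonic-decrease term. All function names and parameters here are hypothetical.

```python
import numpy as np


def make_coarse_boundary(center, radius, n_points=20):
    """Hypothetical coarse proposal: n_points control points on a circle."""
    angles = np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False)
    ring = np.stack([np.cos(angles), np.sin(angles)], axis=1)
    return center + radius * ring


def toy_energy(points, center, target_radius):
    """Toy energy (assumption): mean distance of the points to a target circle."""
    dist = np.linalg.norm(points - center, axis=1)
    return float(np.mean(np.abs(dist - target_radius)))


def toy_offset_predictor(points, center, target_radius, step=0.5):
    """Stand-in for the learned encoder + MLP decoder (assumption).

    Moves every control point a fraction `step` of the way toward the
    target circle along the radial direction.
    """
    vec = points - center
    dist = np.linalg.norm(vec, axis=1, keepdims=True)
    return step * (target_radius - dist) * (vec / dist)


def refine(points, center, target_radius, n_iters=3):
    """Iterative boundary deformation: add predicted offsets at each step."""
    energies = [toy_energy(points, center, target_radius)]
    for _ in range(n_iters):
        points = points + toy_offset_predictor(points, center, target_radius)
        energies.append(toy_energy(points, center, target_radius))
    return points, energies


def energy_penalty(energies):
    """Assumed loss shape, not the paper's formula.

    Sums the energy after each refinement step (minimization term) and adds
    a hinge penalty whenever the energy rises between consecutive steps
    (monotonic-decrease term).
    """
    e = np.asarray(energies, dtype=float)
    minimization = e[1:].sum()
    monotonic = np.maximum(0.0, e[1:] - e[:-1]).sum()
    return float(minimization + monotonic)


center = np.array([50.0, 50.0])
coarse = make_coarse_boundary(center, radius=30.0)
refined, energies = refine(coarse, center, target_radius=20.0)
print(energies, energy_penalty(energies))
```

In this toy setup the energy halves at every iteration, so the hinge term stays at zero. A predictor that overshoots would make the energy rise at some step and be penalized by the second term.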
Related papers
- CT-Net: Arbitrary-Shaped Text Detection via Contour Transformer [19.269070203448187]
We propose a novel arbitrary-shaped scene text detection framework named CT-Net by progressive contour regression with contour transformers.
CT-Net achieves F-measure of 86.1 at 11.2 frames per second (FPS) and F-measure of 87.8 at 10.1 FPS for CTW1500 and Total-Text datasets, respectively.
arXiv Detail & Related papers (2023-07-25T08:00:40Z)
- SegT: A Novel Separated Edge-guidance Transformer Network for Polyp Segmentation [10.144870911523622]
We propose a novel separated edge-guidance transformer (SegT) network that aims to build an effective polyp segmentation model.
A transformer encoder that learns a more robust representation than existing CNN-based approaches was specifically applied.
To evaluate the effectiveness of SegT, we conducted experiments with five challenging public datasets.
arXiv Detail & Related papers (2023-06-19T08:32:05Z)
- An Extensible Plug-and-Play Method for Multi-Aspect Controllable Text Generation [70.77243918587321]
Multi-aspect controllable text generation that controls generated text in multiple aspects has attracted increasing attention.
We provide a theoretical lower bound for the interference and empirically find that the interference grows with the number of layers where prefixes are inserted.
We propose using trainable gates to normalize the intervention of prefixes to restrain the growing interference.
arXiv Detail & Related papers (2022-12-19T11:53:59Z)
- Zero Pixel Directional Boundary by Vector Transform [77.63061686394038]
We re-interpret boundaries as 1-D surfaces and formulate a one-to-one vector transform function that allows for training of boundary prediction completely avoiding the class imbalance issue.
Our problem formulation leads to the estimation of direction as well as richer contextual information of the boundary, and, if desired, the availability of zero-pixel thin boundaries also at training time.
arXiv Detail & Related papers (2022-03-16T17:55:31Z)
- Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion [62.269219152425556]
Segmentation-based scene text detection methods have drawn extensive attention in the scene text detection field.
We propose a Differentiable Binarization (DB) module that integrates the binarization process into a segmentation network.
An efficient Adaptive Scale Fusion (ASF) module is proposed to improve the scale robustness by fusing features of different scales adaptively.
arXiv Detail & Related papers (2022-02-21T15:30:14Z)
- Adaptive Boundary Proposal Network for Arbitrary Shape Text Detection [18.491440228386313]
We propose a novel adaptive boundary proposal network for arbitrary shape text detection.
Our method can learn to directly produce accurate boundary for arbitrary shape text without any post-processing.
arXiv Detail & Related papers (2021-07-27T08:25:24Z)
- BoundarySqueeze: Image Segmentation as Boundary Squeezing [104.43159799559464]
We propose a novel method for fine-grained high-quality image segmentation of both objects and scenes.
Inspired by dilation and erosion from morphological image processing techniques, we treat pixel-level segmentation problems as squeezing the object boundary.
Our method yields large gains on COCO and Cityscapes for both instance and semantic segmentation, and outperforms the previous state-of-the-art PointRend in both accuracy and speed under the same setting.
arXiv Detail & Related papers (2021-05-25T04:58:51Z)
- Active Boundary Loss for Semantic Segmentation [58.72057610093194]
This paper proposes a novel active boundary loss for semantic segmentation.
It can progressively encourage the alignment between predicted boundaries and ground-truth boundaries during end-to-end training.
Experimental results show that training with the active boundary loss can effectively improve the boundary F-score and mean Intersection-over-Union.
arXiv Detail & Related papers (2021-02-04T15:47:54Z)
- Think about boundary: Fusing multi-level boundary information for landmark heatmap regression [51.48533538153833]
We study a two-stage but end-to-end approach for exploring the relationship between the facial boundary and landmarks.
We obtain boundary-aware landmark predictions with an approach that consists of two modules: the self-calibrated boundary estimation (SCBE) module and the boundary-aware landmark transform (BALT) module.
Our approach outperforms state-of-the-art methods in the literature.
arXiv Detail & Related papers (2020-08-25T10:14:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.