Edge Approximation Text Detector
- URL: http://arxiv.org/abs/2504.04001v1
- Date: Sat, 05 Apr 2025 00:12:51 GMT
- Title: Edge Approximation Text Detector
- Authors: Chuang Yang, Xu Han, Tao Han, Han Han, Bingxuan Zhao, Qi Wang
- Abstract summary: We introduce EdgeText to fit text contours compactly while alleviating excessive contour rebuilding processes. Inspired by this observation, EdgeText formulates the text representation as the edge approximation problem via parameterized curve fitting functions. Considering the deep dependency of EdgeText on text edges, a bilateral enhanced perception (BEP) module is designed.
- Score: 15.968342484512325
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pursuing efficient text shape representations helps scene text detection models focus on compact foreground regions and optimize the contour reconstruction steps to simplify the whole detection pipeline. Current approaches either represent irregular shapes via a box-to-polygon strategy or decompose a contour into pieces that are fitted gradually; as a result, these models suffer from coarse contours or complex pipelines. Considering these issues, we introduce EdgeText to fit text contours compactly while alleviating excessive contour rebuilding. Concretely, we observe that the two long edges of a text instance can be regarded as smooth curves. This allows us to build contours from continuous, smooth edges that cover text regions tightly instead of fitting them piecewise, which avoids both limitations of current models. Inspired by this observation, EdgeText formulates text representation as an edge approximation problem solved via parameterized curve fitting functions. In the inference stage, our model first locates text centers and then creates curve functions that approximate the text edges based on these points. Meanwhile, truncation points are determined from the location features. Finally, curve segments are extracted from the curve functions using the pixel coordinates of the truncation points to reconstruct the text contours. Furthermore, considering EdgeText's deep dependency on text edges, a bilateral enhanced perception (BEP) module is designed to encourage the model to attend to edge features. Additionally, to accelerate the learning of the curve function parameters, we introduce a proportional integral loss (PI-loss) that forces the model to focus on the curve distribution rather than being disturbed by text scales.
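To make the inference pipeline above concrete (locate centers, fit one curve per long edge, truncate, reconstruct the contour), the following is a minimal NumPy sketch of the general idea. The abstract does not specify the curve family, the truncation-point prediction, or the PI-loss, so the sketch stands in cubic polynomials for the parameterized curve fitting functions and takes the truncation abscissae as given; all names are illustrative, not the authors' implementation.

```python
import numpy as np

def fit_edge_curve(points, degree=3):
    """Fit a parameterized curve x -> y to sampled points on one long text edge.
    A cubic polynomial is an assumed stand-in for the paper's curve fitting functions."""
    x, y = points[:, 0], points[:, 1]
    return np.poly1d(np.polyfit(x, y, degree))

def reconstruct_contour(top_points, bottom_points, x_start, x_end, n_samples=20):
    """Rebuild a closed text contour from two fitted edge curves.
    x_start / x_end play the role of the truncation points bounding the curve segments."""
    top_curve = fit_edge_curve(top_points)
    bottom_curve = fit_edge_curve(bottom_points)
    xs = np.linspace(x_start, x_end, n_samples)
    top = np.stack([xs, top_curve(xs)], axis=1)               # left -> right along the top edge
    bottom = np.stack([xs, bottom_curve(xs)], axis=1)[::-1]   # right -> left along the bottom edge
    return np.concatenate([top, bottom], axis=0)              # closed polygon, shape (2*n_samples, 2)

# Toy usage: a gently curved text instance.
xs = np.linspace(0, 100, 15)
top_pts = np.stack([xs, 10 + 5 * np.sin(xs / 30)], axis=1)
bot_pts = np.stack([xs, 40 + 5 * np.sin(xs / 30)], axis=1)
contour = reconstruct_contour(top_pts, bot_pts, x_start=5, x_end=95)
print(contour.shape)  # (40, 2)
```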
Related papers
- PBFormer: Capturing Complex Scene Text Shape with Polynomial Band Transformer [28.52028534365144]
We present PBFormer, an efficient yet powerful scene text detector.
It unifies a transformer with a novel text shape representation, the Polynomial Band (PB)
The simple operation can help detect small-scale texts.
arXiv Detail & Related papers (2023-08-29T03:41:27Z)
- LRANet: Towards Accurate and Efficient Scene Text Detection with Low-Rank Approximation Network [63.554061288184165]
We propose a novel parameterized text shape method based on low-rank approximation.
By exploring the shape correlation among different text contours, our method achieves consistency, compactness, simplicity, and robustness in shape representation.
We implement an accurate and efficient arbitrary-shaped text detector named LRANet (a minimal sketch of the low-rank idea follows this entry).
arXiv Detail & Related papers (2023-06-27T02:03:46Z)
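The LRANet entry above represents text shapes by low-rank approximation. Below is a minimal NumPy sketch of that general technique (toy data, illustrative names, not LRANet's implementation): training contours are stacked into a matrix, an SVD yields a small shared basis of "eigen-contours", and each text contour is then described by only a few coefficients.

```python
import numpy as np

# Each training contour is sampled at K points and flattened to a 2K-dim vector,
# giving a matrix A of shape (N, 2K). Toy random data stands in for real annotations.
rng = np.random.default_rng(0)
K, N, r = 16, 200, 4                      # points per contour, number of contours, rank
A = rng.normal(size=(N, 2 * K))

# Low-rank basis via SVD: the top-r right singular vectors act as shared "eigen-contours".
mean = A.mean(axis=0)
U, S, Vt = np.linalg.svd(A - mean, full_matrices=False)
basis = Vt[:r]                            # (r, 2K)

# A detector in this style only regresses r coefficients per text instance;
# the full contour is recovered by projecting back through the basis.
contour = A[0]
coeffs = (contour - mean) @ basis.T       # (r,)
approx = mean + coeffs @ basis            # rank-r reconstruction of the contour
print(np.linalg.norm(contour - approx))   # approximation error (large here: random toy data)
```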
- Deformable Kernel Expansion Model for Efficient Arbitrary-shaped Scene Text Detection [15.230957275277762]
We propose a scene text detector named Deformable Kernel Expansion (DKE)
DKE employs a segmentation module to segment the shrunken text region as the text kernel, then expands the text kernel contour to obtain the text boundary.
Experiments on CTW1500, Total-Text, MSRA-TD500, and ICDAR2015 demonstrate that DKE achieves a good tradeoff between accuracy and efficiency in scene text detection.
arXiv Detail & Related papers (2023-03-28T05:18:58Z)
- Text Spotting Transformers [29.970268691631333]
TESTR builds upon a single encoder and dual decoders for the joint text-box control point regression and character recognition.
We show our canonical representation of control points is suitable for text instances in both Bezier curve and polygon annotations.
In addition, we design a bounding-box guided detection (box-to-polygon) process.
arXiv Detail & Related papers (2022-04-05T01:05:31Z)
- Arbitrary Shape Text Detection using Transformers [2.294014185517203]
We propose an end-to-end trainable architecture for arbitrary-shaped text detection using Transformers (DETR)
At its core, our proposed method leverages a bounding box loss function that accurately measures the arbitrary detected text regions' changes in scale and aspect ratio.
We evaluate our proposed model using Total-Text and CTW-1500 datasets for curved text, and MSRA-TD500 and ICDAR15 datasets for multi-oriented text.
arXiv Detail & Related papers (2022-02-22T22:36:29Z)
- Polygonal Point Set Tracking [50.445151155209246]
We propose a novel learning-based polygonal point set tracking method.
Our goal is to track corresponding points on the target contour.
We present visual-effects applications of our method on part distortion and text mapping.
arXiv Detail & Related papers (2021-05-30T17:12:36Z)
- ABCNet v2: Adaptive Bezier-Curve Network for Real-time End-to-end Text Spotting [108.93803186429017]
End-to-end text-spotting aims to integrate detection and recognition in a unified framework.
Here, we tackle end-to-end text spotting by presenting Adaptive Bezier Curve Network v2 (ABCNet v2)
Our main contributions are four-fold: 1) For the first time, we adaptively fit arbitrarily-shaped text with a parameterized Bezier curve, which, compared with segmentation-based methods, provides not only structured output but also a controllable representation.
Comprehensive experiments conducted on various bilingual (English and Chinese) benchmark datasets demonstrate that ABCNet v2 can achieve state-of-the-art performance (a minimal Bezier-curve sketch follows this entry).
arXiv Detail & Related papers (2021-05-08T07:46:55Z)
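The ABCNet v2 entry above fits text with parameterized Bezier curves, learned end-to-end. The sketch below only shows the parameterization itself: how two cubic Bezier curves (eight control points, toy values here) are sampled into a closed text boundary. It is illustrative, not the authors' code.

```python
import numpy as np

def cubic_bezier(control_points, n_samples=20):
    """Sample a cubic Bezier curve from 4 control points, shape (4, 2)."""
    t = np.linspace(0.0, 1.0, n_samples)[:, None]
    p0, p1, p2, p3 = control_points
    return ((1 - t) ** 3 * p0 + 3 * (1 - t) ** 2 * t * p1
            + 3 * (1 - t) * t ** 2 * p2 + t ** 3 * p3)

# One cubic Bezier per long side of the text instance (toy control points).
top_ctrl = np.array([[0, 10], [30, 0], [60, 0], [100, 10]], dtype=float)
bot_ctrl = np.array([[100, 40], [60, 30], [30, 30], [0, 40]], dtype=float)  # right -> left

boundary = np.concatenate([cubic_bezier(top_ctrl), cubic_bezier(bot_ctrl)])
print(boundary.shape)  # (40, 2): a closed polygon around the text
```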
- SOLD2: Self-supervised Occlusion-aware Line Description and Detection [95.8719432775724]
We introduce the first joint detection and description of line segments in a single deep network.
Our method does not require any annotated line labels and can therefore generalize to any dataset.
We evaluate our approach against previous line detection and description methods on several multi-view datasets.
arXiv Detail & Related papers (2021-04-07T19:27:17Z)
- Quantization in Relative Gradient Angle Domain For Building Polygon Estimation [88.80146152060888]
CNN approaches often generate imprecise building morphologies including noisy edges and round corners.
We propose a module that uses prior knowledge of building corners to create angular and concise building polygons from CNN segmentation outputs.
Experimental results demonstrate that our method refines CNN output from a rounded approximation to a more clear-cut angular shape of the building footprint.
arXiv Detail & Related papers (2020-07-10T21:33:06Z)
- Deep Hough Transform for Semantic Line Detection [70.28969017874587]
We focus on a fundamental task of detecting meaningful line structures, a.k.a. semantic lines, in natural scenes.
Previous methods neglect the inherent characteristics of lines, leading to sub-optimal performance.
We propose a one-shot end-to-end learning framework for line detection (a minimal Hough-parameterization sketch follows this entry).
arXiv Detail & Related papers (2020-03-10T13:08:42Z)
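The Deep Hough Transform entry above accumulates learned features along candidate lines; the classical parameterization it builds on can be sketched in a few lines. The snippet below is a plain (non-deep) Hough accumulator over a binary edge map, shown only to illustrate the (theta, rho) line space; it is not the paper's method.

```python
import numpy as np

def hough_accumulate(edge_map, n_theta=180, n_rho=200):
    """Vote edge pixels into a (theta, rho) accumulator using rho = x*cos(theta) + y*sin(theta)."""
    h, w = edge_map.shape
    thetas = np.linspace(0, np.pi, n_theta, endpoint=False)
    rho_max = np.hypot(h, w)
    acc = np.zeros((n_theta, n_rho))
    ys, xs = np.nonzero(edge_map)
    for ti, theta in enumerate(thetas):
        rhos = xs * np.cos(theta) + ys * np.sin(theta)
        bins = ((rhos + rho_max) / (2 * rho_max) * (n_rho - 1)).astype(int)
        np.add.at(acc[ti], bins, 1)       # acc[ti] is a view, so votes land in acc
    return acc, thetas

# Toy usage: the main diagonal of an image shows up as a single accumulator peak.
edges = np.eye(64, dtype=bool)
acc, thetas = hough_accumulate(edges)
ti, ri = np.unravel_index(acc.argmax(), acc.shape)
print("dominant line angle (degrees):", round(float(np.degrees(thetas[ti]))))  # 135
```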
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.