Progressive Evolution from Single-Point to Polygon for Scene Text
- URL: http://arxiv.org/abs/2312.13778v3
- Date: Fri, 10 May 2024 14:01:07 GMT
- Title: Progressive Evolution from Single-Point to Polygon for Scene Text
- Authors: Linger Deng, Mingxin Huang, Xudong Xie, Yuliang Liu, Lianwen Jin, Xiang Bai
- Abstract summary: We introduce Point2Polygon, which can efficiently transform single points into compact polygons.
Our method uses a coarse-to-fine process, starting with creating anchor points based on recognition confidence, then vertically and horizontally refining the polygon.
Training detectors with polygons generated by our method attains 86% of the accuracy achieved when training with ground truth (GT); additionally, the proposed Point2Polygon can be seamlessly integrated to empower single-point spotters to generate polygons.
- Score: 79.29097971932529
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The advancement of text shape representations towards compactness has enhanced text detection and spotting performance, but at a high annotation cost. Current models use single-point annotations to reduce costs, yet they lack sufficient localization information for downstream applications. To overcome this limitation, we introduce Point2Polygon, which can efficiently transform single-points into compact polygons. Our method uses a coarse-to-fine process, starting with creating and selecting anchor points based on recognition confidence, then vertically and horizontally refining the polygon using recognition information to optimize its shape. We demonstrate the accuracy of the generated polygons through extensive experiments: 1) By creating polygons from ground truth points, we achieved an accuracy of 82.0% on ICDAR 2015; 2) In training detectors with polygons generated by our method, we attained 86% of the accuracy relative to training with ground truth (GT); 3) Additionally, the proposed Point2Polygon can be seamlessly integrated to empower single-point spotters to generate polygons. This integration led to an impressive 82.5% accuracy for the generated polygons. It is worth mentioning that our method relies solely on synthetic recognition information, eliminating the need for any manual annotation beyond single points.
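The coarse-to-fine idea in the abstract (select anchor points by recognition confidence, then refine the polygon around them) can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: the `confidence` callable stands in for the paper's synthetic recognition model, and the "refinement" step is simplified to the tight bounding box of the retained anchors.

```python
def point_to_polygon(center, candidates, confidence, threshold=0.5):
    """Expand a single annotated point into a polygon (illustrative sketch).

    center:     the single-point annotation (x, y)
    candidates: candidate anchor points around the center
    confidence: callable scoring each anchor via recognition confidence
    threshold:  anchors scoring below this are discarded
    """
    # Coarse stage: keep only anchors the recognizer is confident about.
    kept = [p for p in candidates if confidence(p) >= threshold]
    if not kept:
        kept = [center]  # fall back to the annotated point itself

    # Fine stage (simplified): refine horizontally and vertically by taking
    # the tight axis-aligned extent of the surviving anchors. The paper
    # instead optimizes the shape using recognition feedback.
    xs = [p[0] for p in kept]
    ys = [p[1] for p in kept]
    return [(min(xs), min(ys)), (max(xs), min(ys)),
            (max(xs), max(ys)), (min(xs), max(ys))]
```

For example, with candidate anchors spread around a text instance and a confidence function that scores distant outliers low, the returned quadrilateral hugs only the high-confidence anchors.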
Related papers
- ConDaFormer: Disassembled Transformer with Local Structure Enhancement for 3D Point Cloud Understanding [105.98609765389895]
Transformers have been recently explored for 3D point cloud understanding.
A large number of points, over 0.1 million, make the global self-attention infeasible for point cloud data.
In this paper, we develop a new transformer block, named ConDaFormer.
arXiv Detail & Related papers (2023-12-18T11:19:45Z) - DPPD: Deformable Polar Polygon Object Detection [3.9236649268347765]
We develop a novel Deformable Polar Polygon Object Detection method (DPPD) to detect objects in polygon shapes.
DPPD has been demonstrated successfully in various object detection tasks for autonomous driving.
arXiv Detail & Related papers (2023-04-05T06:43:41Z) - PolyBuilding: Polygon Transformer for End-to-End Building Extraction [9.196604757138825]
PolyBuilding predicts vector representation of buildings from remote sensing images.
The model learns the relations among the predicted buildings and encodes context information from the image to predict the final set of building polygons.
It also achieves a new state-of-the-art in terms of pixel-level coverage, instance-level precision and recall, and geometry-level properties.
arXiv Detail & Related papers (2022-11-03T04:53:17Z) - Towards General-Purpose Representation Learning of Polygonal Geometries [62.34832826705641]
We develop a general-purpose polygon encoding model, which can encode a polygonal geometry into an embedding space.
We conduct experiments on two tasks: 1) shape classification based on MNIST; 2) spatial relation prediction based on two new datasets - DBSR-46K and DBSR-cplx46K.
Our results show that NUFTspec and ResNet1D outperform multiple existing baselines with significant margins.
arXiv Detail & Related papers (2022-09-29T15:59:23Z) - PolyWorld: Polygonal Building Extraction with Graph Neural Networks in Satellite Images [10.661430927191205]
This paper introduces PolyWorld, a neural network that directly extracts building vertices from an image and connects them correctly to create precise polygons.
PolyWorld significantly outperforms the state-of-the-art in building polygonization.
arXiv Detail & Related papers (2021-11-30T15:23:17Z) - PolyNet: Polynomial Neural Network for 3D Shape Recognition with PolyShape Representation [51.147664305955495]
3D shape representation and its processing have substantial effects on 3D shape recognition.
We propose a deep neural network-based method (PolyNet) and a specific polygon representation (PolyShape)
Our experiments demonstrate the strength and the advantages of PolyNet on both 3D shape classification and retrieval tasks.
arXiv Detail & Related papers (2021-10-15T06:45:59Z) - CenterPoly: real-time instance segmentation using bounding polygons [11.365829102707014]
We present a novel method, called CenterPoly, for real-time instance segmentation using bounding polygons.
We apply it to detect road users in dense urban environments, making it suitable for applications in intelligent transportation systems like automated vehicles.
Most of the network parameters are shared by the network heads, making it fast and lightweight enough to run at real-time speed.
arXiv Detail & Related papers (2021-08-19T21:31:30Z) - Polygonal Point Set Tracking [50.445151155209246]
We propose a novel learning-based polygonal point set tracking method.
Our goal is to track corresponding points on the target contour.
We present visual-effects applications of our method on part distortion and text mapping.
arXiv Detail & Related papers (2021-05-30T17:12:36Z) - SPU-Net: Self-Supervised Point Cloud Upsampling by Coarse-to-Fine Reconstruction with Self-Projection Optimization [52.20602782690776]
It is expensive and tedious to obtain large-scale paired sparse-dense point sets for training from real scanned sparse data.
We propose a self-supervised point cloud upsampling network, named SPU-Net, to capture the inherent upsampling patterns of points lying on the underlying object surface.
We conduct various experiments on both synthetic and real-scanned datasets, and the results demonstrate that we achieve comparable performance to the state-of-the-art supervised methods.
arXiv Detail & Related papers (2020-12-08T14:14:09Z) - Polygon-free: Unconstrained Scene Text Detection with Box Annotations [39.74109294551322]
This study proposes an unconstrained text detection system termed Polygon-free (PF).
PF is trained with only upright bounding box annotations.
Experiments demonstrate that PF can combine general detectors to yield surprisingly high-quality pixel-level results.
arXiv Detail & Related papers (2020-11-26T14:19:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.