TPSNet: Thin-Plate-Spline Representation for Arbitrary Shape Scene Text Detection
- URL: http://arxiv.org/abs/2110.12826v1
- Date: Mon, 25 Oct 2021 11:47:17 GMT
- Title: TPSNet: Thin-Plate-Spline Representation for Arbitrary Shape Scene Text Detection
- Authors: Wei Wang
- Abstract summary: Thin-Plate-Spline (TPS) transformation has achieved great success in scene text recognition.
TPS representation is compact, complete and integral.
Two novel losses, the boundary set loss and the shape alignment loss, are proposed.
- Score: 4.8345307057837354
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The research focus of scene text detection has shifted to arbitrary shape
text in recent years, for which text representation is a fundamental problem. In our
view, an ideal representation should be compact, complete, integral, and reusable for
subsequent recognition; previous representations, however, fall short in one or more
of these respects. The Thin-Plate-Spline (TPS) transformation has achieved great
success in scene text recognition. Inspired by this, we reverse its usual role and
adopt TPS as a representation for arbitrary shape text detection. The TPS
representation is compact, complete, and integral, and with the predicted TPS
parameters the detected text region can be rectified to a near-horizontal one, which
benefits subsequent recognition. To supervise TPS training without key point
annotations, two novel losses are proposed: the boundary set loss and the shape
alignment loss. Extensive evaluation and ablation on several public benchmarks
demonstrate the effectiveness and superiority of the proposed method.
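The rectification step mentioned above can be made concrete with a small thin-plate-spline example. The following is a minimal NumPy sketch of the general TPS machinery, not TPSNet's implementation: it assumes the detector predicts fiducial points on the curved text boundary that are matched to points on a horizontal rectangle, fits the standard TPS linear system, and maps the pixel grid of the rectified patch back to image coordinates for sampling. The function names (`fit_tps`, `warp_points`) and the toy fiducials are hypothetical.

```python
import numpy as np

def _tps_kernel(r2):
    """Radial basis U(r) = r^2 * log(r^2), with U(0) = 0."""
    out = np.zeros_like(r2)
    mask = r2 > 0
    out[mask] = r2[mask] * np.log(r2[mask])
    return out

def fit_tps(src_pts, dst_pts, reg=1e-6):
    """Fit a TPS mapping src_pts -> dst_pts; both are (N, 2) matched fiducials."""
    n = src_pts.shape[0]
    d2 = np.sum((src_pts[:, None, :] - src_pts[None, :, :]) ** 2, axis=-1)
    K = _tps_kernel(d2) + reg * np.eye(n)      # radial part, lightly regularized
    P = np.hstack([np.ones((n, 1)), src_pts])  # affine part [1, x, y]
    A = np.zeros((n + 3, n + 3))
    A[:n, :n], A[:n, n:], A[n:, :n] = K, P, P.T
    b = np.zeros((n + 3, 2))
    b[:n] = dst_pts
    return np.linalg.solve(A, b), src_pts      # (N+3, 2) coefficients + control points

def warp_points(tps, query_pts):
    """Apply a fitted TPS to an (M, 2) array of query points."""
    theta, ctrl = tps
    n = ctrl.shape[0]
    d2 = np.sum((query_pts[:, None, :] - ctrl[None, :, :]) ** 2, axis=-1)
    U = _tps_kernel(d2)                                          # (M, N)
    P = np.hstack([np.ones((query_pts.shape[0], 1)), query_pts])
    return U @ theta[:n] + P @ theta[n:]

# Toy example: fiducials on a horizontal 2 x 5 grid (rectified space) matched to
# points on a gently curved text boundary (image space).
rect_fids = np.array([[x, y] for y in (0.0, 32.0) for x in np.linspace(0.0, 128.0, 5)])
bend = 8.0 * np.sin(rect_fids[:, 0] / 128.0 * np.pi)
curved_fids = rect_fids + np.stack([np.zeros(len(bend)), bend], axis=1)

tps = fit_tps(rect_fids, curved_fids)
ys, xs = np.mgrid[0:32, 0:128]
grid = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)
src_coords = warp_points(tps, grid)  # image coordinates to sample for each rectified pixel
print(src_coords.shape)              # (4096, 2)
```

In a full pipeline the resulting coordinates would typically be fed to a bilinear sampler (for example `torch.nn.functional.grid_sample`) to produce the near-horizontal patch that the abstract says benefits subsequent recognition.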
Related papers
- TextFormer: A Query-based End-to-End Text Spotter with Mixed Supervision [61.186488081379]
We propose TextFormer, a query-based end-to-end text spotter with Transformer architecture.
TextFormer builds upon an image encoder and a text decoder to learn a joint semantic understanding for multi-task modeling.
It allows for mutual training and optimization of classification, segmentation, and recognition branches, resulting in deeper feature sharing.
arXiv Detail & Related papers (2023-06-06T03:37:41Z)
- TPS++: Attention-Enhanced Thin-Plate Spline for Scene Text Recognition [78.67283660198403]
Text irregularities pose significant challenges to scene text recognizers.
TPS++ is an attention-enhanced TPS transformation that incorporates an attention mechanism into text rectification.
It consistently improves recognition and achieves state-of-the-art accuracy.
arXiv Detail & Related papers (2023-05-09T10:16:43Z)
- Few Could Be Better Than All: Feature Sampling and Grouping for Scene Text Detection [47.820683360286786]
We present a transformer-based architecture for scene text detection.
We first select a few representative features at all scales that are highly relevant to foreground text.
As each feature group corresponds to a text instance, its bounding box can be easily obtained without any post-processing operation.
arXiv Detail & Related papers (2022-03-29T04:02:31Z)
- Towards Weakly-Supervised Text Spotting using a Multi-Task Transformer [21.479222207347238]
We introduce TextTranSpotter (TTS), a transformer-based approach for text spotting.
TTS is trained in both fully- and weakly-supervised settings.
When trained in a fully-supervised manner, TextTranSpotter shows state-of-the-art results on multiple benchmarks.
arXiv Detail & Related papers (2022-02-11T08:50:09Z)
- Which and Where to Focus: A Simple yet Accurate Framework for Arbitrary-Shaped Nearby Text Detection in Scene Images [8.180563824325086]
We propose a simple yet effective method for accurate arbitrary-shaped nearby scene text detection.
A One-to-Many Training Scheme (OMTS) is designed to eliminate confusion and enable the proposals to learn more appropriate ground truths.
We also propose a Proposal Feature Attention Module (PFAM) to exploit more effective features for each proposal.
arXiv Detail & Related papers (2021-09-08T06:25:37Z)
- PAN++: Towards Efficient and Accurate End-to-End Spotting of Arbitrarily-Shaped Text [85.7020597476857]
We propose an end-to-end text spotting framework, termed PAN++, which can efficiently detect and recognize text of arbitrary shapes in natural scenes.
PAN++ is based on the kernel representation that reformulates a text line as a text kernel (central region) surrounded by peripheral pixels.
As a pixel-based representation, the kernel representation can be predicted by a single fully convolutional network, which is well suited to real-time applications; a rough kernel-mask sketch appears after this list.
arXiv Detail & Related papers (2021-05-02T07:04:30Z)
- ContourNet: Taking a Further Step toward Accurate Arbitrary-shaped Scene Text Detection [147.10751375922035]
We propose the ContourNet, which effectively handles false positives and large scale variance of scene texts.
Our method effectively suppresses these false positives by only outputting predictions with high response values in both directions.
arXiv Detail & Related papers (2020-04-10T08:15:23Z)
- Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting [49.768327669098674]
We propose an end-to-end trainable text spotting approach named Text Perceptron.
It first employs an efficient segmentation-based text detector that learns the latent text reading order and boundary information.
Then a novel Shape Transform Module (abbr. STM) is designed to transform the detected feature regions into regular morphologies.
arXiv Detail & Related papers (2020-02-17T08:07:19Z)
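As referenced in the PAN++ entry above, the sketch below is a rough, hypothetical illustration of a kernel mask in the spirit of that pixel-based representation: an annotated text polygon is shrunk toward its center and rasterized, giving the "text kernel" target that a fully convolutional network could be trained to predict. The `kernel_mask` helper, the shrink rule, and the Shapely-based shrinking are illustrative assumptions, not the paper's code.

```python
import numpy as np
from PIL import Image, ImageDraw
from shapely.geometry import Polygon

def kernel_mask(poly_xy, hw, shrink_ratio=0.5):
    """Rasterize a text polygon and a shrunken 'kernel' version of it.

    poly_xy: (x, y) vertices of the annotated text region.
    hw: (height, width) of the output masks.
    shrink_ratio: assumed shrink factor; smaller values give a thinner kernel.
    """
    poly = Polygon(poly_xy)
    # Common shrinking rule: offset distance proportional to area / perimeter.
    dist = poly.area * (1.0 - shrink_ratio ** 2) / poly.length
    kernel = poly.buffer(-dist)  # may become empty for extreme ratios

    def rasterize(p):
        img = Image.new("L", (hw[1], hw[0]), 0)
        ImageDraw.Draw(img).polygon(list(p.exterior.coords), outline=1, fill=1)
        return np.array(img, dtype=np.uint8)

    return rasterize(poly), rasterize(kernel)

# Toy example: a quadrilateral text region inside a 64 x 128 image.
text_mask, text_kernel = kernel_mask(
    [(10, 10), (118, 14), (116, 50), (12, 46)], hw=(64, 128))
print(int(text_mask.sum()), int(text_kernel.sum()))  # the kernel covers fewer pixels
```

Shrinking by a distance proportional to area over perimeter keeps the kernel roughly centered and scale-aware, which is why clipping-based shrinking is a common way to build such central-region targets.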