TextRay: Contour-based Geometric Modeling for Arbitrary-shaped Scene Text Detection
- URL: http://arxiv.org/abs/2008.04851v2
- Date: Wed, 12 Aug 2020 07:29:25 GMT
- Title: TextRay: Contour-based Geometric Modeling for Arbitrary-shaped Scene Text Detection
- Authors: Fangfang Wang, Yifeng Chen, Fei Wu, and Xi Li
- Abstract summary: We propose an arbitrary-shaped text detection method, namely TextRay, which conducts top-down contour-based geometric modeling and geometric parameter learning.
Experiments on several benchmark datasets demonstrate the effectiveness of the proposed approach.
- Score: 20.34326396800748
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Arbitrary-shaped text detection is a challenging task due to the complex
geometric layouts of texts such as large aspect ratios, various scales, random
rotations and curve shapes. Most state-of-the-art methods solve this problem
from bottom-up perspectives, seeking to model a text instance of complex
geometric layouts with simple local units (e.g., local boxes or pixels) and
generate detections with heuristic post-processings. In this work, we propose
an arbitrary-shaped text detection method, namely TextRay, which conducts
top-down contour-based geometric modeling and geometric parameter learning
within a single-shot anchor-free framework. The geometric modeling is carried
out in a polar coordinate system with a bidirectional mapping scheme between shape space
and parameter space, encoding complex geometric layouts into unified
representations. For effective learning of the representations, we design a
central-weighted training strategy and a content loss which builds propagation
paths between geometric encodings and visual content. TextRay outputs simple
polygon detections at one pass with only one NMS post-processing. Experiments
on several benchmark datasets demonstrate the effectiveness of the proposed
approach. The code is available at https://github.com/LianaWang/TextRay.
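The bidirectional mapping between shape space (contour vertices) and parameter space (per-angle ray lengths) can be sketched in a few lines of NumPy. This is an illustrative simplification of polar contour encoding, not TextRay's exact learned parameterization; the function names and the nearest-vertex sampling strategy are assumptions made for the sketch:

```python
import numpy as np

def encode_polar(contour, center, n_rays=36):
    """Encode a closed contour as ray lengths at fixed polar angles.

    contour: (N, 2) array of (x, y) vertices; center: (2,) point inside it.
    Returns a fixed-length (n_rays,) parameter vector.
    """
    rel = contour - center
    angles = np.arctan2(rel[:, 1], rel[:, 0])        # angle of each vertex
    radii = np.linalg.norm(rel, axis=1)              # distance of each vertex
    bins = np.linspace(-np.pi, np.pi, n_rays, endpoint=False)
    # For each sampling angle, take the radius of the nearest contour vertex.
    idx = np.abs(angles[None, :] - bins[:, None]).argmin(axis=1)
    return radii[idx]

def decode_polar(rays, center):
    """Map a ray-length parameter vector back to polygon vertices."""
    bins = np.linspace(-np.pi, np.pi, len(rays), endpoint=False)
    xs = center[0] + rays * np.cos(bins)
    ys = center[1] + rays * np.sin(bins)
    return np.stack([xs, ys], axis=1)
```

Because every contour, however curved, maps to a fixed-length vector of ray lengths, the detector can regress one unified representation per instance and recover a simple polygon by decoding, which matches the abstract's claim of one-pass polygon output.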
Related papers
- G-NeRF: Geometry-enhanced Novel View Synthesis from Single-View Images [45.66479596827045]
We propose a Geometry-enhanced NeRF (G-NeRF), which seeks to enhance the geometry priors by a geometry-guided multi-view synthesis approach.
To tackle the absence of multi-view supervision for single-view images, we design the depth-aware training approach.
arXiv Detail & Related papers (2024-04-11T04:58:18Z)
- Adaptive Surface Normal Constraint for Geometric Estimation from Monocular Images [56.86175251327466]
We introduce a novel approach to learn geometries such as depth and surface normal from images while incorporating geometric context.
Our approach extracts geometric context that encodes the geometric variations present in the input image and correlates depth estimation with geometric constraints.
Our method unifies depth and surface normal estimations within a cohesive framework, which enables the generation of high-quality 3D geometry from images.
arXiv Detail & Related papers (2024-02-08T17:57:59Z)
- Geometrically Consistent Partial Shape Matching [50.29468769172704]
Finding correspondences between 3D shapes is a crucial problem in computer vision and graphics.
An often neglected but essential property of such matchings is geometric consistency.
We propose a novel integer linear programming partial shape matching formulation.
arXiv Detail & Related papers (2023-09-10T12:21:42Z)
- LRANet: Towards Accurate and Efficient Scene Text Detection with Low-Rank Approximation Network [63.554061288184165]
We propose a novel parameterized text shape method based on low-rank approximation.
By exploring the shape correlation among different text contours, our method achieves consistency, compactness, simplicity, and robustness in shape representation.
We implement an accurate and efficient arbitrary-shaped text detector named LRANet.
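The low-rank idea behind this kind of shape representation can be illustrated with a small sketch: stack many training contours as vectors, extract a few principal directions via SVD, and represent any new contour by its coefficients in that basis. This is a generic PCA-style sketch under assumed helper names, not LRANet's actual formulation:

```python
import numpy as np

def fit_shape_basis(contours, rank=4):
    """Learn a low-rank basis from training contours.

    contours: (M, 2K) matrix, each row a flattened K-point contour.
    Returns the mean shape and a (rank, 2K) orthonormal basis.
    """
    mean = contours.mean(axis=0)
    # SVD of the centered data yields the principal shape directions.
    _, _, vt = np.linalg.svd(contours - mean, full_matrices=False)
    return mean, vt[:rank]

def project(contour, mean, basis):
    """Compress a contour to `rank` coefficients and reconstruct it."""
    coeffs = basis @ (contour - mean)   # compact shape code
    recon = mean + basis.T @ coeffs     # low-rank reconstruction
    return coeffs, recon
```

Exploiting the correlation among text contours in this way means each instance is described by only a handful of coefficients, which is where the compactness and simplicity claimed in the summary come from.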
arXiv Detail & Related papers (2023-06-27T02:03:46Z)
- Geometric Representation Learning for Document Image Rectification [137.75133384124976]
We present DocGeoNet for document image rectification by introducing explicit geometric representation.
Our motivation arises from the insight that 3D shape provides global unwarping cues for rectifying a distorted document image.
Experiments show the effectiveness of our framework and demonstrate its superiority over state-of-the-art methods.
arXiv Detail & Related papers (2022-10-15T01:57:40Z)
- Fitting and recognition of geometric primitives in segmented 3D point clouds using a localized voting procedure [1.8352113484137629]
We introduce a novel technique for processing point clouds that, through a voting procedure, provides an initial estimate of the parameters of each primitive type.
Using these estimates, we localize the search for the optimal solution in a dimensionally-reduced space, making it efficient to extend the Hough transform (HT) to more primitives than those generally found in the literature.
arXiv Detail & Related papers (2022-05-30T20:47:43Z)
- PVSeRF: Joint Pixel-, Voxel- and Surface-Aligned Radiance Field for Single-Image Novel View Synthesis [52.546998369121354]
We present PVSeRF, a learning framework that reconstructs neural radiance fields from single-view RGB images.
We propose to incorporate explicit geometry reasoning and combine it with pixel-aligned features for radiance field prediction.
We show that the introduction of such geometry-aware features helps to achieve a better disentanglement between appearance and geometry.
arXiv Detail & Related papers (2022-02-10T07:39:47Z)
- Hybrid Approach for 3D Head Reconstruction: Using Neural Networks and Visual Geometry [3.970492757288025]
We present a novel method for reconstructing 3D heads from a single or multiple image(s) using a hybrid approach based on deep learning and geometric techniques.
We propose an encoder-decoder network based on the U-net architecture and trained on synthetic data only.
arXiv Detail & Related papers (2021-04-28T11:31:35Z)
- Deep Geometric Texture Synthesis [83.9404865744028]
We propose a novel framework for synthesizing geometric textures.
It learns texture statistics from local neighborhoods of a single reference 3D model.
Our network displaces mesh vertices in any direction, enabling synthesis of geometric textures.
arXiv Detail & Related papers (2020-06-30T19:36:38Z)
- Deep Relational Reasoning Graph Network for Arbitrary Shape Text Detection [20.244378408779554]
We propose a novel unified relational reasoning graph network for arbitrary shape text detection.
An innovative local graph bridges a text proposal model based on a CNN and a deep relational reasoning network based on a Graph Convolutional Network (GCN).
Experiments on publicly available datasets demonstrate the state-of-the-art performance of our method.
arXiv Detail & Related papers (2020-03-17T01:50:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.