Geometric Representation Learning for Document Image Rectification
- URL: http://arxiv.org/abs/2210.08161v1
- Date: Sat, 15 Oct 2022 01:57:40 GMT
- Title: Geometric Representation Learning for Document Image Rectification
- Authors: Hao Feng, Wengang Zhou, Jiajun Deng, Yuechen Wang and Houqiang Li
- Abstract summary: We present DocGeoNet for document image rectification by introducing explicit geometric representation.
Our motivation arises from the insight that 3D shape provides global unwarping cues for rectifying a distorted document image.
Experiments demonstrate the effectiveness of our framework and its superiority over state-of-the-art methods.
- Score: 137.75133384124976
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In document image rectification, there exist rich geometric constraints
between the distorted image and the ground truth one. However, such geometric
constraints are largely ignored in existing advanced solutions, which limits
the rectification performance. To this end, we present DocGeoNet for document
image rectification by introducing explicit geometric representation.
Technically, two typical attributes of the document image are involved in the
proposed geometric representation learning, i.e., 3D shape and textlines. Our
motivation arises from the insight that 3D shape provides global unwarping cues
for rectifying a distorted document image while overlooking the local
structure. On the other hand, textlines complementarily provide explicit
geometric constraints for local patterns. The learned geometric representation
effectively bridges the distorted image and the ground truth one. Extensive
experiments show the effectiveness of our framework and demonstrate the
superiority of our DocGeoNet over state-of-the-art methods on both the DocUNet
Benchmark dataset and our proposed DIR300 test set. The code is available at
https://github.com/fh2019ustc/DocGeoNet.
Related papers
- TPIE: Topology-Preserved Image Editing With Text Instructions [14.399084325078878]
Topology-Preserved Image Editing with text instructions (TPIE)
TPIE treats newly generated samples as deformable variations of a given input template, allowing for controllable and structure-preserving edits.
We validate TPIE on a diverse set of 2D and 3D images and compare them with state-of-the-art image editing approaches.
arXiv Detail & Related papers (2024-11-22T22:08:27Z)
- Adaptive Surface Normal Constraint for Geometric Estimation from Monocular Images [56.86175251327466]
We introduce a novel approach to learn geometries such as depth and surface normal from images while incorporating geometric context.
Our approach extracts geometric context that encodes the geometric variations present in the input image and correlates depth estimation with geometric constraints.
Our method unifies depth and surface normal estimations within a cohesive framework, which enables the generation of high-quality 3D geometry from images.
arXiv Detail & Related papers (2024-02-08T17:57:59Z)
- DocMAE: Document Image Rectification via Self-supervised Representation Learning [144.44748607192147]
We present DocMAE, a novel self-supervised framework for document image rectification.
We first mask random patches of the background-excluded document images and then reconstruct the missing pixels.
With such a self-supervised learning approach, the network is encouraged to learn the intrinsic structure of deformed documents.
arXiv Detail & Related papers (2023-04-20T14:27:15Z)
- Deep Unrestricted Document Image Rectification [110.61517455253308]
We present DocTr++, a novel unified framework for document image rectification.
We upgrade the original architecture by adopting a hierarchical encoder-decoder structure for multi-scale representation extraction and parsing.
We contribute a real-world test set and metrics applicable for evaluating the rectification quality.
arXiv Detail & Related papers (2023-04-18T08:00:54Z)
- UVDoc: Neural Grid-based Document Unwarping [20.51368640747448]
Restoring the original, flat appearance of a printed document from casual photographs is a common everyday problem.
We propose a novel method for grid-based single-image document unwarping.
Our method performs geometric distortion correction via a fully convolutional deep neural network.
arXiv Detail & Related papers (2023-02-06T15:53:34Z)
- Geometric Rectification of Creased Document Images based on Isometric Mapping [0.0]
Geometric rectification of images of distorted documents finds wide applications in document digitization and Optical Character Recognition (OCR)
We propose a general framework of document image rectification in which a computational isometric mapping model is utilized for expressing a 3D document model and its flattening in the plane.
Experiments and comparisons to the state-of-the-art approaches demonstrated the effectiveness and outstanding performance of the proposed method.
arXiv Detail & Related papers (2022-12-16T09:33:31Z)
- Self-Supervised Image Representation Learning with Geometric Set Consistency [50.12720780102395]
We propose a method for self-supervised image representation learning under the guidance of 3D geometric consistency.
Specifically, we introduce 3D geometric consistency into a contrastive learning framework to enforce the feature consistency within image views.
arXiv Detail & Related papers (2022-03-29T08:57:33Z)
- Joint Deep Multi-Graph Matching and 3D Geometry Learning from Inhomogeneous 2D Image Collections [57.60094385551773]
We propose a trainable framework for learning a deformable 3D geometry model from inhomogeneous image collections.
We in addition obtain the underlying 3D geometry of the objects depicted in the 2D images.
arXiv Detail & Related papers (2021-03-31T17:25:36Z)
- TextRay: Contour-based Geometric Modeling for Arbitrary-shaped Scene Text Detection [20.34326396800748]
We propose an arbitrary-shaped text detection method, namely TextRay, which conducts top-down contour-based geometric modeling and geometric parameter learning.
Experiments on several benchmark datasets demonstrate the effectiveness of the proposed approach.
arXiv Detail & Related papers (2020-08-11T16:52:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.