Related papers: Geometric Rectification of Creased Document Images based on Isometric Mapping

Geometric Rectification of Creased Document Images based on Isometric Mapping

URL: http://arxiv.org/abs/2212.08365v1
Date: Fri, 16 Dec 2022 09:33:31 GMT
Title: Geometric Rectification of Creased Document Images based on Isometric Mapping
Authors: Dong Luo and Pengbo Bo
Abstract summary: Geometric rectification of images of distorted documents finds wide applications in document digitization and Optical Character Recognition (OCR) We propose a general framework of document image rectification in which a computational isometric mapping model is utilized for expressing a 3D document model and its flattening in the plane. Experiments and comparisons to the state-of-the-art approaches demonstrated the effectiveness and outstanding performance of the proposed method.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Geometric rectification of images of distorted documents finds wide applications in document digitization and Optical Character Recognition (OCR). Although smoothly curved deformations have been widely investigated by many works, the most challenging distortions, e.g. complex creases and large foldings, have not been studied in particular. The performance of existing approaches, when applied to largely creased or folded documents, is far from satisfying, leaving substantial room for improvement. To tackle this task, knowledge about document rectification should be incorporated into the computation, among which the developability of 3D document models and particular textural features in the images, such as straight lines, are the most essential ones. For this purpose, we propose a general framework of document image rectification in which a computational isometric mapping model is utilized for expressing a 3D document model and its flattening in the plane. Based on this framework, both model developability and textural features are considered in the computation. The experiments and comparisons to the state-of-the-art approaches demonstrated the effectiveness and outstanding performance of the proposed method. Our method is also flexible in that the rectification results can be enhanced by any other methods that extract high-quality feature lines in the images.

Related papers

DocVCE: Diffusion-based Visual Counterfactual Explanations for Document Image Classification [5.247930659596986]
We introduce generative document counterfactuals that provide meaningful insights into the model's decision-making through actionable explanations.<n>To the best of the authors' knowledge, this is the first work to explore generative counterfactual explanations in document image analysis.
arXiv Detail & Related papers (2025-08-06T09:15:32Z)
DvD: Unleashing a Generative Paradigm for Document Dewarping via Coordinates-based Diffusion Model [25.504170988714783]
Document dewarping aims to rectify deformations in photographic document images, thus improving text readability.<n>We propose DvD, the first generative model to tackle document textbfDewarping textbfvia a textbfDiffusion framework.
arXiv Detail & Related papers (2025-05-28T05:05:51Z)
A Guide to Structureless Visual Localization [63.41481414949785]
Methods that estimate the camera pose of a query image in a known scene are core components of many applications, including self-driving cars and augmented / mixed reality systems. State-of-the-art visual localization algorithms are structure-based, i.e., they store a 3D model of the scene and use 2D-3D correspondences between the query image and 3D points in the model for camera pose estimation. This paper is dedicated to providing, to the best of our knowledge, first comprehensive discussion and comparison of structureless methods.
arXiv Detail & Related papers (2025-04-24T15:08:36Z)
Geometry Restoration and Dewarping of Camera-Captured Document Images [0.0]
This research focuses on developing a method for restoring the topology of digital images of paper documents captured by a camera. Our methodology employs deep learning (DL) for document outline detection, followed by computer vision (CV) to create a topological 2D grid.
arXiv Detail & Related papers (2025-01-06T17:12:19Z)
TPIE: Topology-Preserved Image Editing With Text Instructions [14.399084325078878]
Topology-Preserved Image Editing with text instructions (TPIE) TPIE treats newly generated samples as deformable variations of a given input template, allowing for controllable and structure-preserving edits. We validate TPIE on a diverse set of 2D and 3D images and compare them with state-of-the-art image editing approaches.
arXiv Detail & Related papers (2024-11-22T22:08:27Z)
Embedded Shape Matching in Photogrammetry Data for Modeling Making Knowledge [0.0]
We use two-dimensional samples obtained by projection to overcome the difficulties of pattern recognition in three-dimensional models. The application is based on photogrammetric capture of a few examples of Zeugma mosaics and three-dimensional digital modeling of a set of Seljuk era brick walls.
arXiv Detail & Related papers (2023-12-20T23:52:53Z)
Wonder3D: Single Image to 3D using Cross-Domain Diffusion [105.16622018766236]
Wonder3D is a novel method for efficiently generating high-fidelity textured meshes from single-view images. To holistically improve the quality, consistency, and efficiency of image-to-3D tasks, we propose a cross-domain diffusion model.
arXiv Detail & Related papers (2023-10-23T15:02:23Z)
IT3D: Improved Text-to-3D Generation with Explicit View Synthesis [71.68595192524843]
This study presents a novel strategy that leverages explicitly synthesized multi-view images to address these issues. Our approach involves the utilization of image-to-image pipelines, empowered by LDMs, to generate posed high-quality images. For the incorporated discriminator, the synthesized multi-view images are considered real data, while the renderings of the optimized 3D models function as fake data.
arXiv Detail & Related papers (2023-08-22T14:39:17Z)
Geometric Representation Learning for Document Image Rectification [137.75133384124976]
We present DocGeoNet for document image rectification by introducing explicit geometric representation. Our motivation arises from the insight that 3D shape provides global unwarping cues for rectifying a distorted document image. Experiments show the effectiveness of our framework and demonstrate the superiority of our framework over state-of-the-art methods.
arXiv Detail & Related papers (2022-10-15T01:57:40Z)
Fourier Document Restoration for Robust Document Dewarping and Recognition [73.44057202891011]
This paper presents FDRNet, a Fourier Document Restoration Network that can restore documents with different distortions. It dewarps documents by a flexible Thin-Plate Spline transformation which can handle various deformations effectively without requiring deformation annotations in training. It outperforms the state-of-the-art by large margins on both dewarping and text recognition tasks.
arXiv Detail & Related papers (2022-03-18T12:39:31Z)
Geometric Processing for Image-based 3D Object Modeling [2.6397379133308214]
This article focuses on introducing the state-of-the-art methods of three major components of geometric processing: 1) geo-referencing; 2) Image dense matching 3) texture mapping. The largely automated geometric processing of images in a 3D object reconstruction workflow, is becoming a critical part of the reality-based 3D modeling.
arXiv Detail & Related papers (2021-06-27T18:33:30Z)
Dewarping Document Image By Displacement Flow Estimation with Fully Convolutional Network [30.18238229156996]
We propose a framework for both rectifying distorted document image and removing background finely, using a fully convolutional network (FCN) The FCN is trained by regressing displacements of synthesized distorted documents, and to control the smoothness of displacements, we propose a Local Smooth Constraint (LSC) in regularization. Experiments proved that our approach can dewarp document images effectively under various geometric distortions, and has achieved the state-of-the-art performance in terms of local details and overall effect.
arXiv Detail & Related papers (2021-04-14T12:32:36Z)
Wide-angle Image Rectification: A Survey [86.36118799330802]
wide-angle images contain distortions that violate the assumptions underlying pinhole camera models. Image rectification, which aims to correct these distortions, can solve these problems. We present a detailed description and discussion of the camera models used in different approaches. Next, we review both traditional geometry-based image rectification methods and deep learning-based methods.
arXiv Detail & Related papers (2020-10-30T17:28:40Z)
Self-supervised Deep Reconstruction of Mixed Strip-shredded Text Documents [63.41717168981103]
This work extends our previous deep learning method for single-page reconstruction to a more realistic/complex scenario. In our approach, the compatibility evaluation is modeled as a two-class (valid or invalid) pattern recognition problem. The proposed method outperforms the competing ones on complex scenarios, achieving accuracy superior to 90%.
arXiv Detail & Related papers (2020-07-01T21:48:05Z)

This list is automatically generated from the titles and abstracts of the papers in this site.