Related papers: Fourier Document Restoration for Robust Document Dewarping and Recognition

Fourier Document Restoration for Robust Document Dewarping and Recognition

URL: http://arxiv.org/abs/2203.09910v1
Date: Fri, 18 Mar 2022 12:39:31 GMT
Title: Fourier Document Restoration for Robust Document Dewarping and Recognition
Authors: Chuhui Xue, Zichen Tian, Fangneng Zhan, Shijian Lu, Song Bai
Abstract summary: This paper presents FDRNet, a Fourier Document Restoration Network that can restore documents with different distortions. It dewarps documents by a flexible Thin-Plate Spline transformation which can handle various deformations effectively without requiring deformation annotations in training. It outperforms the state-of-the-art by large margins on both dewarping and text recognition tasks.
Score: 73.44057202891011
License: http://creativecommons.org/publicdomain/zero/1.0/
Abstract: State-of-the-art document dewarping techniques learn to predict 3-dimensional information of documents which are prone to errors while dealing with documents with irregular distortions or large variations in depth. This paper presents FDRNet, a Fourier Document Restoration Network that can restore documents with different distortions and improve document recognition in a reliable and simpler manner. FDRNet focuses on high-frequency components in the Fourier space that capture most structural information but are largely free of degradation in appearance. It dewarps documents by a flexible Thin-Plate Spline transformation which can handle various deformations effectively without requiring deformation annotations in training. These features allow FDRNet to learn from a small amount of simply labeled training images, and the learned model can dewarp documents with complex geometric distortion and recognize the restored texts accurately. To facilitate document restoration research, we create a benchmark dataset consisting of over one thousand camera documents with different types of geometric and photometric distortion. Extensive experiments show that FDRNet outperforms the state-of-the-art by large margins on both dewarping and text recognition tasks. In addition, FDRNet requires a small amount of simply labeled training data and is easy to deploy.

Related papers

ForCenNet: Foreground-Centric Network for Document Image Rectification [36.95028425490806]
We introduce Foreground-Centric Network (ForCenNet) to eliminate geometric distortions in document images.<n>Extensive experiments demonstrate that ForCenNet achieves new state-of-the-art on four real-world benchmarks.
arXiv Detail & Related papers (2025-07-26T05:36:48Z)
Geometry Restoration and Dewarping of Camera-Captured Document Images [0.0]
This research focuses on developing a method for restoring the topology of digital images of paper documents captured by a camera. Our methodology employs deep learning (DL) for document outline detection, followed by computer vision (CV) to create a topological 2D grid.
arXiv Detail & Related papers (2025-01-06T17:12:19Z)
DocDiff: Document Enhancement via Residual Diffusion Models [7.972081359533047]
We propose DocDiff, a diffusion-based framework specifically designed for document enhancement problems. DocDiff consists of two modules: the Coarse Predictor (CP) and the High-Frequency Residual Refinement (HRR) module. Our proposed HRR module in pre-trained DocDiff is plug-and-play and ready-to-use, with only 4.17M parameters.
arXiv Detail & Related papers (2023-05-06T01:41:10Z)
DocMAE: Document Image Rectification via Self-supervised Representation Learning [144.44748607192147]
We present DocMAE, a novel self-supervised framework for document image rectification. We first mask random patches of the background-excluded document images and then reconstruct the missing pixels. With such a self-supervised learning approach, the network is encouraged to learn the intrinsic structure of deformed documents.
arXiv Detail & Related papers (2023-04-20T14:27:15Z)
Deep Unrestricted Document Image Rectification [110.61517455253308]
We present DocTr++, a novel unified framework for document image rectification. We upgrade the original architecture by adopting a hierarchical encoder-decoder structure for multi-scale representation extraction and parsing. We contribute a real-world test set and metrics applicable for evaluating the rectification quality.
arXiv Detail & Related papers (2023-04-18T08:00:54Z)
Geometric Rectification of Creased Document Images based on Isometric Mapping [0.0]
Geometric rectification of images of distorted documents finds wide applications in document digitization and Optical Character Recognition (OCR) We propose a general framework of document image rectification in which a computational isometric mapping model is utilized for expressing a 3D document model and its flattening in the plane. Experiments and comparisons to the state-of-the-art approaches demonstrated the effectiveness and outstanding performance of the proposed method.
arXiv Detail & Related papers (2022-12-16T09:33:31Z)
Boosting Modern and Historical Handwritten Text Recognition with Deformable Convolutions [52.250269529057014]
Handwritten Text Recognition (HTR) in free-volution pages is a challenging image understanding task. We propose to adopt deformable convolutions, which can deform depending on the input at hand and better adapt to the geometric variations of the text.
arXiv Detail & Related papers (2022-08-17T06:55:54Z)
DocScanner: Robust Document Image Rectification with Progressive Learning [162.03694280524084]
This work presents DocScanner, a new deep network architecture for document image rectification. DocScanner maintains a single estimate of the rectified image, which is progressively corrected with a recurrent architecture. The iterative refinements make DocScanner converge to a robust and superior performance, and the lightweight recurrent architecture ensures the running efficiency.
arXiv Detail & Related papers (2021-10-28T09:15:02Z)
Dewarping Document Image By Displacement Flow Estimation with Fully Convolutional Network [30.18238229156996]
We propose a framework for both rectifying distorted document image and removing background finely, using a fully convolutional network (FCN) The FCN is trained by regressing displacements of synthesized distorted documents, and to control the smoothness of displacements, we propose a Local Smooth Constraint (LSC) in regularization. Experiments proved that our approach can dewarp document images effectively under various geometric distortions, and has achieved the state-of-the-art performance in terms of local details and overall effect.
arXiv Detail & Related papers (2021-04-14T12:32:36Z)
RectiNet-v2: A stacked network architecture for document image dewarping [16.249023269158734]
We propose an end-to-end CNN architecture that can produce distortion free document images from warped documents it takes as input. We train this model on warped document images simulated synthetically to compensate for lack of enough natural data. We evaluate our method on the DocUNet dataset, a benchmark in this domain, and obtain results comparable to state-of-the-art methods.
arXiv Detail & Related papers (2021-02-01T19:26:17Z)
Fast(er) Reconstruction of Shredded Text Documents via Self-Supervised Deep Asymmetric Metric Learning [62.34197797857823]
A central problem in automatic reconstruction of shredded documents is the pairwise compatibility evaluation of the shreds. This work proposes a scalable deep learning approach for measuring pairwise compatibility in which the number of inferences scales linearly. Our method has accuracy comparable to the state-of-the-art with a speed-up of about 22 times for a test instance with 505 shreds.
arXiv Detail & Related papers (2020-03-23T03:22:06Z)

This list is automatically generated from the titles and abstracts of the papers in this site.