Can You Read Me Now? Content Aware Rectification using Angle Supervision
- URL: http://arxiv.org/abs/2008.02231v1
- Date: Wed, 5 Aug 2020 16:58:13 GMT
- Title: Can You Read Me Now? Content Aware Rectification using Angle Supervision
- Authors: Amir Markovitz, Inbal Lavi, Or Perel, Shai Mazor and Roee Litman
- Abstract summary: We present CREASE: Content Aware Rectification using Angle Supervision, the first learned method for document rectification that relies on the document's content.
Our method surpasses previous approaches in terms of OCR accuracy, geometric error and visual similarity.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The ubiquity of smartphone cameras has led to more and more documents being
captured by cameras rather than scanned. Unlike flatbed scanners, photographed
documents are often folded and crumpled, resulting in large local variance in
text structure. The problem of document rectification is fundamental to the
Optical Character Recognition (OCR) process on documents, and its ability to
overcome geometric distortions significantly affects recognition accuracy.
Despite the great progress in recent OCR systems, most still rely on a
pre-process that ensures the text lines are straight and axis aligned. Recent
works have tackled the problem of rectifying document images taken in-the-wild
using various supervision signals and alignment means. However, they focused on
global features that can be extracted from the document's boundaries, ignoring
various signals that could be obtained from the document's content.
We present CREASE: Content Aware Rectification using Angle Supervision, the
first learned method for document rectification that relies on the document's
content, the location of the words and specifically their orientation, as hints
to assist in the rectification process. We utilize a novel pixel-wise angle
regression approach and a curvature estimation side-task for optimizing our
rectification model. Our method surpasses previous approaches in terms of OCR
accuracy, geometric error and visual similarity.
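The abstract names pixel-wise angle regression but does not state the loss that is used. As a minimal sketch, assuming a periodic loss of the form 1 - cos(predicted - target) averaged over labelled pixels, the idea can be illustrated as follows. The function name `angular_loss`, the mask convention, and the loss form are assumptions made for illustration, not the authors' implementation. Plain Python lists are used only to keep the example self-contained.

```python
import math


def angular_loss(pred, target, mask):
    """Mean periodic angle error over masked pixels (hypothetical helper).

    pred, target: 2D lists of angles in radians with the same shape.
    mask: 2D list of 0/1 flags marking pixels that carry an angle label,
          for example pixels that lie on words (an assumption).
    """
    total, count = 0.0, 0
    for pred_row, target_row, mask_row in zip(pred, target, mask):
        for p, t, m in zip(pred_row, target_row, mask_row):
            if m:
                # 1 - cos(d) is 0 when the angles agree and is unaffected
                # by 2*pi wrap-around, unlike a plain squared difference.
                total += 1.0 - math.cos(p - t)
                count += 1
    # With no labelled pixels there is nothing to penalise.
    return total / count if count else 0.0


# Usage: the first pixel differs by a full turn (no error),
# the second by a quarter turn.
pred = [[0.0, math.pi / 2]]
target = [[2 * math.pi, 0.0]]
mask = [[1, 1]]
print(angular_loss(pred, target, mask))
```

A periodic loss is chosen here because angles that differ by a full turn describe the same orientation; a plain L1 or L2 difference would penalise them heavily.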
Related papers
- DocMAE: Document Image Rectification via Self-supervised Representation Learning [144.44748607192147]
We present DocMAE, a novel self-supervised framework for document image rectification.
We first mask random patches of the background-excluded document images and then reconstruct the missing pixels.
With such a self-supervised learning approach, the network is encouraged to learn the intrinsic structure of deformed documents.
arXiv Detail & Related papers (2023-04-20T14:27:15Z)
- Deep Unrestricted Document Image Rectification [110.61517455253308]
We present DocTr++, a novel unified framework for document image rectification.
We upgrade the original architecture by adopting a hierarchical encoder-decoder structure for multi-scale representation extraction and parsing.
We contribute a real-world test set and metrics applicable for evaluating the rectification quality.
arXiv Detail & Related papers (2023-04-18T08:00:54Z)
- UDoc-GAN: Unpaired Document Illumination Correction with Background Light Prior [128.19212716007794]
UDoc-GAN is the first framework to address the problem of document illumination correction under the unpaired setting.
We first predict the ambient light features of the document.
Then, according to the characteristics of different levels of ambient light, we re-formulate the cycle consistency constraint.
Compared with the state-of-the-art approaches, our method demonstrates promising performance in terms of character error rate (CER) and edit distance (ED).
arXiv Detail & Related papers (2022-10-15T07:19:23Z)
- Document Dewarping with Control Points [36.32190493389662]
We propose a simple yet effective approach to rectify distorted document images by estimating control points and reference points.
Control points are controllable to facilitate interaction or subsequent adjustment.
Experiments show that our approach can rectify document images with various distortion types and yields state-of-the-art performance on a real-world dataset.
arXiv Detail & Related papers (2022-03-20T12:51:14Z)
- DocScanner: Robust Document Image Rectification with Progressive Learning [162.03694280524084]
This work presents DocScanner, a new deep network architecture for document image rectification.
DocScanner maintains a single estimate of the rectified image, which is progressively corrected with a recurrent architecture.
The iterative refinements make DocScanner converge to a robust and superior performance, and the lightweight recurrent architecture ensures the running efficiency.
arXiv Detail & Related papers (2021-10-28T09:15:02Z)
- DocTr: Document Image Transformer for Geometric Unwarping and Illumination Correction [99.09177377916369]
We propose Document Image Transformer (DocTr) to address the issue of geometry and illumination distortion of the document images.
Our DocTr achieves 20.02% Character Error Rate (CER), a 15% absolute improvement over the state-of-the-art methods.
arXiv Detail & Related papers (2021-10-25T13:27:10Z)
- Dewarping Document Image By Displacement Flow Estimation with Fully Convolutional Network [30.18238229156996]
We propose a framework for both rectifying distorted document images and finely removing the background, using a fully convolutional network (FCN).
The FCN is trained by regressing displacements of synthesized distorted documents, and to control the smoothness of displacements, we propose a Local Smooth Constraint (LSC) in regularization.
Experiments show that our approach dewarps document images effectively under various geometric distortions and achieves state-of-the-art performance in terms of local details and overall effect.
arXiv Detail & Related papers (2021-04-14T12:32:36Z)
- Fast(er) Reconstruction of Shredded Text Documents via Self-Supervised Deep Asymmetric Metric Learning [62.34197797857823]
A central problem in automatic reconstruction of shredded documents is the pairwise compatibility evaluation of the shreds.
This work proposes a scalable deep learning approach for measuring pairwise compatibility in which the number of inferences scales linearly.
Our method has accuracy comparable to the state-of-the-art with a speed-up of about 22 times for a test instance with 505 shreds.
arXiv Detail & Related papers (2020-03-23T03:22:06Z)
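Several of the abstracts listed above report character error rate (CER) and edit distance (ED) on OCR output. As a minimal sketch, assuming the common definition in which ED is the Levenshtein distance between the recognised text and the reference text, and CER is that distance divided by the reference length, the two metrics can be computed as follows. The function names are hypothetical, and the individual papers may normalise or aggregate differently.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings (unit-cost edits)."""
    # prev[j] holds the distance between the processed prefix of ref
    # and the first j characters of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r from ref
                            curr[j - 1] + 1,      # insert h into ref
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def character_error_rate(ref, hyp):
    """Edit distance normalised by the reference length (assumed definition)."""
    if not ref:
        raise ValueError("reference text must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


# Usage: compare OCR output on a rectified image against the ground truth.
print(character_error_rate("document", "docunent"))
```

Because the distance is normalised by the reference length only, this CER can exceed 1.0 when the recognised text is much longer than the reference.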