Multistage Curvilinear Coordinate Transform Based Document Image
Dewarping using a Novel Quality Estimator
- URL: http://arxiv.org/abs/2003.06872v1
- Date: Sun, 15 Mar 2020 17:17:53 GMT
- Title: Multistage Curvilinear Coordinate Transform Based Document Image
Dewarping using a Novel Quality Estimator
- Authors: Tanmoy Dasgupta and Nibaran Das and Mita Nasipuri
- Abstract summary: The present work demonstrates a fast and improved technique for dewarping nonlinearly warped document images.
The images are first dewarped at the page-level by estimating optimum inverse projections using curvilinear homography.
The quality of the process is then estimated by evaluating a set of metrics related to the characteristics of the text lines and rectilinear objects.
If the quality is estimated to be unsatisfactory, the page-level dewarping process is repeated with finer approximations.
This is followed by a line-level dewarping process that makes granular corrections to the warps in individual text-lines.
- Score: 11.342730352935913
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The present work demonstrates a fast and improved technique for dewarping
nonlinearly warped document images. The images are first dewarped at the
page-level by estimating optimum inverse projections using curvilinear
homography. The quality of the process is then estimated by evaluating a set of
metrics related to the characteristics of the text lines and rectilinear
objects for measuring parallelism, orthogonality, etc. These are designed
specifically to estimate the quality of the dewarping process without the need
for any ground truth. If the quality is estimated to be unsatisfactory, the
page-level dewarping process is repeated with finer approximations. This is
followed by a line-level dewarping process that makes granular corrections to
the warps in individual text-lines. The methodology has been tested on the
CBDAR 2007 / IUPR 2011 document image dewarping dataset and is seen to yield
the best OCR accuracy in the shortest amount of time, to date. The usefulness
of the methodology has also been evaluated on the DocUNet 2018 dataset with
some minor tweaks, and is seen to produce comparable results.
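The control flow described in the abstract (dewarp at the page level, score the result without ground truth, and repeat with a finer approximation if the score is too low) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: `parallelism_score`, `dewarp_with_refinement`, `toy_dewarp`, the threshold of 0.9, and the representation of a page as lists of baseline points are all assumptions made for this example. The curvilinear homography and the paper's actual metrics are not reproduced here.

```python
import math
from statistics import pstdev


def fitted_angle_deg(points):
    """Angle (degrees) of the least-squares line through one text line's points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return math.degrees(math.atan2(sxy, sxx))


def parallelism_score(text_lines):
    """Hypothetical ground-truth-free score in (0, 1].

    Returns 1.0 when every text line has the same direction and decreases
    as the spread of line angles grows.
    """
    angles = [fitted_angle_deg(line) for line in text_lines]
    return 1.0 / (1.0 + pstdev(angles))


def dewarp_with_refinement(page, dewarp_page, estimate_quality,
                           threshold=0.9, max_levels=3):
    """Repeat page-level dewarping at finer levels until quality is acceptable.

    Returns (dewarped_page, quality, level) for the best attempt seen.
    """
    best = None
    for level in range(1, max_levels + 1):
        candidate = dewarp_page(page, level)
        quality = estimate_quality(candidate)
        if best is None or quality > best[1]:
            best = (candidate, quality, level)
        if quality >= threshold:
            break
    return best


def toy_dewarp(page, level):
    """Stand-in for a real dewarper: shrinks each line's slope by 10**-level."""
    factor = 0.1 ** level
    return [[(x, line[0][1] + (y - line[0][1]) * factor) for x, y in line]
            for line in page]


# Three synthetic text lines with slopes 0.0, 0.2 and -0.2.
warped_page = [
    [(x, 10.0 * k + slope * x) for x in range(0, 50, 10)]
    for k, slope in enumerate([0.0, 0.2, -0.2])
]
```

Calling `dewarp_with_refinement(warped_page, toy_dewarp, parallelism_score)` returns the first attempt whose score reaches the threshold, or the best attempt if none does. A real quality estimator would combine several such metrics (the abstract mentions parallelism and orthogonality), and a line-level correction pass would follow the loop.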
Related papers
- C-TPT: Calibrated Test-Time Prompt Tuning for Vision-Language Models via Text Feature Dispersion [54.81141583427542]
In deep learning, test-time adaptation has gained attention as a method for model fine-tuning without the need for labeled data.
This paper explores calibration during test-time prompt tuning by leveraging the inherent properties of CLIP.
We present a novel method, Calibrated Test-time Prompt Tuning (C-TPT), for optimizing prompts during test-time with enhanced calibration.
arXiv Detail & Related papers (2024-03-21T04:08:29Z)
- Corner-to-Center Long-range Context Model for Efficient Learned Image Compression [70.0411436929495]
In the framework of learned image compression, the context model plays a pivotal role in capturing the dependencies among latent representations.
We propose the Corner-to-Center transformer-based Context Model (C$^3$M) designed to enhance context and latent predictions.
In addition, to enlarge the receptive field in the analysis and synthesis transformation, we use the Long-range Crossing Attention Module (LCAM) in the encoder/decoder.
arXiv Detail & Related papers (2023-11-29T21:40:28Z)
- End-to-End Page-Level Assessment of Handwritten Text Recognition [69.55992406968495]
HTR systems increasingly face the end-to-end page-level transcription of a document.
Standard metrics do not take into account the inconsistencies that might appear.
We propose a two-fold evaluation, where the transcription accuracy and the reading order (RO) goodness are considered separately.
arXiv Detail & Related papers (2023-01-14T15:43:07Z)
- Revisiting Document Image Dewarping by Grid Regularization [41.87305384805975]
This paper addresses the problem of document image dewarping.
We take the text lines and the document boundaries into account from a constrained optimization perspective.
Our proposed method first learns the boundary points and the pixels in the text lines.
arXiv Detail & Related papers (2022-03-31T07:18:30Z)
- Fast Hybrid Image Retargeting [0.0]
We propose a method that quantifies and limits warping distortions with the use of content-aware cropping.
Our method outperforms recent approaches, while running in a fraction of their execution time.
arXiv Detail & Related papers (2022-03-25T11:46:06Z)
- DocScanner: Robust Document Image Rectification with Progressive Learning [162.03694280524084]
This work presents DocScanner, a new deep network architecture for document image rectification.
DocScanner maintains a single estimate of the rectified image, which is progressively corrected with a recurrent architecture.
The iterative refinements make DocScanner converge to a robust and superior performance, and the lightweight recurrent architecture ensures the running efficiency.
arXiv Detail & Related papers (2021-10-28T09:15:02Z)
- DocTr: Document Image Transformer for Geometric Unwarping and Illumination Correction [99.09177377916369]
We propose Document Image Transformer (DocTr) to address the issue of geometry and illumination distortion of the document images.
Our DocTr achieves 20.02% Character Error Rate (CER), a 15% absolute improvement over the state-of-the-art methods.
arXiv Detail & Related papers (2021-10-25T13:27:10Z)
- Automatic Extrinsic Calibration Method for LiDAR and Camera Sensor Setups [68.8204255655161]
We present a method to calibrate the parameters of any pair of sensors involving LiDARs, monocular or stereo cameras.
The proposed approach can handle devices with very different resolutions and poses, as usually found in vehicle setups.
arXiv Detail & Related papers (2021-01-12T12:02:26Z)
- Multiple Document Datasets Pre-training Improves Text Line Detection With Deep Neural Networks [2.5352713493505785]
We introduce a fully convolutional network for the document layout analysis task.
Our method Doc-UFCN relies on a U-shaped model trained from scratch for detecting objects from historical documents.
We show that Doc-UFCN outperforms state-of-the-art methods on various datasets.
arXiv Detail & Related papers (2020-12-28T09:48:33Z)
- Subjective Annotation for a Frame Interpolation Benchmark using Artefact Amplification [6.544757635738911]
For image quality assessment, the actual quality experienced by the user cannot be fully deduced from simple measures.
We conducted a subjective quality assessment crowdsourcing study for the interpolated frames provided by one of the optical flow benchmarks.
As a first step, we proposed such a new full-reference method, called WAE-IQA.
arXiv Detail & Related papers (2020-01-10T18:20:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.