A Gated and Bifurcated Stacked U-Net Module for Document Image Dewarping
- URL: http://arxiv.org/abs/2007.09824v1
- Date: Mon, 20 Jul 2020 01:22:05 GMT
- Title: A Gated and Bifurcated Stacked U-Net Module for Document Image Dewarping
- Authors: Hmrishav Bandyopadhyay, Tanmoy Dasgupta, Nibaran Das, Mita Nasipuri
- Abstract summary: We propose a supervised Gated and Bifurcated Stacked U-Net module to predict a dewarping grid and create a distortion-free image from the input.
The novelty of our method lies not only in the bifurcation of the U-Net, which helps eliminate intermingling of the grid coordinates, but also in the use of a gated network that adds boundary and other minute line-level details to the model.
- Score: 20.591737450565855
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Capturing images of documents is one of the easiest and most widely used ways of recording them. These images, however, being captured with handheld devices, often contain undesirable distortions that are hard to remove. We propose a supervised Gated and Bifurcated Stacked U-Net module to predict a dewarping grid and create a distortion-free image from the input. While the network is trained on synthetically warped document images, results are evaluated on real-world images. The novelty of our method lies not only in the bifurcation of the U-Net, which helps eliminate intermingling of the grid coordinates, but also in the use of a gated network that adds boundary and other minute line-level details to the model. The end-to-end pipeline we propose achieves state-of-the-art performance on the DocUNet dataset after being trained on just 8 percent of the data used in previous methods.
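The digest contains no code, but the idea described in the abstract, a stacked U-Net whose output head is bifurcated into separate x and y grid branches, with a gate that re-injects low-level boundary and line features, can be sketched roughly as follows. This is a minimal, hypothetical PyTorch illustration: the module names, depths, and feature sizes are placeholders, not the authors' architecture.

```python
# Hypothetical sketch of a bifurcated, gated grid-prediction network
# (not the authors' implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.BatchNorm2d(c_out), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.BatchNorm2d(c_out), nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    """A minimal U-Net stage: one downsampling and one upsampling step."""
    def __init__(self, c_in, c_feat):
        super().__init__()
        self.enc = conv_block(c_in, c_feat)
        self.down = conv_block(c_feat, 2 * c_feat)
        self.up = nn.ConvTranspose2d(2 * c_feat, c_feat, 2, stride=2)
        self.dec = conv_block(2 * c_feat, c_feat)

    def forward(self, x):
        e = self.enc(x)                        # high-resolution features
        d = self.down(F.max_pool2d(e, 2))      # low-resolution features
        u = self.up(d)
        return self.dec(torch.cat([u, e], dim=1))

class GatedBifurcatedDewarpNet(nn.Module):
    """Two stacked U-Net stages; the second is bifurcated into x/y grid heads.
    A sigmoid gate re-injects early (boundary / line-level) features."""
    def __init__(self, c_feat=32):
        super().__init__()
        self.stage1 = TinyUNet(3, c_feat)
        self.gate = nn.Sequential(nn.Conv2d(c_feat, c_feat, 1), nn.Sigmoid())
        self.stage2 = TinyUNet(c_feat, c_feat)
        self.head_x = nn.Conv2d(c_feat, 1, 1)  # x-coordinates of the grid
        self.head_y = nn.Conv2d(c_feat, 1, 1)  # y-coordinates of the grid

    def forward(self, img):
        f1 = self.stage1(img)
        f2 = self.stage2(f1)
        f2 = f2 * self.gate(f1) + f2           # gated fusion of early details
        grid_x = torch.tanh(self.head_x(f2))   # normalised to [-1, 1]
        grid_y = torch.tanh(self.head_y(f2))
        # (B, H, W, 2) sampling grid
        return torch.stack([grid_x.squeeze(1), grid_y.squeeze(1)], dim=-1)

warped = torch.rand(1, 3, 128, 128)            # dummy warped document image
grid = GatedBifurcatedDewarpNet()(warped)      # (1, 128, 128, 2) dewarping grid
```

A grid predicted in this normalised form could then be applied to the warped photo with torch.nn.functional.grid_sample to obtain the dewarped output.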
Related papers
- Block and Detail: Scaffolding Sketch-to-Image Generation [65.56590359051634]
We introduce a novel sketch-to-image tool that aligns with the iterative refinement process of artists.
Our tool lets users sketch blocking strokes to coarsely represent the placement and form of objects and detail strokes to refine their shape and silhouettes.
We develop a two-pass algorithm for generating high-fidelity images from such sketches at any point in the iterative process.
arXiv Detail & Related papers (2024-02-28T07:09:31Z)
- iEdit: Localised Text-guided Image Editing with Weak Supervision [53.082196061014734]
We propose a novel learning method for text-guided image editing.
It generates images conditioned on a source image and a textual edit prompt.
It shows favourable results against its counterparts in terms of image fidelity and CLIP alignment score, and qualitatively for editing both generated and real images.
arXiv Detail & Related papers (2023-05-10T07:39:14Z)
- DocMAE: Document Image Rectification via Self-supervised Representation Learning [144.44748607192147]
We present DocMAE, a novel self-supervised framework for document image rectification.
We first mask random patches of the background-excluded document images and then reconstruct the missing pixels.
With such a self-supervised learning approach, the network is encouraged to learn the intrinsic structure of deformed documents.
arXiv Detail & Related papers (2023-04-20T14:27:15Z)
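As a rough illustration of the masked-reconstruction pretraining summarised above, the following hypothetical PyTorch sketch (not the DocMAE code) hides random patches of a document image and penalises reconstruction error only on the hidden pixels; the tiny convolutional autoencoder is a stand-in for whatever backbone is actually used.

```python
# Hypothetical sketch of masked-patch reconstruction pretraining (not DocMAE's code).
import torch
import torch.nn as nn

patch = 16                                     # patch size in pixels (assumed)
mask_ratio = 0.75                              # fraction of patches to hide (assumed)

def random_patch_mask(b, h, w, device):
    """Boolean mask, True where a patch is hidden."""
    ph, pw = h // patch, w // patch
    scores = torch.rand(b, ph, pw, device=device)
    thresh = torch.quantile(scores.flatten(1), 1 - mask_ratio, dim=1)
    mask = scores >= thresh.view(b, 1, 1)      # hide the highest-scoring patches
    return mask.repeat_interleave(patch, 1).repeat_interleave(patch, 2)

# Stand-in autoencoder; a real setup would typically use a much larger
# encoder-decoder operating on patch tokens.
autoencoder = nn.Sequential(
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 3, 3, padding=1),
)

imgs = torch.rand(4, 3, 256, 256)              # background-excluded document crops
mask = random_patch_mask(4, 256, 256, imgs.device)        # (B, H, W) boolean
masked = imgs * (~mask).unsqueeze(1)           # zero out the hidden patches
recon = autoencoder(masked)
# Reconstruction loss computed only on the masked pixels.
loss = ((recon - imgs) ** 2)[mask.unsqueeze(1).expand_as(imgs)].mean()
loss.backward()
```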
- Deep Unrestricted Document Image Rectification [110.61517455253308]
We present DocTr++, a novel unified framework for document image rectification.
We upgrade the original architecture by adopting a hierarchical encoder-decoder structure for multi-scale representation extraction and parsing.
We contribute a real-world test set and metrics applicable for evaluating the rectification quality.
arXiv Detail & Related papers (2023-04-18T08:00:54Z)
- UVDoc: Neural Grid-based Document Unwarping [20.51368640747448]
Restoring the original, flat appearance of a printed document from casual photographs is a common everyday problem.
We propose a novel method for grid-based single-image document unwarping.
Our method performs geometric distortion correction via a fully convolutional deep neural network.
arXiv Detail & Related papers (2023-02-06T15:53:34Z)
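To make the grid-based formulation concrete, here is a minimal, hypothetical PyTorch snippet (not UVDoc's code) that upsamples a coarse predicted grid to image resolution and resamples the warped photo with torch.nn.functional.grid_sample; the grid size and image resolution are placeholders, and coordinates are assumed to be normalised to [-1, 1].

```python
# Hypothetical application of a predicted backward-mapping grid (not UVDoc's code).
import torch
import torch.nn.functional as F

warped = torch.rand(1, 3, 488, 712)            # warped document photo (B, C, H, W)
coarse_grid = torch.rand(1, 2, 31, 45) * 2 - 1 # stand-in coarse grid in [-1, 1]

# Upsample the coarse grid to full image resolution.
full_grid = F.interpolate(coarse_grid, size=(488, 712), mode="bilinear",
                          align_corners=True)

# grid_sample expects (B, H, W, 2) with (x, y) coordinates in [-1, 1].
full_grid = full_grid.permute(0, 2, 3, 1)
flat = F.grid_sample(warped, full_grid, mode="bilinear", align_corners=True)
print(flat.shape)                              # torch.Size([1, 3, 488, 712])
```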
- SISL: Self-Supervised Image Signature Learning for Splicing Detection and Localization [11.437760125881049]
We propose a self-supervised approach for training splicing detection/localization models from frequency transforms of images.
Our proposed model can yield similar or better performances on standard datasets without relying on labels or metadata.
arXiv Detail & Related papers (2022-03-15T12:26:29Z)
- Inverse Problems Leveraging Pre-trained Contrastive Representations [88.70821497369785]
We study a new family of inverse problems for recovering representations of corrupted data.
We propose a supervised inversion method that uses a contrastive objective to obtain excellent representations for highly corrupted images.
Our method outperforms end-to-end baselines even with a fraction of the labeled data in a wide range of forward operators.
arXiv Detail & Related papers (2021-10-14T15:06:30Z)
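The summary above only names the technique; the sketch below is one hypothetical way such a supervised inversion with a contrastive objective could look (not the paper's code). A small student network is trained so that its embedding of a corrupted image matches the frozen embedding of the clean counterpart under an InfoNCE-style loss; pretrained_encoder is a placeholder for a frozen contrastive model.

```python
# Hypothetical sketch of supervised representation inversion with a contrastive
# objective (not the paper's code). `pretrained_encoder` is a placeholder.
import torch
import torch.nn as nn
import torch.nn.functional as F

dim = 128
pretrained_encoder = nn.Sequential(            # stand-in for a frozen encoder
    nn.Flatten(), nn.Linear(3 * 64 * 64, dim))
for p in pretrained_encoder.parameters():
    p.requires_grad_(False)

student = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, dim))

clean = torch.rand(8, 3, 64, 64)
corrupted = clean + 0.5 * torch.randn_like(clean)   # heavy synthetic corruption

with torch.no_grad():
    targets = F.normalize(pretrained_encoder(clean), dim=1)
preds = F.normalize(student(corrupted), dim=1)

# InfoNCE-style loss: each corrupted image should match the clean embedding
# of its own counterpart rather than any other image in the batch.
logits = preds @ targets.t() / 0.07
loss = F.cross_entropy(logits, torch.arange(len(clean)))
loss.backward()
```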
- RectiNet-v2: A stacked network architecture for document image dewarping [16.249023269158734]
We propose an end-to-end CNN architecture that produces distortion-free document images from warped input documents.
We train this model on synthetically warped document images to compensate for the lack of natural data.
We evaluate our method on the DocUNet dataset, a benchmark in this domain, and obtain results comparable to state-of-the-art methods.
arXiv Detail & Related papers (2021-02-01T19:26:17Z)
- Multiple Document Datasets Pre-training Improves Text Line Detection With Deep Neural Networks [2.5352713493505785]
We introduce a fully convolutional network for the document layout analysis task.
Our method Doc-UFCN relies on a U-shaped model trained from scratch for detecting objects from historical documents.
We show that Doc-UFCN outperforms state-of-the-art methods on various datasets.
arXiv Detail & Related papers (2020-12-28T09:48:33Z)
- Wavelet-Based Dual-Branch Network for Image Demoireing [148.91145614517015]
We design a wavelet-based dual-branch network (WDNet) with a spatial attention mechanism for image demoireing.
Our network removes moire patterns in the wavelet domain to separate the frequencies of moire patterns from the image content.
Experiments demonstrate the effectiveness of our method, and we further show that WDNet generalizes to removing moire artifacts on non-screen images.
arXiv Detail & Related papers (2020-07-14T16:44:30Z)
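As a minimal, hypothetical illustration of working in the wavelet domain (not WDNet itself), the sketch below performs a single-level Haar decomposition that separates an image into low- and high-frequency subbands, which is where a wavelet-domain demoireing branch would operate.

```python
# Hypothetical single-level Haar wavelet decomposition in PyTorch (not WDNet's
# code), illustrating how moire frequencies can be separated from content.
import torch
import torch.nn.functional as F

def haar_dwt(x):
    """Split (B, C, H, W) into LL, LH, HL, HH subbands at half resolution."""
    b, c, h, w = x.shape
    ll = torch.tensor([[0.5, 0.5], [0.5, 0.5]])
    lh = torch.tensor([[0.5, 0.5], [-0.5, -0.5]])
    hl = torch.tensor([[0.5, -0.5], [0.5, -0.5]])
    hh = torch.tensor([[0.5, -0.5], [-0.5, 0.5]])
    kernels = torch.stack([ll, lh, hl, hh]).unsqueeze(1)     # (4, 1, 2, 2)
    x = x.reshape(b * c, 1, h, w)
    bands = F.conv2d(x, kernels, stride=2)                   # (B*C, 4, H/2, W/2)
    return bands.reshape(b, c, 4, h // 2, w // 2).unbind(2)

img = torch.rand(1, 3, 256, 256)               # screen photo with moire patterns
ll, lh, hl, hh = haar_dwt(img)
# A demoireing branch would suppress moire energy in the high-frequency
# subbands (lh, hl, hh) while keeping the content-dominated ll band intact.
print(ll.shape, hh.shape)                      # (1, 3, 128, 128) each
```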