DE-GAN: A Conditional Generative Adversarial Network for Document
Enhancement
- URL: http://arxiv.org/abs/2010.08764v1
- Date: Sat, 17 Oct 2020 10:54:49 GMT
- Title: DE-GAN: A Conditional Generative Adversarial Network for Document
Enhancement
- Authors: Mohamed Ali Souibgui and Yousri Kessentini
- Abstract summary: We propose an end-to-end framework named Document Enhancement Generative Adversarial Networks (DE-GAN) to restore severely degraded document images.
We demonstrate that, in different tasks (document clean up, binarization, deblurring and watermark removal), DE-GAN can produce an enhanced version of the degraded document with a high quality.
- Score: 4.073826298938431
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Documents often exhibit various forms of degradation, which make them hard to
read and substantially deteriorate the performance of an OCR system. In this
paper, we propose an effective end-to-end framework named Document Enhancement
Generative Adversarial Networks (DE-GAN) that uses the conditional GANs (cGANs)
to restore severely degraded document images. To the best of our knowledge,
this practice has not been studied within the context of generative adversarial
deep networks. We demonstrate that, in different tasks (document clean up,
binarization, deblurring and watermark removal), DE-GAN can produce an enhanced
version of the degraded document with a high quality. In addition, our approach
provides consistent improvements compared to state-of-the-art methods over the
widely used DIBCO 2013, DIBCO 2017 and H-DIBCO 2018 datasets, proving its
ability to restore a degraded document image to its ideal condition. The
obtained results on a wide variety of degradation reveal the flexibility of the
proposed model to be exploited in other document enhancement problems.
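For orientation, the conditional GAN setup mentioned in the abstract is usually written as the min-max objective below. This is the generic cGAN formulation for image-to-image tasks, not necessarily the exact loss used by DE-GAN, which may include additional terms. The symbols are assumptions introduced here: $x$ is the degraded image, $y$ the clean ground truth, $G$ the generator, and $D$ the discriminator.

```latex
\min_G \max_D \;
\mathbb{E}_{x,y}\big[\log D(x, y)\big]
+ \mathbb{E}_{x}\big[\log\big(1 - D(x, G(x))\big)\big]
```

In this formulation the discriminator sees the degraded input alongside either the ground truth or the generator's output, so the generator is pushed toward outputs that are plausible for that specific input.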
Related papers
- LayeredDoc: Domain Adaptive Document Restoration with a Layer Separation Approach [9.643486775455841]
This paper introduces a text-graphic layer separation approach that enhances domain adaptability in document image restoration systems.
We propose LayeredDoc, which utilizes two layers of information: the first targets coarse-grained graphic components, while the second refines machine-printed textual content.
We evaluate our approach both qualitatively and quantitatively using a new real-world dataset, LayeredDocDB, developed for this study.
arXiv Detail & Related papers (2024-06-12T19:41:01Z)
- DocDiff: Document Enhancement via Residual Diffusion Models [7.972081359533047]
We propose DocDiff, a diffusion-based framework specifically designed for document enhancement problems.
DocDiff consists of two modules: the Coarse Predictor (CP) and the High-Frequency Residual Refinement (HRR) module.
Our proposed HRR module in pre-trained DocDiff is plug-and-play and ready-to-use, with only 4.17M parameters.
arXiv Detail & Related papers (2023-05-06T01:41:10Z)
- DocMAE: Document Image Rectification via Self-supervised Representation Learning [144.44748607192147]
We present DocMAE, a novel self-supervised framework for document image rectification.
We first mask random patches of the background-excluded document images and then reconstruct the missing pixels.
With such a self-supervised learning approach, the network is encouraged to learn the intrinsic structure of deformed documents.
arXiv Detail & Related papers (2023-04-20T14:27:15Z)
- Deep Unrestricted Document Image Rectification [110.61517455253308]
We present DocTr++, a novel unified framework for document image rectification.
We upgrade the original architecture by adopting a hierarchical encoder-decoder structure for multi-scale representation extraction and parsing.
We contribute a real-world test set and metrics applicable for evaluating the rectification quality.
arXiv Detail & Related papers (2023-04-18T08:00:54Z)
- EraseNet: A Recurrent Residual Network for Supervised Document Cleaning [0.0]
This paper introduces a supervised approach for cleaning dirty documents using a new fully convolutional auto-encoder architecture.
The experiments in this paper have shown promising results as the model is able to learn a variety of ordinary as well as unusual noises and rectify them efficiently.
arXiv Detail & Related papers (2022-10-03T04:23:25Z)
- Document Image Binarization in JPEG Compressed Domain using Dual Discriminator Generative Adversarial Networks [0.0]
The proposed model has been thoroughly tested with different versions of DIBCO dataset having challenges like holes, erased or smudged ink, dust, and misplaced fibres.
The model proved to be highly robust, efficient both in terms of time and space complexities, and also resulted in state-of-the-art performance in JPEG compressed domain.
arXiv Detail & Related papers (2022-09-13T12:07:32Z)
- Enhance to Read Better: An Improved Generative Adversarial Network for Handwritten Document Image Enhancement [1.7491858164568674]
We propose an end to end architecture based on Generative Adversarial Networks (GANs) to recover degraded documents into a clean and readable form.
To the best of our knowledge, this is the first work to use the text information while binarizing handwritten documents.
We outperform the state of the art in H-DIBCO 2018 challenge, after fine tuning our pre-trained model with synthetically degraded Latin handwritten images.
arXiv Detail & Related papers (2021-05-26T17:44:45Z)
- Focused Attention Improves Document-Grounded Generation [111.42360617630669]
Document grounded generation is the task of using the information provided in a document to improve text generation.
This work focuses on two different document grounded generation tasks: Wikipedia Update Generation task and Dialogue response generation.
arXiv Detail & Related papers (2021-04-26T16:56:29Z)
- Progressively Guided Alternate Refinement Network for RGB-D Salient Object Detection [63.18846475183332]
We aim to develop an efficient and compact deep network for RGB-D salient object detection.
We propose a progressively guided alternate refinement network to refine it.
Our model outperforms existing state-of-the-art approaches by a large margin.
arXiv Detail & Related papers (2020-08-17T02:55:06Z)
- Self-supervised Deep Reconstruction of Mixed Strip-shredded Text Documents [63.41717168981103]
This work extends our previous deep learning method for single-page reconstruction to a more realistic/complex scenario.
In our approach, the compatibility evaluation is modeled as a two-class (valid or invalid) pattern recognition problem.
The proposed method outperforms the competing ones on complex scenarios, achieving accuracy superior to 90%.
arXiv Detail & Related papers (2020-07-01T21:48:05Z)
- Improved Consistency Regularization for GANs [102.17007700413326]
We propose several modifications to the consistency regularization procedure designed to improve its performance.
For unconditional image synthesis on CIFAR-10 and CelebA, our modifications yield the best known FID scores on various GAN architectures.
On ImageNet-2012, we apply our technique to the original BigGAN model and improve the FID from 6.66 to 5.38, which is the best score at that model size.
arXiv Detail & Related papers (2020-02-11T22:53:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.