A Survey on Deep learning based Document Image Enhancement
- URL: http://arxiv.org/abs/2112.02719v1
- Date: Mon, 6 Dec 2021 00:24:50 GMT
- Title: A Survey on Deep learning based Document Image Enhancement
- Authors: Zahra Anvari, Vassilis Athitsos
- Abstract summary: Digitized documents such as scientific articles, tax forms, invoices, contract papers, and historic texts are widely used nowadays.
These images can be degraded or damaged for various reasons, including poor lighting conditions during capture, shadows introduced while scanning, distortions such as noise and blur, aging, ink stains, bleed-through, watermarks, and stamps.
With recent advances in deep learning, many methods have been proposed to enhance the quality of these document images.
- Score: 5.279475826661643
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Digitized documents such as scientific articles, tax forms, invoices,
contract papers, and historic texts are widely used nowadays. These images
can be degraded or damaged for various reasons, including poor lighting
conditions during capture, shadows introduced while scanning, distortions
such as noise and blur, aging, ink stains, bleed-through, watermarks, and stamps.
Document image enhancement and restoration play a crucial role in many
automated document analysis and recognition tasks, such as content extraction
using optical character recognition (OCR). With recent advances in deep
learning, many methods have been proposed to enhance the quality of these document
images. In this paper, we review deep learning-based methods, datasets, and
metrics for different document image enhancement problems. We provide a
comprehensive overview of deep learning-based methods for six different
document image enhancement tasks, including binarization, deblurring, denoising,
defading, watermark removal, and shadow removal. We summarize the main
state-of-the-art works for each task and discuss their features, challenges,
and limitations. We introduce multiple document image enhancement tasks that
have received little to no attention, including over- and under-exposure
correction and bleed-through removal, and identify several other promising
research directions and opportunities for future research.
Related papers
- Unveiling Deep Shadows: A Survey on Image and Video Shadow Detection, Removal, and Generation in the Era of Deep Learning [81.15890262168449]
Shadows are formed when light encounters obstacles, leading to areas of diminished illumination.
In computer vision, shadow detection, removal, and generation are crucial for enhancing scene understanding, refining image quality, ensuring visual consistency in video editing, and improving virtual environments.
This paper presents a comprehensive survey of shadow detection, removal, and generation in images and videos within the deep learning landscape over the past decade, covering tasks, deep models, datasets, and evaluation metrics.
arXiv Detail & Related papers (2024-09-03T17:59:05Z)
- Task-driven single-image super-resolution reconstruction of document scans [2.8391355909797644]
We investigate the possibility of employing super-resolution as a preprocessing step to improve optical character recognition from document scans.
To achieve that, we propose to train deep networks for single-image super-resolution in a task-driven way to make them better adapted for the purpose of text detection.
arXiv Detail & Related papers (2024-07-12T05:18:26Z)
- DocMAE: Document Image Rectification via Self-supervised Representation Learning [144.44748607192147]
We present DocMAE, a novel self-supervised framework for document image rectification.
We first mask random patches of the background-excluded document images and then reconstruct the missing pixels.
With such a self-supervised learning approach, the network is encouraged to learn the intrinsic structure of deformed documents.
arXiv Detail & Related papers (2023-04-20T14:27:15Z)
- Deep Unrestricted Document Image Rectification [110.61517455253308]
We present DocTr++, a novel unified framework for document image rectification.
We upgrade the original architecture by adopting a hierarchical encoder-decoder structure for multi-scale representation extraction and parsing.
We contribute a real-world test set and metrics applicable for evaluating the rectification quality.
arXiv Detail & Related papers (2023-04-18T08:00:54Z)
- Deep Image Matting: A Comprehensive Survey [85.77905619102802]
This paper presents a review of recent advancements in image matting in the era of deep learning.
We focus on two fundamental sub-tasks: auxiliary input-based image matting and automatic image matting.
We discuss relevant applications of image matting and highlight existing challenges and potential opportunities for future research.
arXiv Detail & Related papers (2023-04-10T15:48:55Z)
- Hidden Knowledge: Mathematical Methods for the Extraction of the Fingerprint of Medieval Paper from Digital Images [1.2891210250935146]
Medieval paper is made with a mould which leaves an indelible imprint on the sheet of paper.
This imprint includes chain lines, laid lines and watermarks which are often visible on the sheet.
Extracting these features allows the identification of paper stock and gives information about chronology, localisation and movement of books and people.
arXiv Detail & Related papers (2023-03-07T11:01:19Z)
- EraseNet: A Recurrent Residual Network for Supervised Document Cleaning [0.0]
This paper introduces a supervised approach for cleaning dirty documents using a new fully convolutional auto-encoder architecture.
The experiments in this paper have shown promising results as the model is able to learn a variety of ordinary as well as unusual noises and rectify them efficiently.
arXiv Detail & Related papers (2022-10-03T04:23:25Z)
- Augraphy: A Data Augmentation Library for Document Images [59.457999432618614]
Augraphy is a Python library for constructing data augmentation pipelines.
It provides strategies to produce augmented versions of clean document images that appear to have been altered by standard office operations.
arXiv Detail & Related papers (2022-08-30T22:36:19Z)
- DocScanner: Robust Document Image Rectification with Progressive Learning [162.03694280524084]
This work presents DocScanner, a new deep network architecture for document image rectification.
DocScanner maintains a single estimate of the rectified image, which is progressively corrected with a recurrent architecture.
The iterative refinements make DocScanner converge to robust and superior performance, and the lightweight recurrent architecture ensures running efficiency.
arXiv Detail & Related papers (2021-10-28T09:15:02Z)
- Enhance to Read Better: An Improved Generative Adversarial Network for Handwritten Document Image Enhancement [1.7491858164568674]
We propose an end-to-end architecture based on Generative Adversarial Networks (GANs) to recover degraded documents into a clean and readable form.
To the best of our knowledge, this is the first work to use the text information while binarizing handwritten documents.
We outperform the state of the art in the H-DIBCO 2018 challenge, after fine-tuning our pre-trained model with synthetically degraded Latin handwritten images.
arXiv Detail & Related papers (2021-05-26T17:44:45Z)