Three-stage binarization of color document images based on discrete wavelet transform and generative adversarial networks
- URL: http://arxiv.org/abs/2211.16098v7
- Date: Fri, 14 Jun 2024 09:14:05 GMT
- Title: Three-stage binarization of color document images based on discrete wavelet transform and generative adversarial networks
- Authors: Rui-Yang Ju, Yu-Shian Lin, Yanlin Jin, Chih-Chia Chen, Chun-Tse Chien, Jen-Shiun Chiang,
- Abstract summary: This work proposes a three-stage method to generate binarization image results for degraded colour document images using generative adversarial networks (GANs)
The experimental results show that the Avg-Score of the proposed method is 77.64, 77.95, 79.05, 76.38, 75.34, and 77.00 on the (H)-DIBCO 2011, 2013, 2014, 2016, 2017, and 2018 datasets, which achieves the state-of-theart level.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The efficient segmentation of text information from the background in degraded color document images is an important challenge in the preservation of ancient manuscripts. The imperfect preservation of ancient manuscripts has led to various types of degradation over time, such as staining, yellowing, and ink seepage, badly affecting document image binarization results. This work proposes a three-stage method to generate binarization image results for degraded colour document images using generative adversarial networks (GANs). Stage-1 involves applying discrete wavelet transform and retaining the low-low subband images for document image enhancement. In Stage-2, the original input image is split into red, green, and blue (RGB) three single-channel images and one grayscale image, and each image is trained with independent GANs to extract color foreground information. In Stage-3, the output images of Stage-2 and the resized input images are used to train independent GANs to generate document binarization results, enabling the combination of global and local features. The experimental results show that the Avg-Score of the proposed method is 77.64, 77.95, 79.05, 76.38, 75.34, and 77.00 on the (H)-DIBCO 2011, 2013, 2014, 2016, 2017, and 2018 datasets, which achieves the state-of-the-art level. The implementation code for this work is available at https://github.com/abcpp12383/ThreeStageBinarization.
Related papers
- You Only Need One Color Space: An Efficient Network for Low-light Image Enhancement [50.37253008333166]
Low-Light Image Enhancement (LLIE) task tends to restore the details and visual information from corrupted low-light images.
We propose a novel trainable color space, named Horizontal/Vertical-Intensity (HVI)
It not only decouples brightness and color from RGB channels to mitigate the instability during enhancement but also adapts to low-light images in different illumination ranges due to the trainable parameters.
arXiv Detail & Related papers (2024-02-08T16:47:43Z) - SPDGAN: A Generative Adversarial Network based on SPD Manifold Learning
for Automatic Image Colorization [1.220743263007369]
We propose a fully automatic colorization approach based on Symmetric Positive Definite (SPD) Manifold Learning with a generative adversarial network (SPDGAN)
Our model establishes an adversarial game between two discriminators and a generator. Its goal is to generate fake colorized images without losing color information across layers through residual connections.
arXiv Detail & Related papers (2023-12-21T00:52:01Z) - A Layer-Wise Tokens-to-Token Transformer Network for Improved Historical
Document Image Enhancement [13.27528507177775]
We propose textbfT2T-BinFormer which is a novel document binarization encoder-decoder architecture based on a Tokens-to-token vision transformer.
Experiments on various DIBCO and H-DIBCO benchmarks demonstrate that the proposed model outperforms the existing CNN and ViT-based state-of-the-art methods.
arXiv Detail & Related papers (2023-12-06T23:01:11Z) - CCDWT-GAN: Generative Adversarial Networks Based on Color Channel Using
Discrete Wavelet Transform for Document Image Binarization [3.0175628677371935]
This paper introduces a novelty method employing generative adversarial networks based on color channel.
The proposed method involves three stages: image preprocessing, image enhancement, and image binarization.
The experimental results demonstrate that CCDWT-GAN achieves a top two performance on multiple benchmark datasets.
arXiv Detail & Related papers (2023-05-27T08:55:56Z) - BiSTNet: Semantic Image Prior Guided Bidirectional Temporal Feature
Fusion for Deep Exemplar-based Video Colorization [70.14893481468525]
We present an effective BiSTNet to explore colors of reference exemplars and utilize them to help video colorization.
We first establish the semantic correspondence between each frame and the reference exemplars in deep feature space to explore color information from reference exemplars.
We develop a mixed expert block to extract semantic information for modeling the object boundaries of frames so that the semantic image prior can better guide the colorization process.
arXiv Detail & Related papers (2022-12-05T13:47:15Z) - Document Image Binarization in JPEG Compressed Domain using Dual
Discriminator Generative Adversarial Networks [0.0]
The proposed model has been thoroughly tested with different versions of DIBCO dataset having challenges like holes, erased or smudged ink, dust, and misplaced fibres.
The model proved to be highly robust, efficient both in terms of time and space complexities, and also resulted in state-of-the-art performance in JPEG compressed domain.
arXiv Detail & Related papers (2022-09-13T12:07:32Z) - Detecting Recolored Image by Spatial Correlation [60.08643417333974]
Image recoloring is an emerging editing technique that can manipulate the color values of an image to give it a new style.
In this paper, we explore a solution from the perspective of the spatial correlation, which exhibits the generic detection capability for both conventional and deep learning-based recoloring.
Our method achieves the state-of-the-art detection accuracy on multiple benchmark datasets and exhibits well generalization for unknown types of recoloring methods.
arXiv Detail & Related papers (2022-04-23T01:54:06Z) - Palette: Image-to-Image Diffusion Models [50.268441533631176]
We introduce Palette, a simple and general framework for image-to-image translation using conditional diffusion models.
On four challenging image-to-image translation tasks, Palette outperforms strong GAN and regression baselines.
We report several sample quality scores including FID, Inception Score, Classification Accuracy of a pre-trained ResNet-50, and Perceptual Distance against reference images.
arXiv Detail & Related papers (2021-11-10T17:49:29Z) - Two-stage generative adversarial networks for document image
binarization with color noise and background removal [7.639067237772286]
We propose a two-stage color document image enhancement and binarization method using generative adversarial neural networks.
In the first stage, four color-independent adversarial networks are trained to extract color foreground information from an input image.
In the second stage, two independent adversarial networks with global and local features are trained for image binarization of documents of variable size.
arXiv Detail & Related papers (2020-10-20T07:51:50Z) - Instance-aware Image Colorization [51.12040118366072]
In this paper, we propose a method for achieving instance-aware colorization.
Our network architecture leverages an off-the-shelf object detector to obtain cropped object images.
We use a similar network to extract the full-image features and apply a fusion module to predict the final colors.
arXiv Detail & Related papers (2020-05-21T17:59:23Z) - Learning to Structure an Image with Few Colors [59.34619548026885]
We propose a color quantization network, ColorCNN, which learns to structure the images from the classification loss in an end-to-end manner.
With only a 1-bit color space (i.e., two colors), the proposed network achieves 82.1% top-1 accuracy on the CIFAR10 dataset.
For applications, when encoded with PNG, the proposed color quantization shows superiority over other image compression methods in the extremely low bit-rate regime.
arXiv Detail & Related papers (2020-03-17T17:56:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.