High-Resolution Document Shadow Removal via A Large-Scale Real-World Dataset and A Frequency-Aware Shadow Erasing Net
- URL: http://arxiv.org/abs/2308.14221v4
- Date: Tue, 18 Jun 2024 06:13:00 GMT
- Authors: Zinuo Li, Xuhang Chen, Chi-Man Pun, Xiaodong Cun
- Abstract summary: Shadows often occur when we capture documents with casual equipment.
Unlike algorithms for natural shadow removal, algorithms for document shadow removal must preserve the details of fonts and figures in high-resolution input.
We handle high-resolution document shadow removal directly via a large-scale real-world dataset and a carefully designed frequency-aware network.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Shadows often occur when we capture documents with casual equipment, which degrades the visual quality and readability of the digital copies. Unlike algorithms for natural shadow removal, algorithms for document shadow removal must preserve the details of fonts and figures in high-resolution input. Previous works ignore this problem and remove the shadows via approximate attention and small datasets, which might not work in real-world situations. We handle high-resolution document shadow removal directly via a large-scale real-world dataset and a carefully designed frequency-aware network. As for the dataset, we acquire over 7k pairs of high-resolution (2462 x 3699) real-world document images with various samples under different lighting circumstances, which is 10 times larger than existing datasets. As for the design of the network, we decouple the high-resolution images in the frequency domain, where the low-frequency details and high-frequency boundaries can be effectively learned via the carefully designed network structure. Powered by our network and dataset, the proposed method clearly shows a better performance than previous methods in terms of visual quality and numerical results. The code, models, and dataset are available at: https://github.com/CXH-Research/DocShadow-SD7K
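The frequency-domain decoupling described in the abstract can be illustrated with a minimal, one-level Laplacian-pyramid-style split: a low-pass base carries illumination and coarse content, and the residual carries the high-frequency boundaries (fonts, figure edges). This is a sketch only, not the paper's actual filters or network; the function name and the box-filter choice are assumptions for illustration.

```python
import numpy as np

def decompose_frequency(img: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Split an image into a low-frequency base and a high-frequency residual.

    One-level Laplacian-pyramid-style decomposition: the base is a 2x
    box-downsampled then upsampled copy of the image; the residual is
    what remains after subtracting the base (edges and fine detail).
    Assumes img height and width are even.
    """
    h, w = img.shape[:2]
    # Box-filter downsample by 2 (crude low-pass)...
    low_small = img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    # ...then nearest-neighbour upsample back to full resolution.
    low = np.repeat(np.repeat(low_small, 2, axis=0), 2, axis=1)
    high = img - low  # high-frequency residual
    return low, high

# The split is lossless by construction: low + high == img.
img = np.arange(16.0).reshape(4, 4)
low, high = decompose_frequency(img)
assert np.allclose(low + high, img)
```

Because reconstruction is exact, a network can process the two bands with differently specialised branches and sum the results, which is the general idea behind frequency-aware designs like the one above.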
Related papers
- DocDeshadower: Frequency-aware Transformer for Document Shadow Removal [49.107557554811144]
DocDeshadower is a multi-frequency Transformer-based model built on Laplacian Pyramid.
We decompose the shadow image into different frequency bands using Laplacian Pyramid.
An Attention-Aggregation Network is designed to remove shadows in the low-frequency part of the image.
A Gated Multi-scale Fusion Transformer refines the entire image at a global scale with its large perceptive field.
arXiv Detail & Related papers (2023-07-28T05:35:37Z) - SIDAR: Synthetic Image Dataset for Alignment & Restoration [2.9649783577150837]
There is a lack of datasets that provide enough data to train and evaluate end-to-end deep learning models.
Our proposed data augmentation helps to overcome the issue of data scarcity by using 3D rendering.
The resulting dataset can serve as a training and evaluation set for a multitude of tasks involving image alignment and artifact removal.
arXiv Detail & Related papers (2023-05-19T23:32:06Z) - ShaDocNet: Learning Spatial-Aware Tokens in Transformer for Document Shadow Removal [53.01990632289937]
We propose a Transformer-based model for document shadow removal.
It uses shadow context encoding and decoding in both shadow and shadow-free regions.
arXiv Detail & Related papers (2022-11-30T01:46:29Z) - RTMV: A Ray-Traced Multi-View Synthetic Dataset for Novel View Synthesis [104.53930611219654]
We present a large-scale synthetic dataset for novel view synthesis consisting of 300k images rendered from nearly 2000 complex scenes.
The dataset is orders of magnitude larger than existing synthetic datasets for novel view synthesis.
Using 4 distinct sources of high-quality 3D meshes, the scenes of our dataset exhibit challenging variations in camera views, lighting, shape, materials, and textures.
arXiv Detail & Related papers (2022-05-14T13:15:32Z) - Learning Neural Light Fields with Ray-Space Embedding Networks [51.88457861982689]
We propose a novel neural light field representation that is compact and directly predicts integrated radiance along rays.
Our method achieves state-of-the-art quality on dense forward-facing datasets such as the Stanford Light Field dataset.
arXiv Detail & Related papers (2021-12-02T18:59:51Z) - R2D: Learning Shadow Removal to Enhance Fine-Context Shadow Detection [64.10636296274168]
Current shadow detection methods perform poorly when detecting shadow regions that are small, unclear or have blurry edges.
We propose a new method called Restore to Detect (R2D), where a deep neural network is trained for restoration (shadow removal).
We show that our proposed method R2D improves shadow detection performance and detects fine context better than other recent methods.
arXiv Detail & Related papers (2021-09-20T15:09:22Z) - Light-weight Document Image Cleanup using Perceptual Loss [7.106986689736828]
We propose a light-weight encoder based convolutional neural network architecture for removing the noisy elements from document images.
In terms of the number of parameters and product-sum operations, our models are, respectively, 65-1030 and 3-27 times smaller than existing document enhancement models.
arXiv Detail & Related papers (2021-05-19T11:54:28Z) - Learning to Dehaze from Realistic Scene with A Fast Physics-based Dehazing Network [26.92874873109654]
We present a new, large, and diverse dehazing dataset containing real outdoor scenes from High-Definition (HD) 3D movies.
We also propose a light and reliable dehazing network inspired by the physics model.
Our approach outperforms other methods by a large margin and becomes the new state-of-the-art method.
arXiv Detail & Related papers (2020-04-18T08:25:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences.