High-Resolution Document Shadow Removal via A Large-Scale Real-World Dataset and A Frequency-Aware Shadow Erasing Net
- URL: http://arxiv.org/abs/2308.14221v4
- Date: Tue, 18 Jun 2024 06:13:00 GMT
- Title: High-Resolution Document Shadow Removal via A Large-Scale Real-World Dataset and A Frequency-Aware Shadow Erasing Net
- Authors: Zinuo Li, Xuhang Chen, Chi-Man Pun, Xiaodong Cun
- Abstract summary: Shadows often occur when we capture documents with casual equipment.
Unlike algorithms for natural shadow removal, algorithms for document shadow removal must preserve the details of fonts and figures in high-resolution input.
We handle high-resolution document shadow removal directly via a large-scale real-world dataset and a carefully designed frequency-aware network.
- Score: 42.32958776152137
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Shadows often occur when we capture documents with casual equipment, which degrades the visual quality and readability of the digital copies. Unlike algorithms for natural shadow removal, algorithms for document shadow removal must preserve the details of fonts and figures in high-resolution input. Previous works ignore this problem and remove shadows via approximate attention and small datasets, which may not work in real-world situations. We handle high-resolution document shadow removal directly via a large-scale real-world dataset and a carefully designed frequency-aware network. As for the dataset, we acquire over 7k pairs of high-resolution (2462 x 3699) real-world document images with various samples under different lighting conditions, which is 10 times larger than existing datasets. As for the design of the network, we decouple the high-resolution images in the frequency domain, where the low-frequency details and high-frequency boundaries can be effectively learned via the carefully designed network structure. Powered by our network and dataset, the proposed method clearly outperforms previous methods in terms of visual quality and numerical results. The code, models, and dataset are available at: https://github.com/CXH-Research/DocShadow-SD7K
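The core idea of decoupling an image in the frequency domain, where smooth shading (including shadows) lives in the low-frequency band and font strokes and figure boundaries live in the high-frequency band, can be illustrated with a minimal Fourier low-pass/high-pass split. This is an illustrative sketch only, not the paper's actual network; the `cutoff` parameter and the hard circular mask are assumptions for demonstration.

```python
import numpy as np

def frequency_split(img, cutoff=0.1):
    """Split a 2-D grayscale image into low- and high-frequency parts
    via a hard circular low-pass mask in the Fourier domain.
    cutoff is the mask radius as a fraction of the normalized frequency range."""
    f = np.fft.fftshift(np.fft.fft2(img))       # centered spectrum
    h, w = img.shape
    yy, xx = np.mgrid[:h, :w]
    # normalized distance of each frequency bin from the spectrum center
    radius = np.sqrt(((yy - h / 2) / h) ** 2 + ((xx - w / 2) / w) ** 2)
    low_mask = radius <= cutoff
    low = np.real(np.fft.ifft2(np.fft.ifftshift(f * low_mask)))
    high = img - low   # residual: edges, text strokes, sharp boundaries
    return low, high

img = np.random.rand(64, 64)
low, high = frequency_split(img)
# the two bands sum exactly back to the input
assert np.allclose(low + high, img)
```

In a shadow-removal setting, a network could then correct the illumination mainly in the low-frequency component while leaving the high-frequency component, which carries the document content, largely intact.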
Related papers
- Soft-Hard Attention U-Net Model and Benchmark Dataset for Multiscale Image Shadow Removal [2.999888908665659]
This study proposes a novel deep learning architecture, named Soft-Hard Attention U-Net (SHAU), focusing on multiscale shadow removal.
It provides a novel synthetic dataset, named Multiscale Shadow Removal dataset (MSRD), containing complex shadow patterns of multiple scales.
The results demonstrate the effectiveness of SHAU over the relevant state-of-the-art shadow removal methods across various benchmark datasets.
arXiv Detail & Related papers (2024-08-07T12:42:06Z)
- DocDeshadower: Frequency-Aware Transformer for Document Shadow Removal [36.182923899021496]
Current shadow removal techniques face limitations in handling varying shadow intensities and preserving document details.
We propose DocDeshadower, a novel multi-frequency Transformer-based model built upon the Laplacian Pyramid.
Experiments demonstrate DocDeshadower's superior performance compared to state-of-the-art methods.
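The Laplacian Pyramid that DocDeshadower builds on decomposes an image into band-pass levels plus a coarse low-frequency residual, so each frequency band can be processed separately. The following is a minimal numpy-only sketch of that decomposition (using a 5-tap binomial blur and nearest-neighbor up/downsampling as simplifying assumptions), not the model itself.

```python
import numpy as np

def _blur(img):
    # separable 1-4-6-4-1 binomial filter with edge padding
    k = np.array([1, 4, 6, 4, 1], dtype=float) / 16
    padded = np.pad(img, 2, mode='edge')
    tmp = np.apply_along_axis(lambda r: np.convolve(r, k, mode='valid'), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode='valid'), 0, tmp)

def _upsample(img, shape):
    # nearest-neighbor 2x upsampling, cropped to the target shape
    up = np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)
    return up[:shape[0], :shape[1]]

def build_laplacian_pyramid(img, levels=3):
    """Return `levels` band-pass images plus one coarse residual."""
    pyramid, current = [], img
    for _ in range(levels):
        down = _blur(current)[::2, ::2]              # blur + subsample
        pyramid.append(current - _upsample(down, current.shape))  # band-pass detail
        current = down
    pyramid.append(current)                          # coarsest low-frequency level
    return pyramid

def reconstruct(pyramid):
    current = pyramid[-1]
    for lap in reversed(pyramid[:-1]):
        current = lap + _upsample(current, lap.shape)
    return current

img = np.random.rand(32, 32)
pyr = build_laplacian_pyramid(img)
# the pyramid is a lossless decomposition
assert np.allclose(reconstruct(pyr), img)
```

Because the decomposition is lossless, a model can restore each band (e.g., attenuate shadow intensity at coarse levels, refine edges at fine levels) and recombine them without degrading document detail.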
arXiv Detail & Related papers (2023-07-28T05:35:37Z)
- SIDAR: Synthetic Image Dataset for Alignment & Restoration [2.9649783577150837]
There is a lack of datasets that provide enough data to train and evaluate end-to-end deep learning models.
Our proposed data augmentation helps to overcome the issue of data scarcity by using 3D rendering.
The resulting dataset can serve as a training and evaluation set for a multitude of tasks involving image alignment and artifact removal.
arXiv Detail & Related papers (2023-05-19T23:32:06Z)
- ShaDocNet: Learning Spatial-Aware Tokens in Transformer for Document Shadow Removal [53.01990632289937]
We propose a Transformer-based model for document shadow removal.
It uses shadow context encoding and decoding in both shadow and shadow-free regions.
arXiv Detail & Related papers (2022-11-30T01:46:29Z)
- RTMV: A Ray-Traced Multi-View Synthetic Dataset for Novel View Synthesis [104.53930611219654]
We present a large-scale synthetic dataset for novel view synthesis consisting of 300k images rendered from nearly 2000 complex scenes.
The dataset is orders of magnitude larger than existing synthetic datasets for novel view synthesis.
Using 4 distinct sources of high-quality 3D meshes, the scenes of our dataset exhibit challenging variations in camera views, lighting, shape, materials, and textures.
arXiv Detail & Related papers (2022-05-14T13:15:32Z)
- High-resolution Iterative Feedback Network for Camouflaged Object Detection [128.893782016078]
Spotting camouflaged objects that are visually assimilated into the background is tricky for object detection algorithms.
We aim to extract the high-resolution texture details to avoid the detail degradation that causes blurred vision in edges and boundaries.
We introduce a novel HitNet to refine the low-resolution representations by high-resolution features in an iterative feedback manner.
arXiv Detail & Related papers (2022-03-22T11:20:21Z)
- Learning Neural Light Fields with Ray-Space Embedding Networks [51.88457861982689]
We propose a novel neural light field representation that is compact and directly predicts integrated radiance along rays.
Our method achieves state-of-the-art quality on dense forward-facing datasets such as the Stanford Light Field dataset.
arXiv Detail & Related papers (2021-12-02T18:59:51Z)
- R2D: Learning Shadow Removal to Enhance Fine-Context Shadow Detection [64.10636296274168]
Current shadow detection methods perform poorly when detecting shadow regions that are small, unclear or have blurry edges.
We propose a new method called Restore to Detect (R2D), in which a deep neural network is trained for restoration (shadow removal).
We show that our proposed method R2D improves the shadow detection performance while being able to detect fine context better compared to the other recent methods.
arXiv Detail & Related papers (2021-09-20T15:09:22Z)
- Learning to Dehaze from Realistic Scene with A Fast Physics-based Dehazing Network [26.92874873109654]
We present a new, large, and diverse dehazing dataset containing real outdoor scenes from High-Definition (HD) 3D movies.
We also propose a light and reliable dehazing network inspired by the physics model.
Our approach outperforms other methods by a large margin and becomes the new state-of-the-art method.
arXiv Detail & Related papers (2020-04-18T08:25:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.