ShaDocFormer: A Shadow-Attentive Threshold Detector With Cascaded Fusion Refiner for Document Shadow Removal
- URL: http://arxiv.org/abs/2309.06670v4
- Date: Thu, 21 Mar 2024 08:30:54 GMT
- Title: ShaDocFormer: A Shadow-Attentive Threshold Detector With Cascaded Fusion Refiner for Document Shadow Removal
- Authors: Weiwen Chen, Yingtie Lei, Shenghong Luo, Ziyang Zhou, Mingxian Li, Chi-Man Pun
- Abstract summary: We propose a Transformer-based architecture that integrates traditional methodologies and deep learning techniques to tackle the problem of document shadow removal.
The ShaDocFormer architecture comprises two components: the Shadow-attentive Threshold Detector (STD) and the Cascaded Fusion Refiner (CFR).
- Score: 26.15238399758745
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Document shadow is a common issue that arises when capturing documents using mobile devices, which significantly impacts readability. Current methods encounter various challenges, including inaccurate detection of shadow masks and estimation of illumination. In this paper, we propose ShaDocFormer, a Transformer-based architecture that integrates traditional methodologies and deep learning techniques to tackle the problem of document shadow removal. The ShaDocFormer architecture comprises two components: the Shadow-attentive Threshold Detector (STD) and the Cascaded Fusion Refiner (CFR). The STD module employs a traditional thresholding technique and leverages the attention mechanism of the Transformer to gather global information, thereby enabling precise detection of shadow masks. The cascaded and aggregative structure of the CFR module facilitates a coarse-to-fine restoration process for the entire image. As a result, ShaDocFormer excels in accurately detecting and capturing variations in both shadow and illumination, thereby enabling effective removal of shadows. Extensive experiments demonstrate that ShaDocFormer outperforms current state-of-the-art methods in both qualitative and quantitative measurements.
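The abstract says the STD module starts from a traditional thresholding technique but does not name which one. As a minimal sketch only, assuming Otsu's method as a stand-in for that step (an assumption, not a detail from the paper), a coarse shadow mask could be computed as below. The function names `otsu_threshold` and `coarse_shadow_mask` are hypothetical, and the Transformer attention stage and the CFR module are not modeled here.

```python
from typing import List


def otsu_threshold(pixels: List[int]) -> int:
    """Return the gray level t (0-255) that maximizes between-class variance.

    Pixels with value <= t form the dark class. Otsu's method is an
    assumed stand-in; the abstract does not name the thresholding technique.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    n_dark = 0    # number of pixels with value <= t
    sum_dark = 0  # sum of those pixel values
    for t in range(256):
        n_dark += hist[t]
        sum_dark += t * hist[t]
        n_bright = total - n_dark
        if n_dark == 0 or n_bright == 0:
            continue  # one class is empty, variance is undefined
        mean_dark = sum_dark / n_dark
        mean_bright = (sum_all - sum_dark) / n_bright
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = n_dark * n_bright * (mean_dark - mean_bright) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def coarse_shadow_mask(image: List[List[int]]) -> List[List[int]]:
    """Mark pixels at or below the Otsu threshold as candidate shadow (1)."""
    flat = [p for row in image for p in row]
    t = otsu_threshold(flat)
    return [[1 if p <= t else 0 for p in row] for row in image]


if __name__ == "__main__":
    # Tiny hypothetical grayscale patch: bright paper with a darker corner.
    patch = [[200, 210, 90],
             [205, 80, 85]]
    print(coarse_shadow_mask(patch))
```

A global threshold like this cannot tell dark ink from shadow, so it only provides a coarse starting point for a mask.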
Related papers
- SwinShadow: Shifted Window for Ambiguous Adjacent Shadow Detection [90.4751446041017]
We present SwinShadow, a transformer-based architecture that fully utilizes the powerful shifted window mechanism for detecting adjacent shadows.
The whole process can be divided into three parts: encoder, decoder, and feature integration.
Experiments on three shadow detection benchmark datasets, SBU, UCF, and ISTD, demonstrate that our network achieves good performance in terms of balance error rate (BER).
arXiv Detail & Related papers (2024-08-07T03:16:33Z)
- ShadowMaskFormer: Mask Augmented Patch Embeddings for Shadow Removal [13.983288991595614]
We propose a transformer-based framework with a novel patch embedding that is tailored for shadow removal, dubbed ShadowMaskFormer.
Specifically, we present a simple and effective mask-augmented patch embedding to integrate shadow information and promote the model's emphasis on acquiring knowledge for shadow regions.
arXiv Detail & Related papers (2024-04-29T05:17:33Z)
- Towards General Visual-Linguistic Face Forgery Detection [95.73987327101143]
Deepfakes are realistic face manipulations that can pose serious threats to security, privacy, and trust.
Existing methods mostly treat this task as binary classification, which uses digital labels or mask signals to train the detection model.
We propose a novel paradigm named Visual-Linguistic Face Forgery Detection (VLFFD), which uses fine-grained sentence-level prompts as the annotation.
arXiv Detail & Related papers (2023-07-31T10:22:33Z)
- DocDeshadower: Frequency-Aware Transformer for Document Shadow Removal [36.182923899021496]
Current shadow removal techniques face limitations in handling varying shadow intensities and preserving document details.
We propose DocDeshadower, a novel multi-frequency Transformer-based model built upon the Laplacian Pyramid.
Experiments demonstrate DocDeshadower's superior performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-07-28T05:35:37Z)
- Structure-Informed Shadow Removal Networks [67.57092870994029]
Existing deep learning-based shadow removal methods still produce images with shadow remnants.
We propose a novel structure-informed shadow removal network (StructNet) to leverage the image-structure information to address the shadow remnant problem.
Our method outperforms existing shadow removal methods, and our StructNet can be integrated with existing methods to improve them further.
arXiv Detail & Related papers (2023-01-09T06:31:52Z)
- ShaDocNet: Learning Spatial-Aware Tokens in Transformer for Document Shadow Removal [53.01990632289937]
We propose a Transformer-based model for document shadow removal.
It uses shadow context encoding and decoding in both shadow and shadow-free regions.
arXiv Detail & Related papers (2022-11-30T01:46:29Z)
- SpA-Former: Transformer image shadow detection and removal via spatial attention [8.643096072885909]
We propose an end-to-end SpA-Former to recover a shadow-free image from a single shaded image.
Unlike traditional methods that require two steps for shadow detection and then shadow removal, the SpA-Former unifies these steps into one.
arXiv Detail & Related papers (2022-06-22T08:30:22Z)
- DocScanner: Robust Document Image Rectification with Progressive Learning [162.03694280524084]
This work presents DocScanner, a new deep network architecture for document image rectification.
DocScanner maintains a single estimate of the rectified image, which is progressively corrected with a recurrent architecture.
The iterative refinements make DocScanner converge to a robust and superior performance, and the lightweight recurrent architecture ensures the running efficiency.
arXiv Detail & Related papers (2021-10-28T09:15:02Z)
- R2D: Learning Shadow Removal to Enhance Fine-Context Shadow Detection [64.10636296274168]
Current shadow detection methods perform poorly when detecting shadow regions that are small, unclear or have blurry edges.
We propose a new method called Restore to Detect (R2D), where a deep neural network is trained for restoration (shadow removal).
We show that our proposed method R2D improves the shadow detection performance while being able to detect fine context better compared to the other recent methods.
arXiv Detail & Related papers (2021-09-20T15:09:22Z)