Related papers: ShaDocFormer: A Shadow-Attentive Threshold Detector With Cascaded Fusion Refiner for Document Shadow Removal

URL: http://arxiv.org/abs/2309.06670v4
Date: Thu, 21 Mar 2024 08:30:54 GMT
Title: ShaDocFormer: A Shadow-Attentive Threshold Detector With Cascaded Fusion Refiner for Document Shadow Removal
Authors: Weiwen Chen, Yingtie Lei, Shenghong Luo, Ziyang Zhou, Mingxian Li, Chi-Man Pun
Abstract summary: We propose a Transformer-based architecture that integrates traditional methodologies and deep learning techniques to tackle the problem of document shadow removal. The ShaDocFormer architecture comprises two components: the Shadow-attentive Threshold Detector (STD) and the Cascaded Fusion Refiner (CFR).
Score: 26.15238399758745
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Document shadow is a common issue that arises when capturing documents using mobile devices, which significantly impacts readability. Current methods encounter various challenges, including inaccurate detection of shadow masks and estimation of illumination. In this paper, we propose ShaDocFormer, a Transformer-based architecture that integrates traditional methodologies and deep learning techniques to tackle the problem of document shadow removal. The ShaDocFormer architecture comprises two components: the Shadow-attentive Threshold Detector (STD) and the Cascaded Fusion Refiner (CFR). The STD module employs a traditional thresholding technique and leverages the attention mechanism of the Transformer to gather global information, thereby enabling precise detection of shadow masks. The cascaded and aggregative structure of the CFR module facilitates a coarse-to-fine restoration process for the entire image. As a result, ShaDocFormer excels in accurately detecting and capturing variations in both shadow and illumination, thereby enabling effective removal of shadows. Extensive experiments demonstrate that ShaDocFormer outperforms current state-of-the-art methods in both qualitative and quantitative measurements.
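As a rough illustration of the "traditional thresholding technique" mentioned for the STD module, the sketch below computes a coarse shadow mask from a grayscale document image with Otsu's method. The abstract does not say which thresholding method is used or how it is combined with attention, so the choice of Otsu and the helper names `otsu_threshold` and `coarse_shadow_mask` are assumptions made for this example only, not the authors' implementation.

```python
import numpy as np


def otsu_threshold(gray: np.ndarray) -> int:
    """Return the Otsu threshold (0-255) of a uint8 grayscale image."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                 # probability of the dark class per cut
    mu = np.cumsum(prob * np.arange(256))   # cumulative intensity mean per cut
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    # Between-class variance; cuts that leave one class empty are set to 0.
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / denom
    sigma_b[denom == 0] = 0.0
    return int(np.argmax(sigma_b))


def coarse_shadow_mask(gray: np.ndarray) -> np.ndarray:
    """Hypothetical helper: mark pixels at or below the threshold as shadow."""
    return gray <= otsu_threshold(gray)


# Synthetic 4x8 "document": dark (shadowed) left half, bright right half.
image = np.full((4, 8), 200, dtype=np.uint8)
image[:, :4] = 60
mask = coarse_shadow_mask(image)
```

A global threshold like this also marks dark text strokes as shadow, which is one reason a learned component that gathers global context would be needed to refine such a mask.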

Related papers

MatteViT: High-Frequency-Aware Document Shadow Removal with Shadow Matte Guidance [8.823244071737868]
Document shadow removal is essential for enhancing the clarity of digitized documents. This paper proposes a matte vision transformer (MatteViT) to eliminate shadows while preserving fine-grained structural details.
arXiv Detail & Related papers (2025-12-09T16:40:10Z)
DocShaDiffusion: Diffusion Model in Latent Space for Document Image Shadow Removal [61.375359734723716]
Existing methods tend to remove shadows with constant color background and ignore color shadows. In this paper, we first design a diffusion model in latent space for document image shadow removal, called DocShaDiffusion. To address the issue of color shadows, we design a shadow soft-mask generation module (SSGM). A shadow mask-aware guided diffusion module (SMGDM) is proposed to remove shadows from document images by supervising the diffusion and denoising process.
arXiv Detail & Related papers (2025-07-02T07:22:09Z)
Leveraging Contrast Information for Efficient Document Shadow Removal [15.35209972174416]
Document shadows are a major obstacle in the digitization process. We propose an end-to-end document shadow removal method guided by contrast representation.
arXiv Detail & Related papers (2025-04-01T03:06:20Z)
MetaShadow: Object-Centered Shadow Detection, Removal, and Synthesis [64.00425120075045]
Shadows are often under-considered or even ignored in image editing applications, limiting the realism of the edited results. In this paper, we introduce MetaShadow, a three-in-one versatile framework that enables detection, removal, and controllable synthesis of shadows in natural images in an object-centered fashion.
arXiv Detail & Related papers (2024-12-03T18:04:42Z)
ShadowMamba: State-Space Model with Boundary-Region Selective Scan for Shadow Removal [3.5734732877967392]
Shadows cause sudden brightness changes in some areas, which can affect the accuracy of downstream tasks. We propose a new boundary-region selective scanning mechanism that scans shadow, boundary, and non-shadow regions separately. We design the first Mamba-based lightweight shadow removal model, called ShadowMamba.
arXiv Detail & Related papers (2024-11-05T16:59:06Z)
SwinShadow: Shifted Window for Ambiguous Adjacent Shadow Detection [90.4751446041017]
We present SwinShadow, a transformer-based architecture that fully utilizes the powerful shifted window mechanism for detecting adjacent shadows. The whole process can be divided into three parts: encoder, decoder, and feature integration. Experiments on three shadow detection benchmark datasets, SBU, UCF, and ISTD, demonstrate that our network achieves good performance in terms of balance error rate (BER).
arXiv Detail & Related papers (2024-08-07T03:16:33Z)
ShadowMaskFormer: Mask Augmented Patch Embeddings for Shadow Removal [13.983288991595614]
We propose a transformer-based framework with a novel patch embedding that is tailored for shadow removal, dubbed ShadowMaskFormer. Specifically, we present a simple and effective mask-augmented patch embedding to integrate shadow information and promote the model's emphasis on acquiring knowledge for shadow regions.
arXiv Detail & Related papers (2024-04-29T05:17:33Z)
Towards General Visual-Linguistic Face Forgery Detection [95.73987327101143]
Deepfakes are realistic face manipulations that can pose serious threats to security, privacy, and trust. Existing methods mostly treat this task as binary classification, which uses digital labels or mask signals to train the detection model. We propose a novel paradigm named Visual-Linguistic Face Forgery Detection (VLFFD), which uses fine-grained sentence-level prompts as the annotation.
arXiv Detail & Related papers (2023-07-31T10:22:33Z)
DocDeshadower: Frequency-Aware Transformer for Document Shadow Removal [36.182923899021496]
Current shadow removal techniques face limitations in handling varying shadow intensities and preserving document details. We propose DocDeshadower, a novel multi-frequency Transformer-based model built upon the Laplacian Pyramid. Experiments demonstrate DocDeshadower's superior performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-07-28T05:35:37Z)
Structure-Informed Shadow Removal Networks [67.57092870994029]
Existing deep learning-based shadow removal methods still produce images with shadow remnants. We propose a novel structure-informed shadow removal network (StructNet) to leverage the image-structure information to address the shadow remnant problem. Our method outperforms existing shadow removal methods, and our StructNet can be integrated with existing methods to improve them further.
arXiv Detail & Related papers (2023-01-09T06:31:52Z)
ShaDocNet: Learning Spatial-Aware Tokens in Transformer for Document Shadow Removal [53.01990632289937]
We propose a Transformer-based model for document shadow removal. It uses shadow context encoding and decoding in both shadow and shadow-free regions.
arXiv Detail & Related papers (2022-11-30T01:46:29Z)
SpA-Former: Transformer image shadow detection and removal via spatial attention [8.643096072885909]
We propose an end-to-end SpA-Former to recover a shadow-free image from a single shaded image. Unlike traditional methods that require two steps for shadow detection and then shadow removal, the SpA-Former unifies these steps into one.
arXiv Detail & Related papers (2022-06-22T08:30:22Z)
DocScanner: Robust Document Image Rectification with Progressive Learning [162.03694280524084]
This work presents DocScanner, a new deep network architecture for document image rectification. DocScanner maintains a single estimate of the rectified image, which is progressively corrected with a recurrent architecture. The iterative refinements make DocScanner converge to a robust and superior performance, and the lightweight recurrent architecture ensures the running efficiency.
arXiv Detail & Related papers (2021-10-28T09:15:02Z)
R2D: Learning Shadow Removal to Enhance Fine-Context Shadow Detection [64.10636296274168]
Current shadow detection methods perform poorly when detecting shadow regions that are small, unclear or have blurry edges. We propose a new method called Restore to Detect (R2D), where a deep neural network is trained for restoration (shadow removal). We show that our proposed method R2D improves the shadow detection performance while being able to detect fine context better compared to the other recent methods.
arXiv Detail & Related papers (2021-09-20T15:09:22Z)

This list is automatically generated from the titles and abstracts of the papers in this site.