Related papers: MatteViT: High-Frequency-Aware Document Shadow Removal with Shadow Matte Guidance

URL: http://arxiv.org/abs/2512.08789v1
Date: Tue, 09 Dec 2025 16:40:10 GMT
Title: MatteViT: High-Frequency-Aware Document Shadow Removal with Shadow Matte Guidance
Authors: Chaewon Kim, Seoyeon Lee, Jonghyuk Park
Abstract summary: Document shadow removal is essential for enhancing the clarity of digitized documents. This paper proposes a matte vision transformer (MatteViT) to eliminate shadows while preserving fine-grained structural details.
Score: 8.823244071737868
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Document shadow removal is essential for enhancing the clarity of digitized documents. Preserving high-frequency details (e.g., text edges and lines) is critical in this process because shadows often obscure or distort fine structures. This paper proposes a matte vision transformer (MatteViT), a novel shadow removal framework that applies spatial and frequency-domain information to eliminate shadows while preserving fine-grained structural details. To effectively retain these details, we employ two preservation strategies. First, our method introduces a lightweight high-frequency amplification module (HFAM) that decomposes and adaptively amplifies high-frequency components. Second, we present a continuous luminance-based shadow matte, generated using a custom-built matte dataset and shadow matte generator, which provides precise spatial guidance from the earliest processing stage. These strategies enable the model to accurately identify fine-grained regions and restore them with high fidelity. Extensive experiments on public benchmarks (RDD and Kligler) demonstrate that MatteViT achieves state-of-the-art performance, providing a robust and practical solution for real-world document shadow removal. Furthermore, the proposed method better preserves text-level details in downstream tasks, such as optical character recognition, improving recognition performance over prior methods.
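The abstract describes decomposing an image into frequency components and amplifying the high-frequency part. As a rough illustration of that general idea only, the sketch below splits a grayscale image into a low-pass (blurred) layer and a high-pass residual, then rescales the residual. It is not the paper's HFAM: the abstract does not specify that module's design, and the box filter, the fixed `gain`, and the function names `box_blur` and `amplify_high_freq` are all assumptions made for this example (the paper's amplification is described as adaptive, which a fixed gain is not).

```python
import numpy as np


def box_blur(img: np.ndarray, k: int = 5) -> np.ndarray:
    """Low-pass filter: mean over a k x k window (k odd), with edge padding."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros(img.shape, dtype=np.float64)
    # Sum the k*k shifted copies of the image, then divide to get the mean.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)


def amplify_high_freq(img: np.ndarray, gain: float = 1.5, k: int = 5) -> np.ndarray:
    """Split img into low + high frequency layers and scale the high layer.

    The fixed gain is an assumption for illustration; gain=1.0 returns the input.
    """
    low = box_blur(img, k)
    high = img - low  # residual carries edges and fine strokes
    return low + gain * high


# Usage: a vertical step edge becomes steeper after amplification.
step = np.zeros((8, 8))
step[:, 4:] = 1.0
sharpened = amplify_high_freq(step, gain=2.0)
```

Flat regions pass through unchanged because their high-frequency residual is zero, while values around an edge overshoot the original range, so a real pipeline would typically clip or renormalize the result.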

Related papers

DocShaDiffusion: Diffusion Model in Latent Space for Document Image Shadow Removal [61.375359734723716]
Existing methods tend to remove shadows with constant color background and ignore color shadows. In this paper, we first design a diffusion model in latent space for document image shadow removal, called DocShaDiffusion. To address the issue of color shadows, we design a shadow soft-mask generation module (SSGM). A shadow mask-aware guided diffusion module (SMGDM) is proposed to remove shadows from document images by supervising the diffusion and denoising process.
arXiv Detail & Related papers (2025-07-02T07:22:09Z)
Leveraging Contrast Information for Efficient Document Shadow Removal [15.35209972174416]
Document shadows are a major obstacle in the digitization process. We propose an end-to-end document shadow removal method guided by contrast representation.
arXiv Detail & Related papers (2025-04-01T03:06:20Z)
Detail-Preserving Latent Diffusion for Stable Shadow Removal [24.18957090960958]
We propose a two-stage fine-tuning pipeline to adapt the Stable Diffusion model for stable and efficient shadow removal. Experimental results show that our method outperforms state-of-the-art shadow removal techniques.
arXiv Detail & Related papers (2024-12-23T15:06:46Z)
MetaShadow: Object-Centered Shadow Detection, Removal, and Synthesis [64.00425120075045]
Shadows are often under-considered or even ignored in image editing applications, limiting the realism of the edited results. In this paper, we introduce MetaShadow, a three-in-one versatile framework that enables detection, removal, and controllable synthesis of shadows in natural images in an object-centered fashion.
arXiv Detail & Related papers (2024-12-03T18:04:42Z)
ShadowMamba: State-Space Model with Boundary-Region Selective Scan for Shadow Removal [3.5734732877967392]
This paper presents a model called ShadowMamba, the first Mamba-based model designed for shadow removal. Experimental results show that the proposed method outperforms existing mainstream approaches on the AISTD, ISTD, and SRD datasets.
arXiv Detail & Related papers (2024-11-05T16:59:06Z)
SwinShadow: Shifted Window for Ambiguous Adjacent Shadow Detection [90.4751446041017]
We present SwinShadow, a transformer-based architecture that fully utilizes the powerful shifted window mechanism for detecting adjacent shadows. The whole process can be divided into three parts: encoder, decoder, and feature integration. Experiments on three shadow detection benchmark datasets, SBU, UCF, and ISTD, demonstrate that our network achieves good performance in terms of balance error rate (BER).
arXiv Detail & Related papers (2024-08-07T03:16:33Z)
Latent Feature-Guided Diffusion Models for Shadow Removal [47.21387783721207]
We propose the use of diffusion models as they offer a promising approach to gradually refine the details of shadow regions during the diffusion process. Our method improves this process by conditioning on a learned latent feature space that inherits the characteristics of shadow-free images. We demonstrate the effectiveness of our approach which outperforms the previous best method by 13% in terms of RMSE on the AISTD dataset.
arXiv Detail & Related papers (2023-12-04T18:59:55Z)
SILT: Shadow-aware Iterative Label Tuning for Learning to Detect Shadows from Noisy Labels [53.30604926018168]
We propose SILT, the Shadow-aware Iterative Label Tuning framework, which explicitly considers noise in shadow labels and trains the deep model in a self-training manner. We also devise a simple yet effective label tuning strategy with global-local fusion and shadow-aware filtering to encourage the network to make significant refinements on the noisy labels. Our results show that even a simple U-Net trained with SILT can outperform all state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2023-08-23T11:16:36Z)
DocDeshadower: Frequency-Aware Transformer for Document Shadow Removal [36.182923899021496]
Current shadow removal techniques face limitations in handling varying shadow intensities and preserving document details. We propose DocDeshadower, a novel multi-frequency Transformer-based model built upon the Laplacian Pyramid. Experiments demonstrate DocDeshadower's superior performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-07-28T05:35:37Z)
Structure-Informed Shadow Removal Networks [67.57092870994029]
Existing deep learning-based shadow removal methods still produce images with shadow remnants. We propose a novel structure-informed shadow removal network (StructNet) to leverage the image-structure information to address the shadow remnant problem. Our method outperforms existing shadow removal methods, and our StructNet can be integrated with existing methods to improve them further.
arXiv Detail & Related papers (2023-01-09T06:31:52Z)
ShaDocNet: Learning Spatial-Aware Tokens in Transformer for Document Shadow Removal [53.01990632289937]
We propose a Transformer-based model for document shadow removal. It uses shadow context encoding and decoding in both shadow and shadow-free regions.
arXiv Detail & Related papers (2022-11-30T01:46:29Z)

This list is automatically generated from the titles and abstracts of the papers on this site.