MatteViT: High-Frequency-Aware Document Shadow Removal with Shadow Matte Guidance
- URL: http://arxiv.org/abs/2512.08789v1
- Date: Tue, 09 Dec 2025 16:40:10 GMT
- Title: MatteViT: High-Frequency-Aware Document Shadow Removal with Shadow Matte Guidance
- Authors: Chaewon Kim, Seoyeon Lee, Jonghyuk Park
- Abstract summary: Document shadow removal is essential for enhancing the clarity of digitized documents. This paper proposes a matte vision transformer (MatteViT) to eliminate shadows while preserving fine-grained structural details.
- Score: 8.823244071737868
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Document shadow removal is essential for enhancing the clarity of digitized documents. Preserving high-frequency details (e.g., text edges and lines) is critical in this process because shadows often obscure or distort fine structures. This paper proposes a matte vision transformer (MatteViT), a novel shadow removal framework that applies spatial and frequency-domain information to eliminate shadows while preserving fine-grained structural details. To effectively retain these details, we employ two preservation strategies. First, our method introduces a lightweight high-frequency amplification module (HFAM) that decomposes and adaptively amplifies high-frequency components. Second, we present a continuous luminance-based shadow matte, generated using a custom-built matte dataset and shadow matte generator, which provides precise spatial guidance from the earliest processing stage. These strategies enable the model to accurately identify fine-grained regions and restore them with high fidelity. Extensive experiments on public benchmarks (RDD and Kligler) demonstrate that MatteViT achieves state-of-the-art performance, providing a robust and practical solution for real-world document shadow removal. Furthermore, the proposed method better preserves text-level details in downstream tasks, such as optical character recognition, improving recognition performance over prior methods.
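The abstract names two ideas that a small example can make concrete: splitting an image into low- and high-frequency parts and amplifying the high-frequency part, and describing a shadow with a continuous luminance-based matte. The sketch below is a minimal illustration under stated assumptions, not the authors' HFAM or matte generator. It uses a box blur as the low-pass filter, a fixed gain instead of an adaptive one, and defines the matte as the ratio of shadowed to shadow-free luminance. All function names and parameters are hypothetical.

```python
# Illustrative only: not the MatteViT implementation. Images are plain
# lists of rows holding grayscale luminance values in [0, 1].

def box_blur(img, radius=1):
    """Low-pass filter: mean over a (2r+1)x(2r+1) window clipped at the borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            vals = [img[j][i] for j in ys for i in xs]
            out[y][x] = sum(vals) / len(vals)
    return out


def amplify_high_freq(img, gain=1.5, radius=1):
    """Split img into low + high and return low + gain * high.

    Assumption: a fixed scalar gain stands in for the adaptive
    amplification that the abstract describes.
    """
    low = box_blur(img, radius)
    return [
        [l + gain * (p - l) for p, l in zip(row, low_row)]
        for row, low_row in zip(img, low)
    ]


def luminance_matte(shadowed, shadow_free, eps=1e-6):
    """Continuous matte in [0, 1]: shadowed luminance over shadow-free luminance.

    Assumption: this ratio is one plausible definition of a luminance-based
    matte; the paper's generator predicts its matte from the input image.
    """
    return [
        [min(1.0, max(0.0, s / (f + eps))) for s, f in zip(s_row, f_row)]
        for s_row, f_row in zip(shadowed, shadow_free)
    ]


if __name__ == "__main__":
    edge = [[0.0, 0.0, 1.0, 1.0]]  # a one-row step edge, like a text stroke
    print(amplify_high_freq(edge, gain=2.0))
    print(luminance_matte([[0.5, 1.0]], [[1.0, 1.0]]))
```

With a gain above 1 the step edge becomes steeper, while flat regions are left unchanged. A matte value near 1 marks an unshadowed pixel and lower values mark progressively darker shadow.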
Related papers
- DocShaDiffusion: Diffusion Model in Latent Space for Document Image Shadow Removal [61.375359734723716]
Existing methods tend to remove shadows with constant color background and ignore color shadows. In this paper, we first design a diffusion model in latent space for document image shadow removal, called DocShaDiffusion. To address the issue of color shadows, we design a shadow soft-mask generation module (SSGM). A shadow mask-aware guided diffusion module (SMGDM) is proposed to remove shadows from document images by supervising the diffusion and denoising process.
arXiv Detail & Related papers (2025-07-02T07:22:09Z)
- Leveraging Contrast Information for Efficient Document Shadow Removal [15.35209972174416]
Document shadows are a major obstacle in the digitization process. We propose an end-to-end document shadow removal method guided by contrast representation.
arXiv Detail & Related papers (2025-04-01T03:06:20Z)
- Detail-Preserving Latent Diffusion for Stable Shadow Removal [24.18957090960958]
We propose a two-stage fine-tuning pipeline to adapt the Stable Diffusion model for stable and efficient shadow removal. Experimental results show that our method outperforms state-of-the-art shadow removal techniques.
arXiv Detail & Related papers (2024-12-23T15:06:46Z)
- MetaShadow: Object-Centered Shadow Detection, Removal, and Synthesis [64.00425120075045]
Shadows are often under-considered or even ignored in image editing applications, limiting the realism of the edited results. In this paper, we introduce MetaShadow, a three-in-one versatile framework that enables detection, removal, and controllable synthesis of shadows in natural images in an object-centered fashion.
arXiv Detail & Related papers (2024-12-03T18:04:42Z)
- ShadowMamba: State-Space Model with Boundary-Region Selective Scan for Shadow Removal [3.5734732877967392]
This paper presents a model called ShadowMamba, the first Mamba-based model designed for shadow removal. Experimental results show that the proposed method outperforms existing mainstream approaches on the AISTD, ISTD, and SRD datasets.
arXiv Detail & Related papers (2024-11-05T16:59:06Z)
- SwinShadow: Shifted Window for Ambiguous Adjacent Shadow Detection [90.4751446041017]
We present SwinShadow, a transformer-based architecture that fully utilizes the powerful shifted window mechanism for detecting adjacent shadows.
The whole process can be divided into three parts: encoder, decoder, and feature integration.
Experiments on three shadow detection benchmark datasets, SBU, UCF, and ISTD, demonstrate that our network achieves good performance in terms of balance error rate (BER).
arXiv Detail & Related papers (2024-08-07T03:16:33Z)
- Latent Feature-Guided Diffusion Models for Shadow Removal [47.21387783721207]
We propose the use of diffusion models as they offer a promising approach to gradually refine the details of shadow regions during the diffusion process. Our method improves this process by conditioning on a learned latent feature space that inherits the characteristics of shadow-free images. We demonstrate the effectiveness of our approach which outperforms the previous best method by 13% in terms of RMSE on the AISTD dataset.
arXiv Detail & Related papers (2023-12-04T18:59:55Z)
- SILT: Shadow-aware Iterative Label Tuning for Learning to Detect Shadows from Noisy Labels [53.30604926018168]
We propose SILT, the Shadow-aware Iterative Label Tuning framework, which explicitly considers noise in shadow labels and trains the deep model in a self-training manner.
We also devise a simple yet effective label tuning strategy with global-local fusion and shadow-aware filtering to encourage the network to make significant refinements on the noisy labels.
Our results show that even a simple U-Net trained with SILT can outperform all state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2023-08-23T11:16:36Z)
- DocDeshadower: Frequency-Aware Transformer for Document Shadow Removal [36.182923899021496]
Current shadow removal techniques face limitations in handling varying shadow intensities and preserving document details.
We propose DocDeshadower, a novel multi-frequency Transformer-based model built upon the Laplacian Pyramid.
Experiments demonstrate DocDeshadower's superior performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-07-28T05:35:37Z)
- Structure-Informed Shadow Removal Networks [67.57092870994029]
Existing deep learning-based shadow removal methods still produce images with shadow remnants.
We propose a novel structure-informed shadow removal network (StructNet) to leverage the image-structure information to address the shadow remnant problem.
Our method outperforms existing shadow removal methods, and our StructNet can be integrated with existing methods to improve them further.
arXiv Detail & Related papers (2023-01-09T06:31:52Z)
- ShaDocNet: Learning Spatial-Aware Tokens in Transformer for Document Shadow Removal [53.01990632289937]
We propose a Transformer-based model for document shadow removal.
It uses shadow context encoding and decoding in both shadow and shadow-free regions.
arXiv Detail & Related papers (2022-11-30T01:46:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.