Retinex-guided Histogram Transformer for Mask-free Shadow Removal
- URL: http://arxiv.org/abs/2504.14092v1
- Date: Fri, 18 Apr 2025 22:19:40 GMT
- Title: Retinex-guided Histogram Transformer for Mask-free Shadow Removal
- Authors: Wei Dong, Han Zhou, Seyed Amirreza Mousavi, Jun Chen
- Abstract summary: ReHiT is an efficient mask-free shadow removal framework based on a hybrid CNN-Transformer architecture guided by Retinex theory. Our solution delivers competitive results with one of the smallest parameter sizes and fastest inference speeds among top-ranked entries.
- Score: 12.962534359029103
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While deep learning methods have achieved notable progress in shadow removal, many existing approaches rely on shadow masks that are difficult to obtain, limiting their generalization to real-world scenes. In this work, we propose ReHiT, an efficient mask-free shadow removal framework based on a hybrid CNN-Transformer architecture guided by Retinex theory. We first introduce a dual-branch pipeline to separately model reflectance and illumination components; each is restored by our Illumination-Guided Hybrid CNN-Transformer (IG-HCT) module. Second, besides the CNN-based blocks that learn residual dense features and perform multi-scale semantic fusion, we develop the Illumination-Guided Histogram Transformer Block (IGHB) to effectively handle non-uniform illumination and spatially complex shadows. Extensive experiments on several benchmark datasets validate the effectiveness of our approach over existing mask-free methods. Trained solely on the NTIRE 2025 Shadow Removal Challenge dataset, our solution delivers competitive results with one of the smallest parameter sizes and fastest inference speeds among top-ranked entries, highlighting its applicability to real-world applications with limited computational resources. The code is available at https://github.com/dongw22/oath.
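The Retinex view that guides ReHiT factors an image into reflectance and illumination, I = R ⊙ L. A minimal sketch of such a decomposition (a hypothetical illustration, not the authors' implementation; the function name and the box-filter illumination estimate are assumptions standing in for the learned branches):

```python
import numpy as np

def retinex_decompose(image, kernel_size=15, eps=1e-6):
    """Toy single-scale Retinex split: illumination as a local mean,
    reflectance as the ratio image / illumination.

    image: 2D float array in [0, 1].
    Returns (reflectance, illumination) with reflectance * illumination ~= image.
    """
    pad = kernel_size // 2
    padded = np.pad(image, pad, mode="edge")
    # Box-filter illumination estimate (a smooth, slowly varying component).
    illumination = np.zeros_like(image)
    for dy in range(kernel_size):
        for dx in range(kernel_size):
            illumination += padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
    illumination /= kernel_size ** 2
    reflectance = image / (illumination + eps)
    return reflectance, illumination

img = np.clip(np.random.default_rng(0).random((32, 32)), 0.0, 1.0)
R, L = retinex_decompose(img)
```

Each branch of the dual-branch pipeline would then restore one of these two components before recombining them.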
Related papers
- WavShadow: Wavelet Based Shadow Segmentation and Removal [0.0]
This study presents a novel approach that enhances the ShadowFormer model by incorporating Masked Autoencoder (MAE) priors and Fast Fourier Convolution (FFC) blocks.
We introduce key innovations: (1) integration of MAE priors trained on Places2 dataset for better context understanding, (2) adoption of Haar wavelet features for enhanced edge detection and multiscale analysis, and (3) implementation of a modified SAM Adapter for robust shadow segmentation.
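Innovation (2) above rests on the Haar wavelet's ability to expose edges at multiple scales. A single-level 2D Haar transform can be sketched as follows (an illustrative stand-in, not WavShadow's code; the function name is an assumption):

```python
import numpy as np

def haar_dwt2(x):
    """One level of the 2D Haar transform: returns (LL, LH, HL, HH) subbands.

    x: 2D array with even height and width. The high-frequency subbands
    (LH, HL, HH) carry the edge information that wavelet-based models exploit.
    """
    a = (x[0::2, :] + x[1::2, :]) / 2.0   # vertical average
    d = (x[0::2, :] - x[1::2, :]) / 2.0   # vertical difference
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0  # coarse approximation
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0  # horizontal detail
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0  # vertical detail
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0  # diagonal detail
    return ll, lh, hl, hh

x = np.arange(16, dtype=float).reshape(4, 4)
ll, lh, hl, hh = haar_dwt2(x)
```

On this linear ramp the diagonal subband is identically zero, while LL keeps the coarse structure at half resolution.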
arXiv Detail & Related papers (2024-11-08T18:08:33Z)
- ShadowMamba: State-Space Model with Boundary-Region Selective Scan for Shadow Removal [3.5734732877967392]
Shadows cause sudden brightness changes in some areas, which can affect the accuracy of downstream tasks.
We propose a new boundary-region selective scanning mechanism that scans shadow, boundary, and non-shadow regions separately.
We design the first Mamba-based lightweight shadow removal model, called ShadowMamba.
arXiv Detail & Related papers (2024-11-05T16:59:06Z)
- RelitLRM: Generative Relightable Radiance for Large Reconstruction Models [52.672706620003765]
We propose RelitLRM for generating high-quality Gaussian splatting representations of 3D objects under novel illuminations.
Unlike prior inverse rendering methods requiring dense captures and slow optimization, RelitLRM adopts a feed-forward transformer-based model.
We show our sparse-view feed-forward RelitLRM offers competitive relighting results to state-of-the-art dense-view optimization-based baselines.
arXiv Detail & Related papers (2024-10-08T17:40:01Z)
- SwinShadow: Shifted Window for Ambiguous Adjacent Shadow Detection [90.4751446041017]
We present SwinShadow, a transformer-based architecture that fully utilizes the powerful shifted window mechanism for detecting adjacent shadows.
The whole process can be divided into three parts: encoder, decoder, and feature integration.
Experiments on three shadow detection benchmark datasets, SBU, UCF, and ISTD, demonstrate that our network achieves good performance in terms of balanced error rate (BER).
arXiv Detail & Related papers (2024-08-07T03:16:33Z)
- ShadowMaskFormer: Mask Augmented Patch Embeddings for Shadow Removal [13.983288991595614]
We propose a transformer-based framework with a novel patch embedding that is tailored for shadow removal.
We present a simple and effective mask-augmented patch embedding to integrate shadow information and promote the model's emphasis on acquiring knowledge for shadow regions.
arXiv Detail & Related papers (2024-04-29T05:17:33Z)
- ShadowRefiner: Towards Mask-free Shadow Removal via Fast Fourier Transformer [41.008740643546226]
Shadow-affected images often exhibit pronounced spatial discrepancies in color and illumination.
We introduce a mask-free Shadow Removal and Refinement network (ShadowRefiner) via Fast Fourier Transformer.
Our method wins the championship in the Perceptual Track and achieves the second best performance in the Fidelity Track of NTIRE 2024 Image Shadow Removal Challenge.
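The core idea behind a Fast Fourier Transformer layer, mixing all spatial positions at once by modulating the spectrum, can be sketched as follows (a hypothetical illustration, not ShadowRefiner's implementation; the learned spectral filter is replaced by an explicit array):

```python
import numpy as np

def fft_feature_mix(feat, filt):
    """Global mixing in the frequency domain.

    feat: 2D real feature map; filt: 2D filter of the same shape, playing
    the role of a learned spectral weight. Every output value depends on
    every input value, giving a global receptive field in one step.
    """
    spectrum = np.fft.fft2(feat)
    mixed = spectrum * filt            # element-wise spectral modulation
    return np.fft.ifft2(mixed).real    # back to the spatial domain

feat = np.random.default_rng(1).standard_normal((8, 8))
identity = np.ones((8, 8))             # all-pass filter: output == input
out = fft_feature_mix(feat, identity)
```

With an all-pass filter the layer is the identity; a learned filter would instead re-weight low and high frequencies, which suits the large-scale color and illumination discrepancies shadows introduce.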
arXiv Detail & Related papers (2024-04-18T03:53:33Z)
- Towards Image Ambient Lighting Normalization [47.42834070783831]
Ambient Lighting Normalization (ALN) enables the study of interactions between shadows, unifying image restoration and shadow removal in a broader context.
For benchmarking, we select various mainstream methods and rigorously evaluate them on Ambient6K.
Experiments show that IFBlend achieves SOTA scores on Ambient6K and exhibits competitive performance on conventional shadow removal benchmarks.
arXiv Detail & Related papers (2024-03-27T16:20:55Z)
- Progressive Recurrent Network for Shadow Removal [99.1928825224358]
Single-image shadow removal is a significant task that is still unresolved.
Most existing deep learning-based approaches attempt to remove the shadow directly, which cannot handle shadows well.
We propose a simple but effective Progressive Recurrent Network (PRNet) to remove the shadow progressively.
arXiv Detail & Related papers (2023-11-01T11:42:45Z)
- DocDeshadower: Frequency-Aware Transformer for Document Shadow Removal [36.182923899021496]
Current shadow removal techniques face limitations in handling varying shadow intensities and preserving document details.
We propose DocDeshadower, a novel multi-frequency Transformer-based model built upon the Laplacian Pyramid.
Experiments demonstrate DocDeshadower's superior performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-07-28T05:35:37Z)
- Multi-scale Transformer Network with Edge-aware Pre-training for Cross-Modality MR Image Synthesis [52.41439725865149]
Cross-modality magnetic resonance (MR) image synthesis can be used to generate missing modalities from given ones.
Existing (supervised learning) methods often require a large number of paired multi-modal data to train an effective synthesis model.
We propose a Multi-scale Transformer Network (MT-Net) with edge-aware pre-training for cross-modality MR image synthesis.
arXiv Detail & Related papers (2022-12-02T11:40:40Z)
- Progressively-connected Light Field Network for Efficient View Synthesis [69.29043048775802]
We present a Progressively-connected Light Field network (ProLiF) for the novel view synthesis of complex forward-facing scenes.
ProLiF encodes a 4D light field, which allows rendering a large batch of rays in one training step for image- or patch-level losses.
arXiv Detail & Related papers (2022-07-10T13:47:20Z)
- Recurrent Multi-view Alignment Network for Unsupervised Surface Registration [79.72086524370819]
Learning non-rigid registration in an end-to-end manner is challenging due to the inherent high degrees of freedom and the lack of labeled training data.
We propose to represent the non-rigid transformation with a point-wise combination of several rigid transformations.
We also introduce a differentiable loss function that measures the 3D shape similarity on the projected multi-view 2D depth images.
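The point-wise combination of rigid transformations described above can be sketched as follows (illustrative only, not the paper's code; function name, shapes, and the convex-weight convention are assumptions):

```python
import numpy as np

def blend_rigid(points, rotations, translations, weights):
    """Non-rigid warp as a point-wise convex combination of K rigid transforms.

    points: (N, 3); rotations: (K, 3, 3); translations: (K, 3);
    weights: (N, K) with rows summing to 1. Each point is moved by every
    rigid transform, then the K results are blended with that point's weights.
    """
    # (K, N, 3): apply every rigid transform to every point.
    transformed = np.einsum("kij,nj->kni", rotations, points) + translations[:, None, :]
    # (N, 3): weighted sum over the K candidate transforms.
    return np.einsum("nk,kni->ni", weights, transformed)

pts = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
R = np.stack([np.eye(3), np.eye(3)])                 # two identity rotations
t = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])     # zero and unit translations
w = np.array([[0.5, 0.5], [1.0, 0.0]])               # per-point blend weights
out = blend_rigid(pts, R, t, w)
```

Because each point carries its own weights, the field is globally non-rigid while staying locally close to rigid motion, which keeps the degrees of freedom manageable for end-to-end learning.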
arXiv Detail & Related papers (2020-11-24T14:22:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.