WavShadow: Wavelet Based Shadow Segmentation and Removal
- URL: http://arxiv.org/abs/2411.05747v3
- Date: Tue, 12 Nov 2024 16:42:02 GMT
- Title: WavShadow: Wavelet Based Shadow Segmentation and Removal
- Authors: Shreyans Jain, Viraj Vekaria, Karan Gandhi, Aadya Arora,
- Abstract summary: This study presents a novel approach that enhances the ShadowFormer model by incorporating Masked Autoencoder (MAE) priors and Fast Fourier Convolution (FFC) blocks.
We introduce key innovations: (1) integration of MAE priors trained on Places2 dataset for better context understanding, (2) adoption of Haar wavelet features for enhanced edge detection and multiscale analysis, and (3) implementation of a modified SAM Adapter for robust shadow segmentation.
- Score: 0.0
- License:
- Abstract: Shadow removal and segmentation remain challenging tasks in computer vision, particularly in complex real world scenarios. This study presents a novel approach that enhances the ShadowFormer model by incorporating Masked Autoencoder (MAE) priors and Fast Fourier Convolution (FFC) blocks, leading to significantly faster convergence and improved performance. We introduce key innovations: (1) integration of MAE priors trained on Places2 dataset for better context understanding, (2) adoption of Haar wavelet features for enhanced edge detection and multiscale analysis, and (3) implementation of a modified SAM Adapter for robust shadow segmentation. Extensive experiments on the challenging DESOBA dataset demonstrate that our approach achieves state of the art results, with notable improvements in both convergence speed and shadow removal quality.
Related papers
- MCGS: Multiview Consistency Enhancement for Sparse-View 3D Gaussian Radiance Fields [73.49548565633123]
Radiance fields represented by 3D Gaussians excel at synthesizing novel views, offering both high training efficiency and fast rendering.
Existing methods often incorporate depth priors from dense estimation networks but overlook the inherent multi-view consistency in input images.
We propose a view framework based on 3D Gaussian Splatting, named MCGS, enabling scene reconstruction from sparse input views.
arXiv Detail & Related papers (2024-10-15T08:39:05Z) - SwinShadow: Shifted Window for Ambiguous Adjacent Shadow Detection [90.4751446041017]
We present SwinShadow, a transformer-based architecture that fully utilizes the powerful shifted window mechanism for detecting adjacent shadows.
The whole process can be divided into three parts: encoder, decoder, and feature integration.
Experiments on three shadow detection benchmark datasets, SBU, UCF, and ISTD, demonstrate that our network achieves good performance in terms of balance error rate (BER)
arXiv Detail & Related papers (2024-08-07T03:16:33Z) - A Two-Stage Progressive Pre-training using Multi-Modal Contrastive Masked Autoencoders [5.069884983892437]
We propose a new progressive pre-training method for image understanding tasks which leverages RGB-D datasets.
In the first stage, we pre-train the model using contrastive learning to learn cross-modal representations.
In the second stage, we further pre-train the model using masked autoencoding and denoising/noise prediction.
Our approach is scalable, robust and suitable for pre-training RGB-D datasets.
arXiv Detail & Related papers (2024-08-05T05:33:59Z) - S^2Former-OR: Single-Stage Bi-Modal Transformer for Scene Graph Generation in OR [50.435592120607815]
Scene graph generation (SGG) of surgical procedures is crucial in enhancing holistically cognitive intelligence in the operating room (OR)
Previous works have primarily relied on multi-stage learning, where the generated semantic scene graphs depend on intermediate processes with pose estimation and object detection.
In this study, we introduce a novel single-stage bi-modal transformer framework for SGG in the OR, termed S2Former-OR.
arXiv Detail & Related papers (2024-02-22T11:40:49Z) - Progressive Recurrent Network for Shadow Removal [99.1928825224358]
Single-image shadow removal is a significant task that is still unresolved.
Most existing deep learning-based approaches attempt to remove the shadow directly, which can not deal with the shadow well.
We propose a simple but effective Progressive Recurrent Network (PRNet) to remove the shadow progressively.
arXiv Detail & Related papers (2023-11-01T11:42:45Z) - Deshadow-Anything: When Segment Anything Model Meets Zero-shot shadow
removal [8.555176637147648]
We develop Deshadow-Anything, considering the generalization of large-scale datasets, to achieve image shadow removal.
The diffusion model can diffuse along the edges and textures of an image, helping to remove shadows while preserving the details of the image.
Experiments on shadow removal tasks demonstrate that these methods can effectively improve image restoration performance.
arXiv Detail & Related papers (2023-09-21T01:35:13Z) - Revisiting the Encoding of Satellite Image Time Series [2.5874041837241304]
Image Time Series (SITS)temporal learning is complex due to hightemporal resolutions and irregular acquisition times.
We develop a novel perspective of SITS processing as a direct set prediction problem, inspired by the recent trend in adopting query-based transformer decoders.
We attain new state-of-the-art (SOTA) results on the Satellite PASTIS benchmark dataset.
arXiv Detail & Related papers (2023-05-03T12:44:20Z) - SatMAE: Pre-training Transformers for Temporal and Multi-Spectral
Satellite Imagery [74.82821342249039]
We present SatMAE, a pre-training framework for temporal or multi-spectral satellite imagery based on Masked Autoencoder (MAE)
To leverage temporal information, we include a temporal embedding along with independently masking image patches across time.
arXiv Detail & Related papers (2022-07-17T01:35:29Z) - Synthetic Convolutional Features for Improved Semantic Segmentation [139.5772851285601]
We suggest to generate intermediate convolutional features and propose the first synthesis approach that is catered to such intermediate convolutional features.
This allows us to generate new features from label masks and include them successfully into the training procedure.
Experimental results and analysis on two challenging datasets Cityscapes and ADE20K show that our generated feature improves performance on segmentation tasks.
arXiv Detail & Related papers (2020-09-18T14:12:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.