Learning Shadow Correspondence for Video Shadow Detection
- URL: http://arxiv.org/abs/2208.00150v1
- Date: Sat, 30 Jul 2022 06:30:42 GMT
- Title: Learning Shadow Correspondence for Video Shadow Detection
- Authors: Xinpeng Ding and Jingwen Yang and Xiaowei Hu and Xiaomeng Li
- Abstract summary: We present a novel Shadow-Consistent Correspondence method (SC-Cor) to enhance pixel-wise similarity of the specific shadow regions across frames for video shadow detection.
SC-Cor is a plug-and-play module that can be easily integrated into existing shadow detectors with no extra computational cost.
Experimental results show that SC-Cor outperforms the prior state-of-the-art method by 6.51% on IoU and 3.35% on the newly introduced temporal stability metric.
- Score: 42.1593380820498
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Video shadow detection aims to generate consistent shadow predictions among
video frames. However, the current approaches suffer from inconsistent shadow
predictions across frames, especially when the illumination and background
textures change in a video. We observe that the inconsistent predictions are
caused by shadow feature inconsistency, i.e., the features of the same shadow
regions show dissimilar properties among nearby frames. In this paper, we
present a novel Shadow-Consistent Correspondence
method (SC-Cor) to enhance pixel-wise similarity of the specific shadow regions
across frames for video shadow detection. Our proposed SC-Cor has three main
advantages. Firstly, without requiring the dense pixel-to-pixel correspondence
labels, SC-Cor can learn the pixel-wise correspondence across frames in a
weakly-supervised manner. Secondly, SC-Cor considers intra-shadow separability,
which is robust to the variant textures and illuminations in videos. Finally,
SC-Cor is a plug-and-play module that can be easily integrated into existing
shadow detectors with no extra computational cost. We further design a new
evaluation metric to evaluate the temporal stability of the video shadow
detection results. Experimental results show that SC-Cor outperforms the prior
state-of-the-art method by 6.51% on IoU and 3.35% on the newly introduced
temporal stability metric.
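As a rough illustration of the core idea above (pulling the features of the same shadow region together across frames without dense correspondence labels), a cross-frame correspondence loss can be sketched in a few lines of PyTorch: for each shadow pixel in frame t, form a soft correspondence over frame t+1 and penalise probability mass that falls outside the shadow mask. This is only a minimal sketch under that reading of the abstract, not the authors' SC-Cor implementation; the soft nearest-neighbour formulation, the temperature, and the tensor shapes are assumptions.

```python
import torch
import torch.nn.functional as F

def shadow_correspondence_loss(feat_t, feat_t1, mask_t, mask_t1, tau=0.07):
    """Encourage shadow pixels in frame t to correspond to shadow pixels in frame t+1.

    feat_t, feat_t1: (C, H, W) feature maps of two nearby frames.
    mask_t, mask_t1: (H, W) float shadow masks (1 = shadow).
    Returns a scalar loss; it is low when shadow features match shadow features.
    """
    C, H, W = feat_t.shape
    f_t  = F.normalize(feat_t.reshape(C, -1), dim=0)    # (C, H*W)
    f_t1 = F.normalize(feat_t1.reshape(C, -1), dim=0)   # (C, H*W)

    shadow_idx = mask_t.reshape(-1).bool()
    if shadow_idx.sum() == 0:
        return feat_t.new_zeros(())

    # Affinity between every shadow pixel of frame t and every pixel of frame t+1.
    affinity = f_t[:, shadow_idx].t() @ f_t1 / tau       # (N_shadow, H*W)
    corr = affinity.softmax(dim=1)                       # soft correspondence

    # Probability mass that lands inside the shadow region of frame t+1.
    p_shadow = (corr * mask_t1.reshape(1, -1)).sum(dim=1).clamp_min(1e-6)
    return -p_shadow.log().mean()
```

Because such a term would only be added to the training loss of an existing detector, it adds nothing at inference time, which is consistent with the plug-and-play, no-extra-computation claim above. The temporal stability metric mentioned in the abstract is not specified here and is not reproduced in this sketch.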
Related papers
- RenDetNet: Weakly-supervised Shadow Detection with Shadow Caster Verification [15.68136544586505]
Existing shadow detection models struggle to differentiate dark image areas from shadows.
In this paper, we tackle this issue by verifying that all detected shadows are real, i.e. they have paired shadow casters.
We perform this step in a physically-accurate manner by differentiably re-rendering the scene and observing the changes stemming from carving out estimated shadow casters.
Thanks to this approach, the RenDetNet proposed in this paper is the first learning-based shadow detection model whose supervisory signals can be computed in a self-supervised manner.
arXiv Detail & Related papers (2024-08-30T09:34:36Z)
- SwinShadow: Shifted Window for Ambiguous Adjacent Shadow Detection [90.4751446041017]
We present SwinShadow, a transformer-based architecture that fully utilizes the powerful shifted window mechanism for detecting adjacent shadows.
The whole process can be divided into three parts: encoder, decoder, and feature integration.
Experiments on three shadow detection benchmark datasets, SBU, UCF, and ISTD, demonstrate that our network achieves good performance in terms of balance error rate (BER).
arXiv Detail & Related papers (2024-08-07T03:16:33Z)
- Detect Any Shadow: Segment Anything for Video Shadow Detection [105.19693622157462]
We propose ShadowSAM, a framework for fine-tuning the Segment Anything Model (SAM) to detect shadows.
By combining it with a long short-term attention mechanism, we extend its capability for efficient video shadow detection.
Our method exhibits accelerated inference speed compared to previous video shadow detection approaches.
arXiv Detail & Related papers (2023-05-26T07:39:10Z)
- ShaDocNet: Learning Spatial-Aware Tokens in Transformer for Document Shadow Removal [53.01990632289937]
We propose a Transformer-based model for document shadow removal.
It uses shadow context encoding and decoding in both shadow and shadow-free regions.
arXiv Detail & Related papers (2022-11-30T01:46:29Z)
- SCOTCH and SODA: A Transformer Video Shadow Detection Framework [12.42397422225366]
Shadows in videos are difficult to detect because of the large shadow deformation between frames.
We introduce the shadow deformation attention trajectory (SODA), a new type of video self-attention module.
We also present a new shadow contrastive learning mechanism (SCOTCH), which aims at guiding the network to learn a unified shadow representation (a rough contrastive-loss sketch follows this entry).
arXiv Detail & Related papers (2022-11-13T12:23:07Z)
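The SCOTCH mechanism is only named in the summary above, so the following is merely an illustrative guess at what a shadow contrastive term could look like: an InfoNCE-style loss that pulls pooled shadow embeddings of nearby frames together and pushes them away from non-shadow embeddings. The pooling, temperature, and positive/negative choices below are assumptions, not the paper's actual loss.

```python
import torch
import torch.nn.functional as F

def shadow_contrastive_loss(feats, masks, tau=0.1):
    """InfoNCE-style sketch: the pooled shadow embedding of frame t treats the
    shadow embedding of frame t+1 as its positive and all frames' non-shadow
    embeddings as negatives.

    feats: (T, C, H, W) per-frame features; masks: (T, 1, H, W) float shadow masks; T >= 2.
    """
    z_s, z_n = [], []
    for f, m in zip(feats, masks):
        z_s.append((f * m).sum(dim=(1, 2)) / m.sum().clamp_min(1.0))            # pooled shadow feature
        z_n.append((f * (1 - m)).sum(dim=(1, 2)) / (1 - m).sum().clamp_min(1.0))  # pooled non-shadow feature
    z_s = F.normalize(torch.stack(z_s), dim=1)   # (T, C)
    z_n = F.normalize(torch.stack(z_n), dim=1)   # (T, C)

    losses = []
    for t in range(z_s.shape[0] - 1):
        pos = (z_s[t] * z_s[t + 1]).sum() / tau          # similarity to next frame's shadow embedding
        neg = z_s[t] @ z_n.t() / tau                     # similarity to every non-shadow embedding
        logits = torch.cat([pos.view(1), neg])           # the positive sits at index 0
        losses.append(F.cross_entropy(
            logits.unsqueeze(0),
            torch.zeros(1, dtype=torch.long, device=logits.device)))
    return torch.stack(losses).mean()
```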
- Controllable Shadow Generation Using Pixel Height Maps [58.59256060452418]
Physics-based shadow rendering methods require 3D geometries, which are not always available.
Deep learning-based shadow synthesis methods learn a mapping from the light information to an object's shadow without explicitly modeling the shadow geometry.
We introduce pixel height, a novel geometry representation that encodes the correlations between objects, ground, and camera pose.
arXiv Detail & Related papers (2022-07-12T08:29:51Z)
- R2D: Learning Shadow Removal to Enhance Fine-Context Shadow Detection [64.10636296274168]
Current shadow detection methods perform poorly when detecting shadow regions that are small, unclear or have blurry edges.
We propose a new method called Restore to Detect (R2D), where a deep neural network is trained for restoration (shadow removal).
We show that our proposed R2D improves shadow detection performance while detecting fine context better than other recent methods.
arXiv Detail & Related papers (2021-09-20T15:09:22Z)
- Temporal Feature Warping for Video Shadow Detection [30.82493923485278]
We propose a simple but powerful method to better aggregate information temporally.
We use an optical flow based warping module to align and then combine features between frames.
We apply this warping module across multiple deep-network layers to retrieve information from neighboring frames, including both local details and high-level semantic information (the warping step is sketched after this entry).
arXiv Detail & Related papers (2021-07-29T19:12:50Z)
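The optical-flow-based warping described in the Temporal Feature Warping entry can be pictured with torch.nn.functional.grid_sample: a predicted flow field is turned into a sampling grid and used to pull the neighbouring frame's features into alignment with the current frame before fusing them. The snippet below is a generic sketch of that alignment step; the flow estimator, fusion rule, and the specific layers used in the paper are not shown.

```python
import torch
import torch.nn.functional as F

def warp_features(feat_prev, flow):
    """Warp features of the neighbouring frame into the current frame's coordinates.

    feat_prev: (B, C, H, W) features of the neighbouring frame.
    flow:      (B, 2, H, W) optical flow in pixels, mapping current -> neighbouring positions.
    """
    B, _, H, W = feat_prev.shape
    # Base sampling grid of pixel coordinates.
    ys, xs = torch.meshgrid(torch.arange(H, device=flow.device),
                            torch.arange(W, device=flow.device), indexing="ij")
    grid = torch.stack((xs, ys), dim=0).float().unsqueeze(0)   # (1, 2, H, W)
    coords = grid + flow                                       # displaced coordinates
    # Normalise to [-1, 1] as required by grid_sample.
    coords_x = 2.0 * coords[:, 0] / (W - 1) - 1.0
    coords_y = 2.0 * coords[:, 1] / (H - 1) - 1.0
    sample_grid = torch.stack((coords_x, coords_y), dim=-1)    # (B, H, W, 2)
    return F.grid_sample(feat_prev, sample_grid, align_corners=True)

# The aligned features can then be combined with the current frame's features,
# e.g. fused = 0.5 * (feat_cur + warp_features(feat_prev, flow)).
```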
- Triple-cooperative Video Shadow Detection [43.030759888063194]
We collect a new video shadow detection dataset, which contains 120 videos with 11,685 frames, covering 60 object categories, varying lengths, and different motion/lighting conditions.
We also develop a new baseline model, named triple-cooperative video shadow detection network (TVSD-Net).
Within the network, a dual gated co-attention module is proposed to constrain features from neighboring frames in the same video, while an auxiliary similarity loss is introduced to mine semantic information between different videos (a hypothetical sketch of gated co-attention follows this entry).
arXiv Detail & Related papers (2021-03-11T08:54:19Z)
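The dual gated co-attention module of TVSD-Net is only named in the summary above; a generic, hypothetical rendering of "co-attention with gating" between two frames might look like the module below, where each frame attends to the other and a learned sigmoid gate decides how much of the attended context to mix back in. The layer sizes, the gating form, and the class name are assumptions rather than the paper's architecture.

```python
import torch
import torch.nn as nn

class GatedCoAttention(nn.Module):
    """Hypothetical sketch: cross-frame attention followed by a learned gate."""

    def __init__(self, channels):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.key   = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.gate  = nn.Sequential(nn.Conv2d(2 * channels, channels, kernel_size=1),
                                   nn.Sigmoid())

    def attend(self, feat_a, feat_b):
        # feat_a attends to feat_b; both are (B, C, H, W).
        B, C, H, W = feat_a.shape
        q = self.query(feat_a).flatten(2).transpose(1, 2)     # (B, HW, C//8)
        k = self.key(feat_b).flatten(2)                       # (B, C//8, HW)
        v = self.value(feat_b).flatten(2).transpose(1, 2)     # (B, HW, C)
        attn = torch.softmax(q @ k / (q.shape[-1] ** 0.5), dim=-1)
        ctx = (attn @ v).transpose(1, 2).reshape(B, C, H, W)  # attended context
        g = self.gate(torch.cat([feat_a, ctx], dim=1))        # how much context to keep
        return feat_a + g * ctx

    def forward(self, feat_a, feat_b):
        # "Dual": each frame's features are refined using the other frame.
        return self.attend(feat_a, feat_b), self.attend(feat_b, feat_a)
```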