Day2Dark: Pseudo-Supervised Activity Recognition beyond Silent Daylight
- URL: http://arxiv.org/abs/2212.02053v3
- Date: Sun, 27 Aug 2023 19:41:53 GMT
- Title: Day2Dark: Pseudo-Supervised Activity Recognition beyond Silent Daylight
- Authors: Yunhua Zhang and Hazel Doughty and Cees G. M. Snoek
- Abstract summary: State-of-the-art activity recognizers are effective during the day, but not trustworthy in the dark.
We introduce a pseudo-supervised learning scheme, which utilizes easy-to-obtain, unlabeled, task-irrelevant dark videos to improve an activity recognizer in low light.
Since the usefulness of audio and visual features differs depending on the amount of illumination, we introduce our `darkness-adaptive' audio-visual recognizer.
- Score: 54.23533023883659
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper strives to recognize activities in the dark, as well as in the
day. We first establish that state-of-the-art activity recognizers are
effective during the day, but not trustworthy in the dark. The main causes are
the limited availability of labeled dark videos to learn from, as well as the
distribution shift towards the lower color contrast at test-time. To compensate
for the lack of labeled dark videos, we introduce a pseudo-supervised learning
scheme, which utilizes easy-to-obtain, unlabeled, task-irrelevant dark videos
to improve an activity recognizer in low light. As the lower color contrast
results in visual information loss, we further propose to incorporate the
complementary activity information within audio, which is invariant to
illumination. Since the usefulness of audio and visual features differs
depending on the amount of illumination, we introduce our `darkness-adaptive'
audio-visual recognizer. Experiments on EPIC-Kitchens, Kinetics-Sound, and
Charades demonstrate our proposals are superior to image enhancement, domain
adaptation and alternative audio-visual fusion methods, and can even improve
robustness to local darkness caused by occlusions. Project page:
https://xiaobai1217.github.io/Day2Dark/
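The abstract describes the pseudo-supervised scheme only at a high level. As a rough, generic illustration of confidence-thresholded pseudo-labeling on unlabeled dark videos (not the authors' exact method; the function name, threshold, and toy probabilities below are illustrative assumptions), a day-trained recognizer can be applied to unlabeled dark clips and only its confident predictions kept as training targets:

```python
import numpy as np

def select_pseudo_labels(probs, threshold=0.8):
    """Keep only confident predictions as pseudo-labels.

    probs: (N, C) array of class probabilities produced by a day-trained
           model on N unlabeled dark clips over C activity classes.
    Returns (indices, labels) for clips whose maximum class probability
    meets the confidence threshold; the rest are discarded.
    """
    confidence = probs.max(axis=1)
    keep = np.where(confidence >= threshold)[0]
    return keep, probs[keep].argmax(axis=1)

# Toy probabilities for 4 unlabeled dark clips over 3 activity classes.
probs = np.array([
    [0.9, 0.05, 0.05],  # confident  -> pseudo-label 0
    [0.4, 0.35, 0.25],  # ambiguous  -> discarded
    [0.1, 0.1, 0.8],    # confident  -> pseudo-label 2
    [0.5, 0.3, 0.2],    # ambiguous  -> discarded
])
idx, labels = select_pseudo_labels(probs, threshold=0.8)
print(idx.tolist(), labels.tolist())  # [0, 2] [0, 2]
```

The retained (clip, pseudo-label) pairs would then be mixed with labeled daytime data to fine-tune the recognizer in low light; the threshold trades pseudo-label coverage against noise.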
Related papers
- Multiple Latent Space Mapping for Compressed Dark Image Enhancement [51.112925890246444]
Existing dark image enhancement methods take compressed dark images as inputs and achieve great performance.
We propose a novel latent mapping network based on a variational auto-encoder (VAE).
Comprehensive experiments demonstrate that the proposed method achieves state-of-the-art performance in compressed dark image enhancement.
arXiv Detail & Related papers (2024-03-12T13:05:51Z)
- Enhancing Visibility in Nighttime Haze Images Using Guided APSF and Gradient Adaptive Convolution [28.685126418090338]
Existing nighttime dehazing methods often struggle with handling glow or low-light conditions.
In this paper, we enhance the visibility from a single nighttime haze image by suppressing glow and enhancing low-light regions.
Our method achieves a PSNR of 30.38dB, outperforming state-of-the-art methods by 13% on the GTA5 nighttime haze dataset.
arXiv Detail & Related papers (2023-08-03T12:58:23Z)
- Disentangled Contrastive Image Translation for Nighttime Surveillance [87.03178320662592]
Nighttime surveillance suffers from degradation due to poor illumination and arduous human annotations.
Existing methods rely on multi-spectral images to perceive objects in the dark, which are troubled by low resolution and color absence.
We argue that the ultimate solution for nighttime surveillance is night-to-day translation, or Night2Day.
This paper contributes a new surveillance dataset called NightSuR. It includes six scenes to support the study of nighttime surveillance.
arXiv Detail & Related papers (2023-07-11T06:40:27Z)
- Soundini: Sound-Guided Diffusion for Natural Video Editing [29.231939578629785]
We propose a method for adding sound-guided visual effects to specific regions of videos in a zero-shot setting.
Our work is the first to explore sound-guided natural video editing from various sound sources with sound-specialized properties.
arXiv Detail & Related papers (2023-04-13T20:56:53Z)
- Egocentric Audio-Visual Noise Suppression [11.113020254726292]
This paper studies audio-visual noise suppression for egocentric videos.
The egocentric camera emulates an off-screen speaker's view of the outside world.
We first demonstrate that egocentric visual information is helpful for noise suppression.
arXiv Detail & Related papers (2022-11-07T15:53:12Z)
- Weakly-Supervised Action Detection Guided by Audio Narration [50.4318060593995]
We propose a model to learn from the narration supervision and utilize multimodal features, including RGB, motion flow, and ambient sound.
Our experiments show that noisy audio narration suffices to learn a good action detection model, thus reducing annotation expenses.
arXiv Detail & Related papers (2022-05-12T06:33:24Z)
- OWL (Observe, Watch, Listen): Localizing Actions in Egocentric Video via Audiovisual Temporal Context [58.932717614439916]
We take a deep look into the effectiveness of audio in detecting actions in egocentric videos.
We propose a transformer-based model to incorporate temporal audio-visual context.
Our approach achieves state-of-the-art performance on EPIC-KITCHENS-100.
arXiv Detail & Related papers (2022-02-10T10:50:52Z)
- Relighting Images in the Wild with a Self-Supervised Siamese Auto-Encoder [62.580345486483886]
We propose a self-supervised method for image relighting of single view images in the wild.
The method is based on an auto-encoder which deconstructs an image into two separate encodings.
We train our model on large-scale datasets such as YouTube-8M and CelebA.
arXiv Detail & Related papers (2020-12-11T16:08:50Z)
- ARID: A New Dataset for Recognizing Action in the Dark [19.010874017607247]
This paper explores the task of action recognition in dark videos.
The ARID dataset consists of over 3,780 video clips spanning 11 action categories.
To the best of our knowledge, it is the first dataset focused on human actions in dark videos.
arXiv Detail & Related papers (2020-06-06T14:25:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.