Adversarial Domain Adaptation for Action Recognition Around the Clock
- URL: http://arxiv.org/abs/2210.17412v1
- Date: Tue, 25 Oct 2022 01:08:27 GMT
- Title: Adversarial Domain Adaptation for Action Recognition Around the Clock
- Authors: Anwaar Ulhaq
- Abstract summary: This paper presents a domain adaptation-based action recognition approach.
It uses adversarial learning in cross-domain settings to recognize actions across domains.
It achieves SOTA performance on the InFAR and XD145 action datasets.
- Score: 0.7614628596146599
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Due to the numerous potential applications in visual surveillance and
nighttime driving, recognizing human action in low-light conditions remains a
difficult problem in computer vision. Existing methods separate action
recognition and dark enhancement into two distinct steps to accomplish this
task. However, isolating the recognition and enhancement impedes end-to-end
learning of the space-time representation for video action classification. This
paper presents a domain adaptation-based action recognition approach that uses
adversarial learning in cross-domain settings to recognize actions across
domains. The model is trained with supervised learning on a large amount of
labeled data from the source domain (daytime action sequences), while deep
domain-invariant features allow it to perform unsupervised learning on abundant
unlabelled data from the target domain (night-time action sequences). The
resulting augmented model, named 3D-DiNet, can be trained using standard
backpropagation with an additional layer. It achieves SOTA performance on the
InFAR and XD145 action datasets.
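The abstract's recipe of supervised learning on labeled source (daytime) clips, unsupervised alignment of unlabeled target (night-time) clips via domain-invariant features, and an "additional layer" trainable with standard backpropagation matches the familiar DANN-style gradient reversal setup. Below is a minimal sketch of that setup in PyTorch; it is not the authors' 3D-DiNet code, and every name here (GradReverse, DomainAdaptiveActionNet, training_step, the toy 3D-CNN backbone) is an illustrative assumption.

```python
# Minimal DANN-style sketch of adversarial domain adaptation for video clips.
# Assumptions: a small 3D-CNN backbone, a gradient reversal layer, labeled
# source (daytime) clips and unlabeled target (night-time) clips.
# This is NOT the paper's 3D-DiNet implementation; shapes and names are toy.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; reverses and scales gradients on backward."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None


class DomainAdaptiveActionNet(nn.Module):
    def __init__(self, num_classes, feat_dim=256):
        super().__init__()
        # Tiny 3D-CNN feature extractor (stand-in for the paper's backbone).
        self.features = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),
            nn.Linear(32, feat_dim), nn.ReLU(),
        )
        self.action_head = nn.Linear(feat_dim, num_classes)  # supervised on source
        self.domain_head = nn.Linear(feat_dim, 2)            # source vs. target

    def forward(self, clips, lambd=1.0):
        feats = self.features(clips)                  # (B, 3, T, H, W) -> (B, feat_dim)
        action_logits = self.action_head(feats)
        # Gradient reversal pushes the features toward domain invariance.
        domain_logits = self.domain_head(GradReverse.apply(feats, lambd))
        return action_logits, domain_logits


def training_step(model, src_clips, src_labels, tgt_clips, lambd=1.0):
    """One adversarial step: action loss on source only, domain loss on both."""
    src_action, src_domain = model(src_clips, lambd)
    _, tgt_domain = model(tgt_clips, lambd)
    action_loss = F.cross_entropy(src_action, src_labels)
    domain_logits = torch.cat([src_domain, tgt_domain])
    domain_labels = torch.cat([torch.zeros(len(src_clips), dtype=torch.long),
                               torch.ones(len(tgt_clips), dtype=torch.long)])
    domain_loss = F.cross_entropy(domain_logits, domain_labels)
    return action_loss + domain_loss
```

In the usual gradient-reversal formulation, the adversarial weight lambd is ramped up from zero over training so the domain classifier does not dominate early optimisation; whether 3D-DiNet uses such a schedule is not stated in the abstract.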
Related papers
- ActPrompt: In-Domain Feature Adaptation via Action Cues for Video Temporal Grounding [40.60371529725805]
We propose an efficient preliminary in-domain fine-tuning paradigm for feature adaptation.
We introduce Action-Cue-Injected Temporal Prompt Learning (ActPrompt), which injects action cues into the image encoder of VLM for better discovering action-sensitive patterns.
arXiv Detail & Related papers (2024-08-13T04:18:32Z) - ActNetFormer: Transformer-ResNet Hybrid Method for Semi-Supervised Action Recognition in Videos [4.736059095502584]
This work proposes a novel approach using Cross-Architecture Pseudo-Labeling with contrastive learning for semi-supervised action recognition.
We introduce a novel cross-architecture approach where 3D Convolutional Neural Networks (3D CNNs) and video transformers (VIT) are utilised to capture different aspects of action representations.
arXiv Detail & Related papers (2024-04-09T12:09:56Z) - CDFSL-V: Cross-Domain Few-Shot Learning for Videos [58.37446811360741]
Few-shot video action recognition is an effective approach to recognizing new categories with only a few labeled examples.
Existing methods in video action recognition rely on large labeled datasets from the same domain.
We propose a novel cross-domain few-shot video action recognition method that leverages self-supervised learning and curriculum learning.
arXiv Detail & Related papers (2023-09-07T19:44:27Z) - DOAD: Decoupled One Stage Action Detection Network [77.14883592642782]
Localizing people and recognizing their actions from videos is a challenging task towards high-level video understanding.
Existing methods are mostly two-stage based, with one stage for person bounding box generation and the other stage for action recognition.
We present a decoupled one-stage network dubbed DOAD to improve the efficiency of spatio-temporal action detection.
arXiv Detail & Related papers (2023-04-01T08:06:43Z) - DA-CIL: Towards Domain Adaptive Class-Incremental 3D Object Detection [2.207918236777924]
We propose a novel 3D domain adaptive class-incremental object detection framework, DA-CIL.
We design a novel dual-domain copy-paste augmentation method to construct multiple augmented domains for diversifying training distributions.
Experiments on various datasets demonstrate the effectiveness of the proposed method over baselines.
arXiv Detail & Related papers (2022-12-05T06:45:27Z) - Unsupervised Domain Adaptation for Video Transformers in Action Recognition [76.31442702219461]
We propose a simple and novel UDA approach for video action recognition.
Our approach builds a robust source model that better generalises to the target domain.
We report results on two video action recognition benchmarks for UDA.
arXiv Detail & Related papers (2022-07-26T12:17:39Z) - Audio-Adaptive Activity Recognition Across Video Domains [112.46638682143065]
We leverage activity sounds for domain adaptation as they have less variance across domains and can reliably indicate which activities are not happening.
We propose an audio-adaptive encoder and associated learning methods that discriminatively adjust the visual feature representation.
We also introduce the new task of actor shift, with a corresponding audio-visual dataset, to challenge our method with situations where the activity appearance changes dramatically.
arXiv Detail & Related papers (2022-03-27T08:15:20Z) - Efficient Modelling Across Time of Human Actions and Interactions [92.39082696657874]
We argue that current fixed-sized spatio-temporal kernels in 3D convolutional neural networks (CNNs) can be improved to better deal with temporal variations in the input.
We also study how to better handle variations between classes of actions, by enhancing their feature differences over different layers of the architecture.
The proposed approaches are evaluated on several benchmark action recognition datasets and show competitive results.
arXiv Detail & Related papers (2021-10-05T15:39:11Z) - ZSTAD: Zero-Shot Temporal Activity Detection [107.63759089583382]
We propose a novel task setting called zero-shot temporal activity detection (ZSTAD), where activities that have never been seen in training can still be detected.
We design an end-to-end deep network based on R-C3D as the architecture for this solution.
Experiments on both the THUMOS14 and the Charades datasets show promising performance in terms of detecting unseen activities.
arXiv Detail & Related papers (2020-03-12T02:40:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.