Aligning Correlation Information for Domain Adaptation in Action
Recognition
- URL: http://arxiv.org/abs/2107.04932v1
- Date: Sun, 11 Jul 2021 00:13:36 GMT
- Title: Aligning Correlation Information for Domain Adaptation in Action
Recognition
- Authors: Yuecong Xu, Jianfei Yang, Haozhi Cao, Kezhi Mao, Jianxiong Yin, Simon
See
- Abstract summary: We propose a novel Adversarial Correlation Adaptation Network (ACAN) to align action videos by aligning pixel correlations.
ACAN aims to minimize the distribution discrepancy of correlation information, termed the Pixel Correlation Discrepancy (PCD)
- Score: 14.586677030468339
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Domain adaptation (DA) approaches address domain shift and enable networks to
be applied to different scenarios. Although various image DA approaches have
been proposed in recent years, there is limited research towards video DA. This
is partly due to the complexity in adapting the different modalities of
features in videos, which includes the correlation features extracted as
long-term dependencies of pixels across spatiotemporal dimensions. The
correlation features are highly associated with action classes and have proven
effective for accurate video feature extraction through the supervised
action recognition task. Yet correlation features of the same action would
differ across domains due to domain shift. Therefore, we propose a novel
Adversarial Correlation Adaptation Network (ACAN) to align action videos by
aligning pixel correlations. ACAN aims to minimize the distribution
discrepancy of correlation information, termed the Pixel Correlation Discrepancy (PCD).
Additionally, video DA research is also limited by the lack of cross-domain
video datasets with larger domain shifts. We, therefore, introduce a novel
HMDB-ARID dataset with a larger domain shift caused by a larger statistical
difference between domains. This dataset is built in an effort to leverage
current datasets for dark video classification. Empirical results demonstrate
the state-of-the-art performance of our proposed ACAN for both existing and the
new video DA datasets.
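The abstract above describes aligning pixel correlations across domains and minimizing a Pixel Correlation Discrepancy (PCD), but gives no formulation. The following is a minimal, hypothetical sketch of that idea, not the paper's actual method: it computes a correlation (Gram-style) matrix over spatial positions of a feature map and measures a toy discrepancy between source and target correlation matrices. All function names, shapes, and the squared-difference distance are assumptions for illustration.

```python
import numpy as np

def pixel_correlation(feat):
    """Correlation matrix over pixel positions.

    feat: (C, N) array -- C channels, N flattened spatio-temporal positions.
    Returns an (N, N) matrix of cosine correlations between positions.
    """
    f = feat - feat.mean(axis=1, keepdims=True)        # center per channel
    f = f / (np.linalg.norm(f, axis=0, keepdims=True) + 1e-8)
    return f.T @ f

def correlation_discrepancy(src_feat, tgt_feat):
    """Toy stand-in for a correlation discrepancy: mean squared
    difference between source and target correlation matrices."""
    cs = pixel_correlation(src_feat)
    ct = pixel_correlation(tgt_feat)
    return float(np.mean((cs - ct) ** 2))

rng = np.random.default_rng(0)
src = rng.normal(size=(16, 8))   # hypothetical source features
tgt = rng.normal(size=(16, 8))   # hypothetical target features
d_shift = correlation_discrepancy(src, tgt)
d_self = correlation_discrepancy(src, src)
print(d_self, d_shift)  # identical features give zero discrepancy
```

In an adversarial setup such as the one the abstract names, a quantity like this would be driven down for the feature extractor while a domain discriminator pushes in the opposite direction; the paper itself should be consulted for the real PCD definition.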
Related papers
- Learning multi-domain feature relation for visible and Long-wave
Infrared image patch matching [39.88037892637296]
We present the largest visible and Long-wave Infrared (LWIR) image patch matching dataset, termed VL-CMIM.
In addition, a multi-domain feature relation learning network (MD-FRN) is proposed.
arXiv Detail & Related papers (2023-08-09T11:23:32Z)
- Exploring Few-Shot Adaptation for Activity Recognition on Diverse Domains [46.26074225989355]
Domain adaptation is essential for activity recognition to ensure accurate and robust performance across diverse environments.
In this work, we focus on Few-Shot Domain Adaptation for Activity Recognition (FSDA-AR), which leverages a very small amount of labeled target videos.
We propose a new FSDA-AR benchmark using five established datasets, considering adaptation to more diverse and challenging domains.
arXiv Detail & Related papers (2023-05-15T08:01:05Z)
- Unsupervised Video Domain Adaptation for Action Recognition: A Disentanglement Perspective [37.45565756522847]
We consider the generation of cross-domain videos from two sets of latent factors.
The TranSVAE framework is then developed to model such generation.
Experiments on the UCF-HMDB, Jester, and Epic-Kitchens datasets verify the effectiveness and superiority of TranSVAE.
arXiv Detail & Related papers (2022-08-15T17:59:31Z)
- Error-Aware Spatial Ensembles for Video Frame Interpolation [50.63021118973639]
Video frame interpolation (VFI) algorithms have improved considerably in recent years due to unprecedented progress in both data-driven algorithms and their implementations.
Recent research has introduced advanced motion estimation or novel warping methods as the means to address challenging VFI scenarios.
This work introduces such a solution. By closely examining the correlation between optical flow and interpolation error (IE), the paper proposes novel error prediction metrics that partition the middle frame into distinct regions corresponding to different IE levels.
arXiv Detail & Related papers (2022-07-25T16:15:38Z)
- Learning Cross-modal Contrastive Features for Video Domain Adaptation [138.75196499580804]
We propose a unified framework for video domain adaptation, which simultaneously regularizes cross-modal and cross-domain feature representations.
Specifically, we treat each modality in a domain as a view and leverage the contrastive learning technique with properly designed sampling strategies.
arXiv Detail & Related papers (2021-08-26T18:14:18Z)
- AFAN: Augmented Feature Alignment Network for Cross-Domain Object Detection [90.18752912204778]
Unsupervised domain adaptation for object detection is a challenging problem with many real-world applications.
We propose a novel augmented feature alignment network (AFAN) which integrates intermediate domain image generation and domain-adversarial training.
Our approach significantly outperforms the state-of-the-art methods on standard benchmarks for both similar and dissimilar domain adaptations.
arXiv Detail & Related papers (2021-06-10T05:01:20Z)
- Adapt Everywhere: Unsupervised Adaptation of Point-Clouds and Entropy Minimisation for Multi-modal Cardiac Image Segmentation [10.417009344120917]
We present a novel UDA method for multi-modal cardiac image segmentation.
The proposed method is based on adversarial learning and adapts network features between source and target domain in different spaces.
We validated our method on two cardiac datasets by adapting from the annotated source domain to the unannotated target domain.
arXiv Detail & Related papers (2021-03-15T08:59:44Z)
- Cross-Domain Facial Expression Recognition: A Unified Evaluation Benchmark and Adversarial Graph Learning [85.6386289476598]
We develop a novel adversarial graph representation adaptation (AGRA) framework for cross-domain holistic-local feature co-adaptation.
We conduct extensive and fair evaluations on several popular benchmarks and show that the proposed AGRA framework outperforms previous state-of-the-art methods.
arXiv Detail & Related papers (2020-08-03T15:00:31Z)
- Adversarial Bipartite Graph Learning for Video Domain Adaptation [50.68420708387015]
Domain adaptation techniques, which focus on adapting models between distributionally different domains, are rarely explored in the video recognition area.
Recent works on visual domain adaptation, which leverage adversarial learning to unify the source and target video representations, are not highly effective on videos.
This paper proposes an Adversarial Bipartite Graph (ABG) learning framework which directly models the source-target interactions.
arXiv Detail & Related papers (2020-07-31T03:48:41Z)
- Learning to Combine: Knowledge Aggregation for Multi-Source Domain Adaptation [56.694330303488435]
We propose a Learning to Combine for Multi-Source Domain Adaptation (LtC-MSDA) framework.
In a nutshell, a knowledge graph is constructed on the prototypes of various domains to realize information propagation among semantically adjacent representations.
Our approach outperforms existing methods by a remarkable margin.
arXiv Detail & Related papers (2020-07-17T07:52:44Z)
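The LtC-MSDA entry above mentions building a knowledge graph on domain prototypes to propagate information among semantically adjacent representations; no formulation is given in the summary. Below is a minimal, hypothetical sketch of one such propagation step, assuming similarity-weighted message passing over prototype vectors. The adjacency construction and the 0.5/0.5 mixing are illustrative assumptions, not the paper's design.

```python
import numpy as np

def propagate_prototypes(protos, steps=1):
    """Illustrative message passing over class prototypes:
    build a similarity-based adjacency, row-normalize it, and
    smooth each prototype toward its semantically adjacent ones."""
    sim = protos @ protos.T                      # pairwise prototype similarities
    np.fill_diagonal(sim, 0.0)                   # down-weight self-similarity
    adj = np.exp(sim) / np.exp(sim).sum(axis=1, keepdims=True)  # row-stochastic
    out = protos.copy()
    for _ in range(steps):
        out = 0.5 * out + 0.5 * (adj @ out)      # mix self and neighbor messages
    return out

rng = np.random.default_rng(1)
protos = rng.normal(size=(4, 8))  # hypothetical prototypes: 4 classes, 8-dim
smoothed = propagate_prototypes(protos)
print(smoothed.shape)
```

In a real multi-source setup, the propagated prototypes would then feed a classifier so that knowledge from semantically related domains is aggregated; the cited paper should be consulted for the actual graph construction.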
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.