Domain Generalization through Audio-Visual Relative Norm Alignment in
First Person Action Recognition
- URL: http://arxiv.org/abs/2110.10101v1
- Date: Tue, 19 Oct 2021 16:52:39 GMT
- Title: Domain Generalization through Audio-Visual Relative Norm Alignment in
First Person Action Recognition
- Authors: Mirco Planamente, Chiara Plizzari, Emanuele Alberti, Barbara Caputo
- Abstract summary: First person action recognition is becoming an increasingly researched area thanks to the rising popularity of wearable cameras.
This is bringing to light cross-domain issues that are yet to be addressed in this context.
We introduce the first domain generalization approach for egocentric activity recognition, by proposing a new audio-visual loss.
- Score: 15.545769463854915
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: First person action recognition is becoming an increasingly researched area
thanks to the rising popularity of wearable cameras. This is bringing to light
cross-domain issues that are yet to be addressed in this context. Indeed, the
information extracted from learned representations suffers from an intrinsic
"environmental bias". This strongly affects the ability to generalize to unseen
scenarios, limiting the application of current methods to real settings where
labeled data are not available during training. In this work, we introduce the
first domain generalization approach for egocentric activity recognition, by
proposing a new audio-visual loss, called Relative Norm Alignment loss. It
re-balances the contributions from the two modalities during training, over
different domains, by aligning their feature norm representations. Our approach
leads to strong results in domain generalization on both EPIC-Kitchens-55 and
EPIC-Kitchens-100, as demonstrated by extensive experiments, and can also be
extended to domain adaptation settings with competitive results.
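The idea of aligning the feature norms of the two modalities can be illustrated with a small sketch. The abstract does not give the formula, so the form used here (the squared deviation from 1 of the ratio between the mean visual and mean audio feature norms) is an assumption, as are the function names and the `weight` parameter. It is not the authors' implementation.

```python
import math


def mean_feature_norm(features):
    """Average L2 norm over a batch of feature vectors (lists of floats)."""
    return sum(math.sqrt(sum(x * x for x in f)) for f in features) / len(features)


def relative_norm_alignment_loss(visual_feats, audio_feats, weight=1.0):
    """Hypothetical norm-alignment penalty (assumed form, see lead-in).

    The penalty is zero when the two modalities have the same mean feature
    norm and grows as one modality's norm dominates the other's.
    """
    ratio = mean_feature_norm(visual_feats) / mean_feature_norm(audio_feats)
    return weight * (ratio - 1.0) ** 2


# Toy batch: the visual features have twice the norm of the audio features.
visual = [[3.0, 4.0]]   # L2 norm 5.0
audio = [[0.0, 2.5]]    # L2 norm 2.5
loss = relative_norm_alignment_loss(visual, audio)  # ratio 2.0, so loss 1.0
```

In a training loop, a term like this would typically be added to the classification loss, so that minimizing it pushes the two feature extractors toward comparable norms rather than letting one modality dominate.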
Related papers
- A2XP: Towards Private Domain Generalization [0.0]
eXpert Prompts (A2XP) is a novel approach for domain generalization that preserves the privacy and integrity of the network architecture.
Our experiments demonstrate that A2XP achieves state-of-the-art results over existing non-private domain generalization methods.
arXiv Detail & Related papers (2023-11-17T05:49:50Z)
- From Denoising Training to Test-Time Adaptation: Enhancing Domain Generalization for Medical Image Segmentation [8.36463803956324]
We propose the Denoising Y-Net (DeY-Net), a novel approach incorporating an auxiliary denoising decoder into the basic U-Net architecture.
The auxiliary decoder aims to perform denoising training, augmenting the domain-invariant representation that facilitates domain generalization.
Building upon denoising training, we propose Denoising Test Time Adaptation (DeTTA) that further: (i) adapts the model to the target domain in a sample-wise manner, and (ii) adapts to the noise-corrupted input.
arXiv Detail & Related papers (2023-10-31T08:39:15Z)
- NormAUG: Normalization-guided Augmentation for Domain Generalization [60.159546669021346]
We propose a simple yet effective method called NormAUG (Normalization-guided Augmentation) for deep learning.
Our method introduces diverse information at the feature level and improves the generalization of the main path.
In the test stage, we leverage an ensemble strategy to combine the predictions from the auxiliary path of our model, further boosting performance.
arXiv Detail & Related papers (2023-07-25T13:35:45Z)
- PoliTO-IIT-CINI Submission to the EPIC-KITCHENS-100 Unsupervised Domain Adaptation Challenge for Action Recognition [16.496889090237232]
This report describes the technical details of our submission to the EPIC-Kitchens-100 Unsupervised Domain Adaptation Challenge in Action Recognition.
We first exploited a recent Domain Generalization technique, called Relative Norm Alignment (RNA).
Secondly, we extended this approach to work on unlabelled target data, enabling a simpler adaptation of the model to the target distribution in an unsupervised fashion.
arXiv Detail & Related papers (2022-09-09T21:03:11Z)
- Domain Adaptation with Adversarial Training on Penultimate Activations [82.9977759320565]
Enhancing model prediction confidence on unlabeled target data is an important objective in Unsupervised Domain Adaptation (UDA).
We show that this strategy is more efficient and better correlated with the objective of boosting prediction confidence than adversarial training on input images or intermediate features.
arXiv Detail & Related papers (2022-08-26T19:50:46Z)
- Localized Adversarial Domain Generalization [83.4195658745378]
Adversarial domain generalization is a popular approach to domain generalization.
We propose localized adversarial domain generalization with space compactness maintenance (LADG).
We conduct comprehensive experiments on the Wilds DG benchmark to validate our approach.
arXiv Detail & Related papers (2022-05-09T08:30:31Z)
- Towards Online Domain Adaptive Object Detection [79.89082006155135]
Existing object detection models assume both the training and test data are sampled from the same source domain.
We propose a novel unified adaptation framework that adapts and improves generalization on the target domain in online settings.
arXiv Detail & Related papers (2022-04-11T17:47:22Z)
- Cross-Domain First Person Audio-Visual Action Recognition through Relative Norm Alignment [15.545769463854915]
First person action recognition is an increasingly researched topic because of the growing popularity of wearable cameras.
This is bringing to light cross-domain issues that are yet to be addressed in this context.
We propose to leverage the intrinsically complementary nature of audio-visual signals to learn a representation that works well on data seen during training.
arXiv Detail & Related papers (2021-06-03T08:46:43Z)
- A Fourier-based Framework for Domain Generalization [82.54650565298418]
Domain generalization aims at tackling this problem by learning transferable knowledge from multiple source domains in order to generalize to unseen target domains.
This paper introduces a novel Fourier-based perspective for domain generalization.
Experiments on three benchmarks demonstrate that the proposed method achieves state-of-the-art performance for domain generalization.
arXiv Detail & Related papers (2021-05-24T06:50:30Z)
- Phase Consistent Ecological Domain Adaptation [76.75730500201536]
We focus on the task of semantic segmentation, where annotated synthetic data are aplenty, but annotating real data is laborious.
The first criterion, inspired by visual psychophysics, is that the map between the two image domains be phase-preserving.
The second criterion aims to leverage ecological statistics, or regularities in the scene which are manifest in any image of it, regardless of the characteristics of the illuminant or the imaging sensor.
arXiv Detail & Related papers (2020-04-10T06:58:03Z)