CDFSL-V: Cross-Domain Few-Shot Learning for Videos
- URL: http://arxiv.org/abs/2309.03989v2
- Date: Fri, 15 Sep 2023 17:24:03 GMT
- Title: CDFSL-V: Cross-Domain Few-Shot Learning for Videos
- Authors: Sarinda Samarasinghe, Mamshad Nayeem Rizve, Navid Kardan, Mubarak Shah
- Abstract summary: Few-shot video action recognition is an effective approach to recognizing new categories with only a few labeled examples.
Existing methods in video action recognition rely on large labeled datasets from the same domain.
We propose a novel cross-domain few-shot video action recognition method that leverages self-supervised learning and curriculum learning.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Few-shot video action recognition is an effective approach to recognizing new
categories with only a few labeled examples, thereby reducing the challenges
associated with collecting and annotating large-scale video datasets. Existing
methods in video action recognition rely on large labeled datasets from the
same domain. However, this setup is not realistic as novel categories may come
from different data domains that may have different spatial and temporal
characteristics. This dissimilarity between the source and target domains can
pose a significant challenge, rendering traditional few-shot action recognition
techniques ineffective. To address this issue, in this work, we propose a novel
cross-domain few-shot video action recognition method that leverages
self-supervised learning and curriculum learning to balance the information
from the source and target domains. Specifically, our method employs a
masked autoencoder-based self-supervised training objective to learn from both
source and target data. A progressive curriculum then balances the
class-discriminative information learned from the labeled source dataset
against the generic information learned from the target domain: training
begins with supervised learning of class-discriminative features on the
source data and gradually transitions to learning target-domain-specific
features, allowing rich target-domain features to emerge on top of the
discriminative source features. We evaluate our method
on several challenging benchmark datasets and demonstrate that our approach
outperforms existing cross-domain few-shot learning techniques. Our code is
available at https://github.com/Sarinda251/CDFSL-V
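The curriculum described in the abstract, which favors supervised source learning early in training and self-supervised target learning later, can be sketched as a schedule that blends the two losses. This is a minimal sketch assuming a linear ramp; the paper's actual schedule and loss terms may differ, and the function names are illustrative:

```python
def curriculum_weight(step, total_steps):
    """Progressive curriculum weight: near 0 early in training (favoring the
    supervised source loss), approaching 1 late (favoring the self-supervised
    target loss). A simple linear ramp is assumed here."""
    return min(1.0, step / total_steps)

def combined_loss(sup_loss_source, ssl_loss_target, step, total_steps):
    """Blend class-discriminative source supervision with a
    masked-autoencoder-style target reconstruction loss."""
    lam = curriculum_weight(step, total_steps)
    return (1.0 - lam) * sup_loss_source + lam * ssl_loss_target
```

At step 0 the combined loss equals the supervised source loss; by the final step it equals the self-supervised target loss, with a smooth trade-off in between.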
Related papers
- Phrase Grounding-based Style Transfer for Single-Domain Generalized Object Detection
Single-domain generalized object detection aims to enhance a model's generalizability to multiple unseen target domains.
This is a practical yet challenging task as it requires the model to address domain shift without incorporating target domain data into training.
We propose a novel phrase grounding-based style transfer approach for the task.
arXiv Detail & Related papers (2024-02-02T10:48:43Z)
- Pulling Target to Source: A New Perspective on Domain Adaptive Semantic Segmentation
Domain adaptive semantic segmentation aims to transfer knowledge from a labeled source domain to an unlabeled target domain.
We propose T2S-DA, which we interpret as a form of pulling Target to Source for Domain Adaptation.
arXiv Detail & Related papers (2023-05-23T07:09:09Z)
- Simplifying Open-Set Video Domain Adaptation with Contrastive Learning
Unsupervised video domain adaptation methods have been proposed to adapt a predictive model from a labelled dataset to an unlabelled dataset.
We address a more realistic scenario, called open-set unsupervised video domain adaptation (OUVDA), where the target dataset contains "unknown" semantic categories that are not shared with the source.
We propose a video-oriented temporal contrastive loss that enables our method to better cluster the feature space by exploiting the freely available temporal information in video data.
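The temporal contrastive loss described above can be sketched as an InfoNCE-style objective in which two clips drawn from the same video form a free positive pair and clips from other videos act as negatives. This is a generic NumPy sketch, not the paper's exact formulation; the function name and temperature value are illustrative:

```python
import numpy as np

def temporal_contrastive_loss(z_a, z_b, temperature=0.1):
    """InfoNCE-style loss: z_a[i] and z_b[i] are embeddings of two clips
    sampled from the same video (a temporal positive pair); all other
    rows serve as negatives. Shapes: (N, D)."""
    # L2-normalize embeddings so similarities are cosine similarities
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature           # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # positives sit on the diagonal; minimize their negative log-likelihood
    return -np.mean(np.diag(log_prob))
```

When each clip's two embeddings agree and differ from all other videos, the loss approaches zero; mismatched pairs drive it up.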
arXiv Detail & Related papers (2023-01-09T13:16:50Z)
- Few-Shot Object Detection in Unseen Domains
Few-shot object detection (FSOD) has thrived in recent years to learn novel object classes with limited data.
We propose various data augmentation techniques on the few shots of novel classes to account for all possible domain-specific information.
Our experiments on the T-LESS dataset show that the proposed approach succeeds in alleviating the domain gap considerably.
arXiv Detail & Related papers (2022-04-11T13:16:41Z)
- Few-Shot Classification in Unseen Domains by Episodic Meta-Learning Across Visual Domains
Few-shot classification aims to carry out classification given only few labeled examples for the categories of interest.
In this paper, we present a unique learning framework for domain-generalized few-shot classification.
By advancing meta-learning strategies, our learning framework exploits data across multiple source domains to capture domain-invariant features.
arXiv Detail & Related papers (2021-12-27T06:54:11Z)
- Domain Adaptive Semantic Segmentation without Source Data
We investigate domain adaptive semantic segmentation without source data, which assumes that the model is pre-trained on the source domain.
We propose an effective framework for this challenging problem with two components: positive learning and negative learning.
Our framework can be easily implemented and incorporated with other methods to further enhance the performance.
arXiv Detail & Related papers (2021-10-13T04:12:27Z)
- Surprisingly Simple Semi-Supervised Domain Adaptation with Pretraining and Consistency
Visual domain adaptation involves learning to classify images from a target visual domain using labels available in a different source domain.
We show that in the presence of a few target labels, simple techniques like self-supervision (via rotation prediction) and consistency regularization can be effective without any adversarial alignment to learn a good target classifier.
Our Pretraining and Consistency (PAC) approach can achieve state-of-the-art accuracy on this semi-supervised domain adaptation task, surpassing multiple adversarial domain alignment methods across multiple datasets.
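The rotation-prediction self-supervision mentioned above can be sketched on the data side: each image is rotated by a random multiple of 90 degrees, and the rotation index becomes a free pseudo-label for a 4-way classification head. A minimal NumPy sketch under the assumption of square images; the helper name is illustrative, not from the paper:

```python
import numpy as np

def make_rotation_task(images):
    """Rotation-prediction self-supervision: rotate each image by
    0/90/180/270 degrees and record the rotation index as the label.
    `images` is (N, H, W, C) with H == W so rotations preserve shape."""
    rotated, labels = [], []
    for img in images:
        k = np.random.randint(4)            # rotation index in {0, 1, 2, 3}
        rotated.append(np.rot90(img, k=k))  # rotate by k * 90 degrees
        labels.append(k)
    return np.stack(rotated), np.array(labels)
```

A classifier trained to predict the label from the rotated image must learn orientation-sensitive features of the target domain without any human annotation.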
arXiv Detail & Related papers (2021-01-29T18:40:17Z)
- Domain Generalized Person Re-Identification via Cross-Domain Episodic Learning
We present an episodic learning scheme which advances meta learning strategies to exploit the observed source-domain labeled data.
Our experiments on four benchmark datasets confirm the superiority of our method over the state-of-the-arts.
arXiv Detail & Related papers (2020-10-19T14:42:29Z)
- A Review of Single-Source Deep Unsupervised Visual Domain Adaptation
Large-scale labeled training datasets have enabled deep neural networks to excel across a wide range of benchmark vision tasks.
In many applications, it is prohibitively expensive and time-consuming to obtain large quantities of labeled data.
To cope with limited labeled training data, many have attempted to directly apply models trained on a large-scale labeled source domain to another sparsely labeled or unlabeled target domain.
arXiv Detail & Related papers (2020-09-01T00:06:50Z)
- Learning to Cluster under Domain Shift
In this work we address the problem of transferring knowledge from a source to a target domain when both source and target data have no annotations.
Inspired by recent works on deep clustering, our approach leverages information from data gathered from multiple source domains.
We show that our method is able to automatically discover relevant semantic information even in the presence of few target samples.
arXiv Detail & Related papers (2020-08-11T12:03:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.