The Unreasonable Effectiveness of Large Language-Vision Models for
Source-free Video Domain Adaptation
- URL: http://arxiv.org/abs/2308.09139v2
- Date: Tue, 22 Aug 2023 12:17:15 GMT
- Title: The Unreasonable Effectiveness of Large Language-Vision Models for
Source-free Video Domain Adaptation
- Authors: Giacomo Zara, Alessandro Conti, Subhankar Roy, Stéphane
Lathuilière, Paolo Rota, Elisa Ricci
- Abstract summary: The Source-Free Video Unsupervised Domain Adaptation (SFVUDA) task consists in adapting an action recognition model, trained on a labelled source dataset, to an unlabelled target dataset.
Previous approaches have attempted to address SFVUDA by leveraging self-supervision derived from the target data itself.
We take an orthogonal approach by exploiting "web-supervision" from Large Language-Vision Models (LLVMs), driven by the rationale that LLVMs contain a rich world prior surprisingly robust to domain-shift.
- Score: 56.61543110071199
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The Source-Free Video Unsupervised Domain Adaptation (SFVUDA) task consists in
adapting an action recognition model, trained on a labelled source dataset, to
an unlabelled target dataset, without accessing the actual source data.
Previous approaches have attempted to address SFVUDA by leveraging
self-supervision (e.g., enforcing temporal consistency) derived from the target
data itself. In this work, we take an orthogonal approach by exploiting
"web-supervision" from Large Language-Vision Models (LLVMs), driven by the
rationale that LLVMs contain a rich world prior surprisingly robust to
domain-shift. We showcase the unreasonable effectiveness of integrating LLVMs
for SFVUDA by devising an intuitive and parameter-efficient method, which we
name Domain Adaptation with Large Language-Vision models (DALL-V), that
distills the world prior and complementary source model information into a
student network tailored for the target. Despite its simplicity, DALL-V
achieves significant improvement over state-of-the-art SFVUDA methods.
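The distillation idea in the abstract can be illustrated with a minimal sketch. It assumes two teachers (LLVM zero-shot predictions and the source model) whose class probabilities are averaged and then matched by a student through a KL-divergence loss. The abstract does not specify DALL-V's actual losses or architecture, so the function names, the equal weighting, and the example logits below are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_teacher(llvm_logits, source_logits, weight=0.5):
    """Hypothetical teacher: convex combination of the two models' class
    probabilities. The 0.5 weighting is an assumption, not the paper's."""
    p_llvm = softmax(llvm_logits)
    p_src = softmax(source_logits)
    return [weight * a + (1.0 - weight) * b for a, b in zip(p_llvm, p_src)]


def distillation_loss(teacher_probs, student_logits):
    """KL(teacher || student), a common choice of distillation objective."""
    student_probs = softmax(student_logits)
    return sum(
        p * math.log(p / q)
        for p, q in zip(teacher_probs, student_probs)
        if p > 0.0
    )


# Made-up logits for one unlabelled target video over three action classes.
teacher = ensemble_teacher([2.0, 0.5, -1.0], [1.0, 1.5, -0.5])

# A student that disagrees with the teacher incurs a positive loss ...
loss_far = distillation_loss(teacher, [-1.0, 0.0, 2.0])
# ... while a student that reproduces the teacher's distribution incurs ~0.
loss_near = distillation_loss(teacher, [math.log(p) for p in teacher])
```

Minimising this loss over unlabelled target videos would push the student towards the combined teacher signal without using any source data, which is the constraint the source-free setting imposes.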
Related papers
- Learn from the Learnt: Source-Free Active Domain Adaptation via Contrastive Sampling and Visual Persistence [60.37934652213881]
Domain Adaptation (DA) facilitates knowledge transfer from a source domain to a related target domain.
This paper investigates a practical DA paradigm, namely Source data-Free Active Domain Adaptation (SFADA), where source data becomes inaccessible during adaptation.
We present learn from the learnt (LFTL), a novel paradigm for SFADA to leverage the learnt knowledge from the source pretrained model and actively iterated models without extra overhead.
arXiv Detail & Related papers (2024-07-26T17:51:58Z)
- Source-Free Domain Adaptation Guided by Vision and Vision-Language Pre-Training [23.56208527227504]
Source-free domain adaptation (SFDA) aims to adapt a source model trained on a fully-labeled source domain to a related but unlabeled target domain.
In the conventional SFDA pipeline, a feature extractor pre-trained on large-scale data (e.g. ImageNet) is used to initialize the source model.
We introduce an integrated framework to incorporate pre-trained networks into the target adaptation process.
arXiv Detail & Related papers (2024-05-05T14:48:13Z)
- Open-Set Domain Adaptation with Visual-Language Foundation Models [51.49854335102149]
Unsupervised domain adaptation (UDA) has proven to be very effective in transferring knowledge from a source domain to a target domain with unlabeled data.
Open-set domain adaptation (ODA) has emerged as a potential solution to identify classes that are unknown to the source domain during the training phase.
arXiv Detail & Related papers (2023-07-30T11:38:46Z)
- Cluster-level pseudo-labelling for source-free cross-domain facial expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER).
Our method exploits self-supervised pretraining to learn good feature representations from the target data.
We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z)
- Instance Relation Graph Guided Source-Free Domain Adaptive Object Detection [79.89082006155135]
Unsupervised Domain Adaptation (UDA) is an effective approach to tackle the issue of domain shift.
UDA methods try to align the source and target representations to improve the generalization on the target domain.
The Source-Free Domain Adaptation (SFDA) setting aims to alleviate these concerns by adapting a source-trained model for the target domain without requiring access to the source data.
arXiv Detail & Related papers (2022-03-29T17:50:43Z)
- Source-Free Domain Adaptation for Semantic Segmentation [11.722728148523366]
Unsupervised Domain Adaptation (UDA) can tackle the challenge that convolutional neural network-based approaches for semantic segmentation rely heavily on pixel-level annotated data.
We propose a source-free domain adaptation framework for semantic segmentation, namely SFDA, in which only a well-trained source model and an unlabeled target domain dataset are available for adaptation.
arXiv Detail & Related papers (2021-03-30T14:14:29Z)
- Do We Really Need to Access the Source Data? Source Hypothesis Transfer for Unsupervised Domain Adaptation [102.67010690592011]
Unsupervised domain adaptation (UDA) aims to leverage the knowledge learned from a labeled source dataset to solve similar tasks in a new unlabeled domain.
Prior UDA methods typically require access to the source data when learning to adapt the model.
This work tackles a practical setting where only a trained source model is available, and investigates how to effectively utilize such a model without source data to solve UDA problems.
arXiv Detail & Related papers (2020-02-20T03:13:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.