Domain-Aware Continual Zero-Shot Learning
- URL: http://arxiv.org/abs/2112.12989v3
- Date: Tue, 12 Mar 2024 14:47:47 GMT
- Title: Domain-Aware Continual Zero-Shot Learning
- Authors: Kai Yi, Paul Janson, Wenxuan Zhang, Mohamed Elhoseiny
- Abstract summary: Domain-Aware Continual Zero-Shot Learning (DACZSL) is a task to recognize images of unseen categories in continuously changing domains.
We propose a Domain-Invariant Network (DIN) to learn factorized features for shifting domains and improved textual representation for unseen classes.
Our results show that DIN significantly outperforms existing baselines by over 5% in harmonic accuracy and over 1% in backward transfer.
- Score: 52.349332188116975
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern visual systems have a wide range of potential applications in vision
tasks for natural science research, such as aiding in species discovery,
monitoring animals in the wild, and so on. However, real-world vision tasks may
experience changes in environmental conditions, leading to shifts in how
captured images are presented. To address this issue, we introduce Domain-Aware
Continual Zero-Shot Learning (DACZSL), a task to recognize images of unseen
categories in continuously changing domains. Accordingly, we propose a
Domain-Invariant Network (DIN) to learn factorized features for shifting
domains and improved textual representation for unseen classes. DIN continually
learns a global shared network for domain-invariant and task-invariant
features, and per-task private networks for task-specific features.
Furthermore, we enhance the dual network with class-wise learnable prompts to
improve class-level text representation, thereby improving zero-shot prediction
of future unseen classes. To evaluate DACZSL, we introduce two benchmarks,
DomainNet-CZSL and iWildCam-CZSL. Our results show that DIN significantly
outperforms existing baselines by over 5% in harmonic accuracy and over 1% in
backward transfer and achieves a new SoTA.
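The dual-network design described in the abstract can be illustrated with a short, hypothetical sketch: a global shared encoder for domain- and task-invariant features, a per-task private encoder for task-specific features, and class-wise learnable prompt tokens whose text embeddings are scored against the fused image features for zero-shot prediction. All module shapes and names (`DomainInvariantNet`, `feat_dim`, `text_encoder`) and the stand-in pooling "encoder" in the toy usage are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a dual-network setup for domain-aware continual
# zero-shot learning (illustrative only; not the paper's released code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class DomainInvariantNet(nn.Module):
    def __init__(self, feat_dim=512, prompt_len=4, num_classes=100, num_tasks=5):
        super().__init__()
        # Global network shared across tasks/domains: domain- and task-invariant path.
        self.shared = nn.Sequential(
            nn.Linear(feat_dim, feat_dim), nn.ReLU(), nn.Linear(feat_dim, feat_dim))
        # One small private network per task: task-specific path.
        self.private = nn.ModuleList(
            [nn.Linear(feat_dim, feat_dim) for _ in range(num_tasks)])
        # Class-wise learnable prompt tokens, later encoded into class embeddings.
        self.class_prompts = nn.Parameter(
            0.02 * torch.randn(num_classes, prompt_len, feat_dim))

    def forward(self, img_feat, task_id, text_encoder):
        # img_feat: (B, feat_dim) features from a (frozen) visual backbone.
        # text_encoder: assumed callable mapping (C, prompt_len, feat_dim) prompt
        # tokens to (C, feat_dim) class embeddings, e.g. a frozen text model.
        shared_feat = self.shared(img_feat)
        private_feat = self.private[task_id](img_feat)
        fused = F.normalize(shared_feat + private_feat, dim=-1)
        class_emb = F.normalize(text_encoder(self.class_prompts), dim=-1)  # (C, D)
        return fused @ class_emb.t()  # (B, C) zero-shot logits


# Toy usage with a stand-in "text encoder" that simply pools the prompt tokens.
model = DomainInvariantNet()
logits = model(torch.randn(8, 512), task_id=0, text_encoder=lambda p: p.mean(dim=1))
print(logits.shape)  # torch.Size([8, 100])
```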
Related papers
- In the Era of Prompt Learning with Vision-Language Models [1.060608983034705]
We introduce StyLIP, a novel domain-agnostic prompt learning strategy for Domain Generalization (DG).
StyLIP disentangles visual style and content in CLIP's vision encoder by using style projectors to learn domain-specific prompt tokens.
We also propose AD-CLIP for unsupervised domain adaptation (DA), leveraging CLIP's frozen vision backbone.
arXiv Detail & Related papers (2024-11-07T17:31:21Z)
- Grounding Stylistic Domain Generalization with Quantitative Domain Shift Measures and Synthetic Scene Images [63.58800688320182]
Domain Generalization is a challenging task in machine learning.
Current methods lack a quantitative understanding of shifts in stylistic domains.
We introduce a new DG paradigm to address this gap.
arXiv Detail & Related papers (2024-05-24T22:13:31Z)
- Domain-Controlled Prompt Learning [49.45309818782329]
Existing prompt learning methods often lack domain-awareness or domain-transfer mechanisms.
We propose Domain-Controlled Prompt Learning for specific domains.
Our method achieves state-of-the-art performance on domain-specific image recognition datasets.
arXiv Detail & Related papers (2023-09-30T02:59:49Z)
- Multi-Scale and Multi-Layer Contrastive Learning for Domain Generalization [5.124256074746721]
We argue that the generalization ability of deep convolutional neural networks can be improved by taking advantage of multi-layer and multi-scale representations of the network.
We introduce a framework that aims at improving domain generalization of image classifiers by combining both low-level and high-level features at multiple scales.
We show that our model is able to surpass the performance of previous DG methods and consistently produce competitive, state-of-the-art results on all datasets.
arXiv Detail & Related papers (2023-08-28T08:54:27Z)
- Using Language to Extend to Unseen Domains [81.37175826824625]
It is expensive to collect training data for every possible domain that a vision model may encounter when deployed.
We consider how simply verbalizing the training domain, as well as the domains we want to extend to but lack data for, can improve robustness.
Using a multimodal model with a joint image and language embedding space, our method LADS learns a transformation of the image embeddings from the training domain to each unseen test domain.
arXiv Detail & Related papers (2022-10-18T01:14:02Z)
- Domain Invariant Masked Autoencoders for Self-supervised Learning from Multi-domains [73.54897096088149]
We propose a Domain-invariant Masked AutoEncoder (DiMAE) for self-supervised learning from multi-domains.
The core idea is to augment the input image with style noise from different domains and then reconstruct the image from the embedding of the augmented image.
Experiments on PACS and DomainNet illustrate that DiMAE achieves considerable gains compared with recent state-of-the-art methods.
arXiv Detail & Related papers (2022-05-10T09:49:40Z)
- Towards Recognizing Unseen Categories in Unseen Domains [74.29101415077523]
CuMix is a holistic algorithm to tackle Zero-Shot Learning (ZSL), Domain Adaptation, Domain Generalization (DG), and ZSL+DG.
The key idea of CuMix is to simulate the test-time shift toward unseen domains and categories by mixing the images and features available during training; a generic mixing sketch appears after this list.
Results on standard SL and DG datasets and on ZSL+DG using the DomainNet benchmark demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-07-23T21:09:28Z)
- Learning to adapt class-specific features across domains for semantic segmentation [36.36210909649728]
In this thesis, we present a novel architecture, which learns to adapt features across domains by taking into account per class information.
We adopt the recently introduced StarGAN architecture as the image translation backbone, since it is able to perform translations across multiple domains by means of a single generator network.
arXiv Detail & Related papers (2020-01-22T23:51:30Z)
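As a companion to the CuMix entry above, the snippet below gives a generic, hypothetical mixing sketch of the idea of simulating domain and semantic shift by interpolating training samples. The function name, the Beta-distributed mixing coefficient, and the one-hot label format are illustrative assumptions; the full method described in that paper is more involved than this sketch.

```python
# Generic mixup-style sketch (illustrative assumption, not the CuMix code):
# interpolate two training samples, possibly drawn from different source
# domains and classes, to simulate a combined domain + semantic shift.
import torch


def cross_domain_mixup(x_a, y_a, x_b, y_b, alpha=1.0):
    """x_*: input batches (e.g. images or features); y_*: one-hot labels."""
    lam = torch.distributions.Beta(alpha, alpha).sample()
    x_mix = lam * x_a + (1.0 - lam) * x_b   # mixed inputs
    y_mix = lam * y_a + (1.0 - lam) * y_b   # correspondingly soft labels
    return x_mix, y_mix


# Toy usage: two feature batches standing in for samples from two domains.
y = torch.eye(10)
x_mix, y_mix = cross_domain_mixup(
    torch.randn(8, 512), y[torch.randint(10, (8,))],
    torch.randn(8, 512), y[torch.randint(10, (8,))])
```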