Multi-Scale and Multi-Layer Contrastive Learning for Domain Generalization
- URL: http://arxiv.org/abs/2308.14418v5
- Date: Fri, 10 May 2024 08:09:20 GMT
- Title: Multi-Scale and Multi-Layer Contrastive Learning for Domain Generalization
- Authors: Aristotelis Ballas, Christos Diou
- Abstract summary: We argue that the generalization ability of deep convolutional neural networks can be improved by taking advantage of multi-layer and multi-scaled representations of the network.
We introduce a framework that aims at improving domain generalization of image classifiers by combining both low-level and high-level features at multiple scales.
We show that our model is able to surpass the performance of previous DG methods and consistently produce competitive and state-of-the-art results in all datasets.
- Score: 5.124256074746721
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: During the past decade, deep neural networks have led to fast-paced progress and significant achievements in computer vision problems, for both academia and industry. Yet despite their success, state-of-the-art image classification approaches fail to generalize well in previously unseen visual contexts, as required by many real-world applications. In this paper, we focus on this domain generalization (DG) problem and argue that the generalization ability of deep convolutional neural networks can be improved by taking advantage of multi-layer and multi-scaled representations of the network. We introduce a framework that aims at improving domain generalization of image classifiers by combining both low-level and high-level features at multiple scales, enabling the network to implicitly disentangle representations in its latent space and learn domain-invariant attributes of the depicted objects. Additionally, to further facilitate robust representation learning, we propose a novel objective function, inspired by contrastive learning, which aims at constraining the extracted representations to remain invariant under distribution shifts. We demonstrate the effectiveness of our method by evaluating on the domain generalization datasets of PACS, VLCS, Office-Home and NICO. Through extensive experimentation, we show that our model is able to surpass the performance of previous DG methods and consistently produce competitive and state-of-the-art results in all datasets.
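A rough sketch of the idea described in the abstract, assuming a PyTorch ResNet-18 backbone: intermediate features are pooled at several depths into a single embedding, and a contrastive-style loss keeps embeddings of paired samples from different source domains close. The names (`MultiScaleEncoder`, `invariance_loss`), the choice of tap points, and the exact loss form are illustrative assumptions, not the authors' released implementation.

```python
# Illustrative sketch only: multi-layer / multi-scale feature aggregation plus a
# contrastive-style invariance objective, in the spirit of the abstract above.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet18


class MultiScaleEncoder(nn.Module):
    """Hypothetical encoder: pools features from several ResNet depths into one embedding."""

    def __init__(self, embed_dim: int = 256, num_classes: int = 7):
        super().__init__()
        self.backbone = resnet18(weights=None)
        # Output channels of layer1..layer4 in ResNet-18 (low- to high-level features).
        tap_dims = [64, 128, 256, 512]
        self.proj = nn.Linear(sum(tap_dims), embed_dim)
        self.classifier = nn.Linear(embed_dim, num_classes)

    def forward(self, x):
        b = self.backbone
        x = b.maxpool(b.relu(b.bn1(b.conv1(x))))
        feats = []
        for layer in (b.layer1, b.layer2, b.layer3, b.layer4):
            x = layer(x)
            # Global-average-pool each scale so low- and high-level maps all
            # contribute to a single fixed-size representation.
            feats.append(F.adaptive_avg_pool2d(x, 1).flatten(1))
        z = F.normalize(self.proj(torch.cat(feats, dim=1)), dim=1)
        return z, self.classifier(z)


def invariance_loss(z_a, z_b, temperature: float = 0.1):
    """InfoNCE-style loss (assumed form): pull together index-aligned embeddings
    from two source domains, push apart all other pairs in the batch."""
    logits = z_a @ z_b.t() / temperature
    targets = torch.arange(z_a.size(0), device=z_a.device)
    return F.cross_entropy(logits, targets)


# Usage sketch: x_a and x_b are index-aligned batches drawn from two source domains.
model = MultiScaleEncoder()
x_a, x_b = torch.randn(8, 3, 224, 224), torch.randn(8, 3, 224, 224)
y = torch.randint(0, 7, (8,))
(z_a, logits_a), (z_b, _) = model(x_a), model(x_b)
loss = F.cross_entropy(logits_a, y) + invariance_loss(z_a, z_b)
```

In practice the paired batches would come from different source domains of PACS, VLCS, Office-Home or NICO, and the classification and invariance terms would be weighted against each other.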
Related papers
- MLDGG: Meta-Learning for Domain Generalization on Graphs [9.872254367103057]
Domain generalization on graphs aims to develop models with robust generalization capabilities.
Our framework, MLDGG, endeavors to achieve adaptable generalization across diverse domains by integrating cross-multi-domain meta-learning.
Our empirical results demonstrate that MLDGG surpasses baseline methods, showcasing its effectiveness in three different distribution shift settings.
arXiv Detail & Related papers (2024-11-19T22:57:38Z)
- Robust Domain Generalization for Multi-modal Object Recognition [14.128747255526012]
In multi-label classification, models face the challenge of domain generalization when handling tasks whose distributions differ from the training data.
Recent advancements in vision-language pre-training leverage supervision from extensive visual-language pairs, enabling learning across diverse domains.
This paper proposes solutions by inferring the actual loss, broadening evaluations to larger vision-language backbones, and introducing Mixup-CLIPood.
arXiv Detail & Related papers (2024-08-11T17:13:21Z)
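The entry above names Mixup-CLIPood but does not spell out its formulation, so the following is only a generic input-mixup sketch (convex combinations of images and labels) to illustrate the augmentation the name builds on; the `alpha` value and training usage are assumptions, not the paper's method.

```python
# Generic input mixup (illustrative; not Mixup-CLIPood itself): convex-combine
# pairs of images and labels so the classifier trains on in-between samples,
# which tends to smooth decision boundaries across domains.
import torch
import torch.nn.functional as F


def mixup_batch(x, y, num_classes: int, alpha: float = 0.4):
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_onehot = F.one_hot(y, num_classes).float()
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mix, y_mix


# Usage: train with the resulting soft targets.
x, y = torch.randn(16, 3, 224, 224), torch.randint(0, 10, (16,))
x_mix, y_mix = mixup_batch(x, y, num_classes=10)
# loss = -(y_mix * F.log_softmax(model(x_mix), dim=1)).sum(dim=1).mean()
```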
- Style-Hallucinated Dual Consistency Learning: A Unified Framework for Visual Domain Generalization [113.03189252044773]
We propose a unified framework, Style-HAllucinated Dual consistEncy learning (SHADE), to handle domain shift in various visual tasks.
Our versatile SHADE can significantly enhance the generalization in various visual recognition tasks, including image classification, semantic segmentation and object detection.
arXiv Detail & Related papers (2022-12-18T11:42:51Z)
- Improving Diversity with Adversarially Learned Transformations for Domain Generalization [81.26960899663601]
We present a novel framework that uses adversarially learned transformations (ALT), in which a neural network models plausible yet hard image transformations.
We show that ALT can naturally work with existing diversity modules to produce highly distinct and large transformations of the source domain, leading to state-of-the-art performance.
arXiv Detail & Related papers (2022-06-15T18:05:24Z)
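As a rough illustration of the adversarially learned transformation idea described in the ALT entry above (not the authors' implementation), the sketch below trains a small augmentation network to make images harder for the classifier, while the classifier trains on both clean and transformed views. Network sizes, learning rates, and the clamping range are placeholder assumptions.

```python
# Illustrative adversarial-augmentation loop (the general idea behind ALT-style
# methods; not the paper's code). The augmenter maximises the classifier's loss,
# the classifier minimises it on clean and transformed images.
import torch
import torch.nn as nn
import torch.nn.functional as F

aug_net = nn.Sequential(              # hypothetical lightweight image-to-image net
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 3, 3, padding=1), nn.Tanh(),
)
classifier = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                           nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10))
opt_aug = torch.optim.SGD(aug_net.parameters(), lr=0.1)
opt_cls = torch.optim.SGD(classifier.parameters(), lr=0.01)

x, y = torch.randn(8, 3, 64, 64), torch.randint(0, 10, (8,))

# Step 1: update the augmenter so the transformed images become *harder*.
x_adv = torch.clamp(x + aug_net(x), -1.0, 1.0)      # bounded perturbation keeps images plausible
loss_aug = -F.cross_entropy(classifier(x_adv), y)   # ascend on the classifier loss
opt_aug.zero_grad(); loss_aug.backward(); opt_aug.step()

# Step 2: update the classifier on clean and transformed views.
x_adv = torch.clamp(x + aug_net(x), -1.0, 1.0).detach()
loss_cls = F.cross_entropy(classifier(x), y) + F.cross_entropy(classifier(x_adv), y)
opt_cls.zero_grad(); loss_cls.backward(); opt_cls.step()
```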
- Deep face recognition with clustering based domain adaptation [57.29464116557734]
We propose a new clustering-based domain adaptation method designed for the face recognition task, in which the source and target domains do not share any classes.
Our method effectively learns discriminative target features by aligning the feature domain globally and, at the same time, distinguishing the target clusters locally.
arXiv Detail & Related papers (2022-05-27T12:29:11Z)
- Exploring Data Aggregation and Transformations to Generalize across Visual Domains [0.0]
This thesis contributes to research on Domain Generalization (DG), Domain Adaptation (DA) and their variations.
We propose new frameworks for Domain Generalization and Domain Adaptation which make use of feature aggregation strategies and visual transformations.
We show how our proposed solutions outperform competitive state-of-the-art approaches in established DG and DA benchmarks.
arXiv Detail & Related papers (2021-08-20T14:58:14Z)
- Explainability-aided Domain Generalization for Image Classification [0.0]
We show that applying methods and architectures from the explainability literature can achieve state-of-the-art performance for the challenging task of domain generalization.
We develop a set of novel algorithms including DivCAM, an approach where the network receives guidance during training via gradient-based class activation maps to focus on a diverse set of discriminative features.
Since these methods offer competitive performance on top of explainability, we argue that the proposed methods can be used as a tool to improve the robustness of deep neural network architectures.
arXiv Detail & Related papers (2021-04-05T02:27:01Z)
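The DivCAM entry above relies on gradient-based class activation maps as a training signal. Below is a minimal generic Grad-CAM computation, the ingredient such guidance builds on, not the DivCAM objective itself; the hook-based extraction and the choice of `layer4` as target layer are assumptions.

```python
# Minimal generic Grad-CAM sketch (not the DivCAM training loss).
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

model = resnet18(weights=None).eval()
acts, grads = {}, {}
layer = model.layer4                                   # assumed target layer

layer.register_forward_hook(lambda m, i, o: acts.update(a=o))
layer.register_full_backward_hook(lambda m, gi, go: grads.update(g=go[0]))

x = torch.randn(1, 3, 224, 224)
scores = model(x)
cls = scores[0].argmax()
scores[0, cls].backward()                              # gradient of the top class score

weights = grads["g"].mean(dim=(2, 3), keepdim=True)    # channel-wise importance
cam = F.relu((weights * acts["a"]).sum(dim=1))         # weighted sum of activation maps
cam = F.interpolate(cam.unsqueeze(1), size=x.shape[-2:], mode="bilinear",
                    align_corners=False).squeeze(1)    # upsample to input resolution
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # normalise to [0, 1]
```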
- Joint Learning of Neural Transfer and Architecture Adaptation for Image Recognition [77.95361323613147]
Current state-of-the-art visual recognition systems rely on pretraining a neural network on a large-scale dataset and finetuning the network weights on a smaller dataset.
In this work, we prove that dynamically adapting network architectures tailored to each domain task, along with weight finetuning, benefits both efficiency and effectiveness.
Our method can be easily generalized to an unsupervised paradigm by replacing supernet training with self-supervised learning in the source domain tasks and performing linear evaluation in the downstream tasks.
arXiv Detail & Related papers (2021-03-31T08:15:17Z)
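The entry above mentions performing linear evaluation on downstream tasks after self-supervised training. As a reference point, here is a hedged sketch of the standard linear-evaluation protocol (freeze the encoder, train only a linear head); it is not the paper's joint architecture-adaptation method, and the backbone and dimensions are placeholders.

```python
# Standard linear-evaluation protocol (generic sketch): freeze a pretrained
# encoder and train only a linear classifier on top of its features.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet18

encoder = resnet18(weights=None)      # in practice: load pretrained / self-supervised weights
encoder.fc = nn.Identity()            # expose 512-d features instead of logits
for p in encoder.parameters():
    p.requires_grad = False           # backbone stays frozen during linear evaluation
encoder.eval()

head = nn.Linear(512, 10)             # only the linear head is trained
opt = torch.optim.SGD(head.parameters(), lr=0.1)

x, y = torch.randn(32, 3, 224, 224), torch.randint(0, 10, (32,))
with torch.no_grad():
    feats = encoder(x)                # frozen features
loss = F.cross_entropy(head(feats), y)
opt.zero_grad(); loss.backward(); opt.step()
```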
- Deep Partial Multi-View Learning [94.39367390062831]
We propose a novel framework termed Cross Partial Multi-View Networks (CPM-Nets).
We first provide a formal definition of completeness and versatility for multi-view representation.
We then theoretically prove the versatility of the learned latent representations.
arXiv Detail & Related papers (2020-11-12T02:29:29Z)
- Learning from Extrinsic and Intrinsic Supervisions for Domain Generalization [95.73898853032865]
We present a new domain generalization framework that learns how to generalize across domains simultaneously from extrinsic and intrinsic supervisions.
We demonstrate the effectiveness of our approach on two standard object recognition benchmarks.
arXiv Detail & Related papers (2020-07-18T03:12:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.