CNNs with Multi-Level Attention for Domain Generalization
- URL: http://arxiv.org/abs/2304.00502v1
- Date: Sun, 2 Apr 2023 10:34:40 GMT
- Title: CNNs with Multi-Level Attention for Domain Generalization
- Authors: Aristotelis Ballas and Christos Diou
- Abstract summary: Deep convolutional neural networks have achieved significant success in image classification and ranking.
Deep convolutional neural networks suffer from performance degradation when tested on out-of-distribution scenarios.
We propose an alternative neural network architecture for robust, out-of-distribution image classification.
- Score: 3.1372269816123994
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the past decade, deep convolutional neural networks have achieved
significant success in image classification and ranking and have therefore
found numerous applications in multimedia content retrieval. Still, these
models suffer from performance degradation when they are tested on
out-of-distribution scenarios or on data originating from previously unseen
data Domains. In the present work, we focus on this problem of Domain
Generalization and propose an alternative neural network architecture for
robust, out-of-distribution image classification. We attempt to produce a model
that focuses on the causal features of the depicted class for robust image
classification in the Domain Generalization setting. To achieve this, we
propose attending to multiple levels of information throughout a Convolutional
Neural Network and leveraging the most important attributes of an image by
employing trainable attention mechanisms. To validate our method, we evaluate
our model on four widely accepted Domain Generalization benchmarks, on which
our model is able to surpass previously reported baselines in three out of four
datasets and achieve the second best score in the fourth one.
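The idea of attending to several levels of a CNN can be illustrated with a minimal sketch. This is not the authors' architecture: it assumes each level's feature map has been flattened into a list of per-position feature vectors, and it uses a simple dot-product score against one query vector per level. The names `attention_pool`, `multi_level_descriptor`, and the level shapes are hypothetical, and the query vectors are drawn at random here, whereas in a real model they would be trained.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features, query):
    """Pool one level's feature map into a single vector.

    features: list of spatial positions, each a list of C floats.
    query:    list of C floats (a trainable parameter in a real model).
    Returns the attention-weighted sum of the positions and the weights.
    """
    scores = [sum(f * q for f, q in zip(vec, query)) for vec in features]
    weights = softmax(scores)
    channels = len(query)
    pooled = [
        sum(w * vec[c] for w, vec in zip(weights, features))
        for c in range(channels)
    ]
    return pooled, weights


def multi_level_descriptor(levels, queries):
    """Concatenate the attention-pooled vectors of every level."""
    descriptor = []
    for features, query in zip(levels, queries):
        pooled, _ = attention_pool(features, query)
        descriptor.extend(pooled)
    return descriptor


rng = random.Random(0)
# Three hypothetical levels as (number of positions, channels).
shapes = [(16, 4), (8, 6), (4, 8)]
levels = [
    [[rng.gauss(0, 1) for _ in range(c)] for _ in range(n)]
    for n, c in shapes
]
queries = [[rng.gauss(0, 1) for _ in range(c)] for _, c in shapes]

descriptor = multi_level_descriptor(levels, queries)
print(len(descriptor))  # 4 + 6 + 8 = 18
```

In this sketch the concatenated descriptor would feed a classifier, so that early, mid, and late features each contribute through their own attention weights.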
Related papers
- Learning to Generalize Unseen Domains via Multi-Source Meta Learning for Text Classification [71.08024880298613]
We study the multi-source Domain Generalization of text classification.
We propose a framework to use multiple seen domains to train a model that can achieve high accuracy in an unseen domain.
arXiv Detail & Related papers (2024-09-20T07:46:21Z)
- GM-DF: Generalized Multi-Scenario Deepfake Detection [49.072106087564144]
Existing face forgery detection usually follows the paradigm of training models in a single domain.
In this paper, we elaborately investigate the generalization capacity of deepfake detection models when jointly trained on multiple face forgery detection datasets.
arXiv Detail & Related papers (2024-06-28T17:42:08Z)
- Visual Foundation Models Boost Cross-Modal Unsupervised Domain Adaptation for 3D Semantic Segmentation [17.875516787157018]
We study how to harness the knowledge priors learned by 2D visual foundation models to produce more accurate labels for unlabeled target domains.
Our method is evaluated on various autonomous driving datasets, and the results demonstrate a significant improvement on the 3D segmentation task.
arXiv Detail & Related papers (2024-03-15T03:58:17Z)
- Multi-Scale and Multi-Layer Contrastive Learning for Domain Generalization [5.124256074746721]
We argue that the generalization ability of deep convolutional neural networks can be improved by taking advantage of multi-layer and multi-scaled representations of the network.
We introduce a framework that aims at improving domain generalization of image classifiers by combining both low-level and high-level features at multiple scales.
We show that our model is able to surpass the performance of previous DG methods and consistently produce competitive and state-of-the-art results in all datasets.
arXiv Detail & Related papers (2023-08-28T08:54:27Z)
- Treasure in Distribution: A Domain Randomization based Multi-Source Domain Generalization for 2D Medical Image Segmentation [20.97329150274455]
We propose a multi-source domain generalization method called Treasure in Distribution (TriD).
TriD constructs an unprecedented search space to obtain the model with strong robustness by randomly sampling from a uniform distribution.
Experiments on two medical segmentation tasks demonstrate that our TriD achieves superior generalization performance on unseen target-domain data.
arXiv Detail & Related papers (2023-05-31T15:33:57Z)
- CNN Feature Map Augmentation for Single-Source Domain Generalization [6.053629733936548]
Domain Generalization (DG) has gained significant traction during the past few years.
The goal in DG is to produce models which continue to perform well when presented with data distributions different from the ones available during training.
We propose an alternative regularization technique for convolutional neural network architectures in the single-source DG image classification setting.
arXiv Detail & Related papers (2023-05-26T08:48:17Z)
- Discovering Spatial Relationships by Transformers for Domain Generalization [8.106918528575267]
Domain generalization is a challenging problem thanks to the fast development of AI techniques in computer vision.
Most advanced algorithms are proposed with deep architectures based on convolutional neural networks (CNNs).
arXiv Detail & Related papers (2021-08-23T10:35:38Z)
- Polynomial Networks in Deep Classifiers [55.90321402256631]
We cast the study of deep neural networks under a unifying framework.
Our framework provides insights on the inductive biases of each model.
The efficacy of the proposed models is evaluated on standard image and audio classification benchmarks.
arXiv Detail & Related papers (2021-04-16T06:41:20Z)
- Probabilistic Graph Attention Network with Conditional Kernels for Pixel-Wise Prediction [158.88345945211185]
We present a novel approach that advances the state of the art on pixel-level prediction in a fundamental aspect, i.e. structured multi-scale features learning and fusion.
We propose a probabilistic graph attention network structure based on a novel Attention-Gated Conditional Random Fields (AG-CRFs) model for learning and fusing multi-scale representations in a principled manner.
arXiv Detail & Related papers (2021-01-08T04:14:29Z)
- DoFE: Domain-oriented Feature Embedding for Generalizable Fundus Image Segmentation on Unseen Datasets [96.92018649136217]
We present a novel Domain-oriented Feature Embedding (DoFE) framework to improve the generalization ability of CNNs on unseen target domains.
Our DoFE framework dynamically enriches the image features with additional domain prior knowledge learned from multi-source domains.
Our framework generates satisfying segmentation results on unseen datasets and surpasses other domain generalization and network regularization methods.
arXiv Detail & Related papers (2020-10-13T07:28:39Z)
- Adversarial Bipartite Graph Learning for Video Domain Adaptation [50.68420708387015]
Domain adaptation techniques, which focus on adapting models between distributionally different domains, are rarely explored in the video recognition area.
Recent works on visual domain adaptation which leverage adversarial learning to unify the source and target video representations are not highly effective on the videos.
This paper proposes an Adversarial Bipartite Graph (ABG) learning framework which directly models the source-target interactions.
arXiv Detail & Related papers (2020-07-31T03:48:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.