Can domain adaptation make object recognition work for everyone?
- URL: http://arxiv.org/abs/2204.11122v1
- Date: Sat, 23 Apr 2022 18:51:13 GMT
- Title: Can domain adaptation make object recognition work for everyone?
- Authors: Viraj Prabhu, Ramprasaath R. Selvaraju, Judy Hoffman, Nikhil Naik
- Abstract summary: Modern computer vision datasets significantly overrepresent the developed world and models trained on such datasets underperform on images from unseen geographies.
We investigate the effectiveness of unsupervised domain adaptation (UDA) of such models across geographies at closing this performance gap.
We demonstrate the inefficacy of standard DA methods at Geographical DA, highlighting the need for specialized geographical adaptation solutions.
- Score: 18.930805872127028
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite the rapid progress in deep visual recognition, modern computer vision
datasets significantly overrepresent the developed world and models trained on
such datasets underperform on images from unseen geographies. We investigate
the effectiveness of unsupervised domain adaptation (UDA) of such models across
geographies at closing this performance gap. To do so, we first curate two
shifts from existing datasets to study the Geographical DA problem, and
discover new challenges beyond data distribution shift: context shift, wherein
object surroundings may change significantly across geographies, and
subpopulation shift, wherein the intra-category distributions may shift. We
demonstrate the inefficacy of standard DA methods at Geographical DA,
highlighting the need for specialized geographical adaptation solutions to
address the challenge of making object recognition work for everyone.
Related papers
- Out-of-Distribution Detection on Graphs: A Survey [58.47395497985277]
Graph out-of-distribution (GOOD) detection focuses on identifying graph data that deviates from the distribution seen during training.
We categorize existing methods into four types: enhancement-based, reconstruction-based, information propagation-based, and classification-based approaches.
We discuss practical applications and theoretical foundations, highlighting the unique challenges posed by graph data.
arXiv Detail & Related papers (2025-02-12T04:07:12Z)
- HiGDA: Hierarchical Graph of Nodes to Learn Local-to-Global Topology for Semi-Supervised Domain Adaptation [0.18749305679160366]
We introduce a Hierarchical Graph of Nodes designed to simultaneously present representations at both feature and category levels.
In this study, we introduce a local graph to identify the most relevant patches within an image, facilitating adaptability to defined main object representations.
At the category level, we employ a global graph to aggregate the features from samples within the same category, thereby enriching overall representations.
arXiv Detail & Related papers (2024-12-16T14:35:52Z)
- World-Consistent Data Generation for Vision-and-Language Navigation [52.08816337783936]
Vision-and-Language Navigation (VLN) is a challenging task that requires an agent to navigate through photorealistic environments following natural-language instructions.
One main obstacle existing in VLN is data scarcity, leading to poor generalization performance over unseen environments.
We propose the world-consistent data generation (WCGEN), an efficacious data-augmentation framework satisfying both diversity and world-consistency.
arXiv Detail & Related papers (2024-12-09T11:40:54Z)
- GeoNet: Benchmarking Unsupervised Adaptation across Geographies [71.23141626803287]
We study the problem of geographic robustness and make three main contributions.
First, we introduce a large-scale dataset GeoNet for geographic adaptation.
Second, we hypothesize that the major source of domain shifts arises from significant variations in scene context.
Third, we conduct an extensive evaluation of several state-of-the-art unsupervised domain adaptation algorithms and architectures.
arXiv Detail & Related papers (2023-03-27T17:59:34Z)
- A General Purpose Neural Architecture for Geospatial Systems [142.43454584836812]
We present a roadmap towards the construction of a general-purpose neural architecture (GPNA) with a geospatial inductive bias.
We envision how such a model may facilitate cooperation between members of the community.
arXiv Detail & Related papers (2022-11-04T09:58:57Z)
- Deep face recognition with clustering based domain adaptation [57.29464116557734]
We propose a new clustering-based domain adaptation method designed for face recognition tasks in which the source and target domains do not share any classes.
Our method effectively learns discriminative target features by aligning the feature domains globally while, at the same time, distinguishing the target clusters locally.
arXiv Detail & Related papers (2022-05-27T12:29:11Z)
- Exploring Data Aggregation and Transformations to Generalize across Visual Domains [0.0]
This thesis contributes to research on Domain Generalization (DG), Domain Adaptation (DA) and their variations.
We propose new frameworks for Domain Generalization and Domain Adaptation which make use of feature aggregation strategies and visual transformations.
We show how our proposed solutions outperform competitive state-of-the-art approaches in established DG and DA benchmarks.
arXiv Detail & Related papers (2021-08-20T14:58:14Z)
- Domain Adaptation with Incomplete Target Domains [61.68950959231601]
We propose an Incomplete Data Imputation based Adversarial Network (IDIAN) model to address this new domain adaptation challenge.
In the proposed model, we design a data imputation module to fill the missing feature values based on the partial observations in the target domain.
We conduct experiments on both cross-domain benchmark tasks and a real world adaptation task with imperfect target domains.
arXiv Detail & Related papers (2020-12-03T00:07:40Z)
- DASGIL: Domain Adaptation for Semantic and Geometric-aware Image-based Localization [27.294822556484345]
Long-term visual localization under changing environments is a challenging problem in autonomous driving and mobile robotics.
We propose a novel multi-task architecture to fuse the geometric and semantic information into the multi-scale latent embedding representation for visual place recognition.
arXiv Detail & Related papers (2020-10-01T17:44:25Z)
- Meta-Learning for Few-Shot Land Cover Classification [3.8529010979482123]
We evaluate the model-agnostic meta-learning (MAML) algorithm on classification and segmentation tasks.
We find that few-shot model adaptation outperforms pre-training with regular gradient descent.
This indicates that model optimization with meta-learning may benefit tasks in the Earth sciences.
arXiv Detail & Related papers (2020-04-28T09:42:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.