On the Challenges of Open World Recognition under Shifting Visual Domains
- URL: http://arxiv.org/abs/2107.04461v1
- Date: Fri, 9 Jul 2021 14:25:45 GMT
- Title: On the Challenges of Open World Recognition under Shifting Visual Domains
- Authors: Dario Fontanel, Fabio Cermelli, Massimiliano Mancini, Barbara Caputo
- Abstract summary: This work investigates whether Open World Recognition (OWR) algorithms are effective under domain-shift.
OWR has the goal to produce systems capable of breaking the semantic limits present in the initial training set.
Our analysis shows that this degradation is only slightly mitigated by coupling OWR with domain generalization techniques.
- Score: 23.999211737485812
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Robotic visual systems operating in the wild must act in unconstrained
scenarios, under different environmental conditions while facing a variety of
semantic concepts, including unknown ones. To this end, recent works tried to
empower visual object recognition methods with the capability to i) detect
unseen concepts and ii) extend their knowledge over time, as images of new
semantic classes arrive. This setting, called Open World Recognition (OWR), has
the goal to produce systems capable of breaking the semantic limits present in
the initial training set. However, this training set imposes on the system not
only its own semantic limits, but also environmental ones, due to its bias
toward certain acquisition conditions that do not necessarily reflect the high
variability of the real-world. This discrepancy between training and test
distribution is called domain-shift. This work investigates whether OWR
algorithms are effective under domain-shift, presenting the first benchmark
setup for fairly assessing the performance of OWR algorithms, with and without
domain-shift. We then use this benchmark to conduct analyses in various
scenarios, showing how existing OWR algorithms indeed suffer a severe
performance degradation when train and test distributions differ. Our analysis
shows that this degradation is only slightly mitigated by coupling OWR with
domain generalization techniques, indicating that the mere plug-and-play of
existing algorithms is not enough to recognize new and unknown categories in
unseen domains. Our results clearly point toward open issues and future
research directions that need to be investigated for building robot visual
systems able to function reliably under these challenging yet very real
conditions. Code available at
https://github.com/DarioFontanel/OWR-VisualDomains
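The two OWR capabilities named in the abstract, rejecting unseen concepts and adding new classes over time, can be illustrated with a toy sketch. This is not the authors' method or any of the benchmarked algorithms: it is a minimal nearest-class-mean classifier with a distance-based reject option, and the class name `NearestMeanOpenWorld`, the `reject_distance` threshold, the labels, and the 2-D feature values are all hypothetical choices made for illustration.

```python
import math


class NearestMeanOpenWorld:
    """Toy nearest-class-mean classifier with a distance-based reject option."""

    def __init__(self, reject_distance):
        self.reject_distance = reject_distance  # hypothetical threshold
        self.sums = {}    # label -> per-dimension sum of features
        self.counts = {}  # label -> number of samples seen

    def learn(self, features, label):
        """Add one labeled sample; a label not seen before creates a new class."""
        if label not in self.sums:
            self.sums[label] = [0.0] * len(features)
            self.counts[label] = 0
        self.sums[label] = [s + f for s, f in zip(self.sums[label], features)]
        self.counts[label] += 1

    def predict(self, features):
        """Return the nearest class label, or None if every class mean is too far."""
        best_label, best_dist = None, math.inf
        for label, sums in self.sums.items():
            mean = [s / self.counts[label] for s in sums]
            dist = math.dist(features, mean)
            if dist < best_dist:
                best_label, best_dist = label, dist
        return best_label if best_dist <= self.reject_distance else None


model = NearestMeanOpenWorld(reject_distance=1.0)
model.learn([0.0, 0.0], "mug")
model.learn([0.2, 0.0], "mug")
model.learn([5.0, 5.0], "book")

known = model.predict([0.1, 0.1])     # near the "mug" mean, so accepted
unknown = model.predict([10.0, 0.0])  # far from every known mean, so rejected
shifted = model.predict([1.6, 0.0])   # a "mug" whose features drifted past the threshold
model.learn([10.0, 0.0], "stapler")   # incremental step: add the new class
after = model.predict([10.0, 0.2])    # now recognized as "stapler"
```

In this toy setting, a known object whose features move away from its class mean (`shifted`) is rejected in the same way as a truly unknown one, which loosely mirrors why a fixed rejection rule can struggle when test conditions differ from training conditions. The numbers are invented and say nothing about the paper's actual results.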
Related papers
- Generalize or Detect? Towards Robust Semantic Segmentation Under Multiple Distribution Shifts [56.57141696245328]
In open-world scenarios, where both novel classes and domains may exist, an ideal segmentation model should detect anomaly classes for safety.
Existing methods often struggle to distinguish between domain-level and semantic-level distribution shifts.
arXiv Detail & Related papers (2024-11-06T11:03:02Z)
- GeneralAD: Anomaly Detection Across Domains by Attending to Distorted Features [68.14842693208465]
GeneralAD is an anomaly detection framework designed to operate in semantic, near-distribution, and industrial settings.
We propose a novel self-supervised anomaly generation module that employs straightforward operations like noise addition and shuffling to patch features.
We extensively evaluated our approach on ten datasets, achieving state-of-the-art results in six and on-par performance in the remaining four.
arXiv Detail & Related papers (2024-07-17T09:27:41Z)
- VALUED -- Vision and Logical Understanding Evaluation Dataset [1.8876415010297893]
We present the VALUE (Vision And Logical Understanding Evaluation) dataset, consisting of 200,000+ annotated images and an associated rule set.
The curated rule set considerably constrains the set of allowable predictions and is designed to probe key semantic abilities.
We analyze several popular and state-of-the-art vision models on this task and show that, although their performance on standard metrics is laudable, they produce a plethora of incoherent results.
arXiv Detail & Related papers (2023-11-21T13:52:31Z)
- Proposal-Level Unsupervised Domain Adaptation for Open World Unbiased Detector [35.334125159092025]
We build an unbiased foreground predictor by re-formulating the task under Unsupervised Domain Adaptation.
We adopt the simple and effective self-training method to learn a predictor based on the domain-invariant foreground features.
Our approach's pipeline can adapt to various detection frameworks and UDA methods, empirically validated by OWOD evaluation.
arXiv Detail & Related papers (2023-11-04T07:46:45Z)
- Activate and Reject: Towards Safe Domain Generalization under Category Shift [71.95548187205736]
We study a practical problem of Domain Generalization under Category Shift (DGCS).
It aims to simultaneously detect unknown-class samples and classify known-class samples in the target domains.
Compared to prior DG works, we face two new challenges: 1) how to learn the concept of "unknown" during training with only source known-class samples, and 2) how to adapt the source-trained model to unseen environments.
arXiv Detail & Related papers (2023-10-07T07:53:12Z)
- UDTIRI: An Online Open-Source Intelligent Road Inspection Benchmark Suite [21.565438268381467]
We introduce the road pothole detection task, the first online competition published within this benchmark suite.
Our benchmark provides a systematic and thorough evaluation of state-of-the-art object detection, semantic segmentation, and instance segmentation networks.
By providing algorithms with a more comprehensive understanding of diverse road conditions, we seek to unlock their untapped potential.
arXiv Detail & Related papers (2023-04-18T09:13:52Z)
- CoDEPS: Online Continual Learning for Depth Estimation and Panoptic Segmentation [28.782231314289174]
We introduce continual learning for deep learning-based monocular depth estimation and panoptic segmentation in new environments in an online manner.
We propose a novel domain-mixing strategy to generate pseudo-labels to adapt panoptic segmentation.
We explicitly address the limited storage capacity of robotic systems by leveraging sampling strategies for constructing a fixed-size replay buffer.
arXiv Detail & Related papers (2023-03-17T17:31:55Z)
- A Review of Single-Source Deep Unsupervised Visual Domain Adaptation [81.07994783143533]
Large-scale labeled training datasets have enabled deep neural networks to excel across a wide range of benchmark vision tasks.
In many applications, it is prohibitively expensive and time-consuming to obtain large quantities of labeled data.
To cope with limited labeled training data, many have attempted to directly apply models trained on a large-scale labeled source domain to another sparsely labeled or unlabeled target domain.
arXiv Detail & Related papers (2020-09-01T00:06:50Z)
- Cross-Domain Facial Expression Recognition: A Unified Evaluation Benchmark and Adversarial Graph Learning [85.6386289476598]
We develop a novel adversarial graph representation adaptation (AGRA) framework for cross-domain holistic-local feature co-adaptation.
We conduct extensive and fair evaluations on several popular benchmarks and show that the proposed AGRA framework outperforms previous state-of-the-art methods.
arXiv Detail & Related papers (2020-08-03T15:00:31Z)
- Anomaly Detection by One Class Latent Regularized Networks [36.67420338535258]
Semi-supervised Generative Adversarial Network (GAN)-based methods have recently been gaining popularity in anomaly detection tasks.
A novel adversarial dual autoencoder network is proposed, in which the underlying structure of training data is captured in latent feature space.
Experiments show that our model achieves the state-of-the-art results on MNIST and CIFAR10 datasets as well as GTSRB stop signs dataset.
arXiv Detail & Related papers (2020-02-05T02:21:52Z)
- Unsupervised Domain Adaptation in Person re-ID via k-Reciprocal Clustering and Large-Scale Heterogeneous Environment Synthesis [76.46004354572956]
We introduce an unsupervised domain adaptation approach for person re-identification.
Experimental results show that the proposed ktCUDA and SHRED approach achieves an average improvement of +5.7 mAP in re-identification performance.
arXiv Detail & Related papers (2020-01-14T17:43:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.