Boosting Deep Open World Recognition by Clustering
- URL: http://arxiv.org/abs/2004.13849v2
- Date: Mon, 30 Nov 2020 09:35:37 GMT
- Title: Boosting Deep Open World Recognition by Clustering
- Authors: Dario Fontanel, Fabio Cermelli, Massimiliano Mancini, Samuel Rota
Bul\`o, Elisa Ricci, Barbara Caputo
- Abstract summary: We show how we can boost the performance of deep open world recognition algorithms by means of a new loss formulation.
We propose a strategy to learn class-specific rejection thresholds, instead of estimating a single global threshold.
Experiments on RGB-D Object and Core50 show the effectiveness of our approach.
- Score: 37.5993398894786
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While convolutional neural networks have brought significant advances in
robot vision, their ability is often limited to closed world scenarios, where
the number of semantic concepts to be recognized is determined by the available
training set. Since it is practically impossible to capture all possible
semantic concepts present in the real world in a single training set, we need
to break the closed world assumption, equipping our robot with the capability
to act in an open world. To provide such ability, a robot vision system should
be able to (i) identify whether an instance does not belong to the set of known
categories (i.e. open set recognition), and (ii) extend its knowledge to learn
new classes over time (i.e. incremental learning). In this work, we show how we
can boost the performance of deep open world recognition algorithms by means of
a new loss formulation enforcing a global to local clustering of class-specific
features. In particular, a first loss term, i.e. global clustering, forces the
network to map samples closer to the class centroid they belong to while the
second one, local clustering, shapes the representation space in such a way
that samples of the same class get closer in the representation space while
pushing away neighbours belonging to other classes. Moreover, we propose a
strategy to learn class-specific rejection thresholds, instead of heuristically
estimating a single global threshold, as in previous works. Experiments on
RGB-D Object and Core50 datasets show the effectiveness of our approach.
Related papers
- LoDisc: Learning Global-Local Discriminative Features for
Self-Supervised Fine-Grained Visual Recognition [18.442966979622717]
We present to incorporate the subtle local fine-grained feature learning into global self-supervised contrastive learning.
A novel pretext task called Local Discrimination (LoDisc) is proposed to explicitly supervise self-supervised model's focus towards local pivotal regions.
We show that Local Discrimination pretext task can effectively enhance fine-grained clues in important local regions, and the global-local framework further refines the fine-grained feature representations of images.
arXiv Detail & Related papers (2024-03-06T21:36:38Z) - Generalized Label-Efficient 3D Scene Parsing via Hierarchical Feature
Aligned Pre-Training and Region-Aware Fine-tuning [55.517000360348725]
This work presents a framework for dealing with 3D scene understanding when the labeled scenes are quite limited.
To extract knowledge for novel categories from the pre-trained vision-language models, we propose a hierarchical feature-aligned pre-training and knowledge distillation strategy.
Experiments with both indoor and outdoor scenes demonstrated the effectiveness of our approach in both data-efficient learning and open-world few-shot learning.
arXiv Detail & Related papers (2023-12-01T15:47:04Z) - Learning to Discover and Detect Objects [43.52208526783969]
We tackle the problem of novel class discovery, detection, and localization (NCDL)
In this setting, we assume a source dataset with labels for objects of commonly observed classes.
By training our detection network with this objective in an end-to-end manner, it learns to classify all region proposals for a large variety of classes.
arXiv Detail & Related papers (2022-10-19T17:59:55Z) - Open Long-Tailed Recognition in a Dynamic World [82.91025831618545]
Real world data often exhibits a long-tailed and open-ended (with unseen classes) distribution.
A practical recognition system must balance between majority (head) and minority (tail) classes, generalize across the distribution, and acknowledge novelty upon the instances of unseen classes (open classes)
We define Open Long-Tailed Recognition++ as learning from such naturally distributed data and optimizing for the classification accuracy over a balanced test set.
arXiv Detail & Related papers (2022-08-17T15:22:20Z) - Deep face recognition with clustering based domain adaptation [57.29464116557734]
We propose a new clustering-based domain adaptation method designed for face recognition task in which the source and target domain do not share any classes.
Our method effectively learns the discriminative target feature by aligning the feature domain globally, and, at the meantime, distinguishing the target clusters locally.
arXiv Detail & Related papers (2022-05-27T12:29:11Z) - Contrastive Learning for Cross-Domain Open World Recognition [17.660958043781154]
The ability to evolve is fundamental for any valuable autonomous agent whose knowledge cannot remain limited to that injected by the manufacturer.
We show how it learns a feature space perfectly suitable to incrementally include new classes and is able to capture knowledge which generalizes across a variety of visual domains.
Our method is endowed with a tailored effective stopping criterion for each learning episode and exploits a novel self-paced thresholding strategy.
arXiv Detail & Related papers (2022-03-17T11:23:53Z) - Bayesian Embeddings for Few-Shot Open World Recognition [60.39866770427436]
We extend embedding-based few-shot learning algorithms to the open-world recognition setting.
We benchmark our framework on open-world extensions of the common MiniImageNet and TieredImageNet few-shot learning datasets.
arXiv Detail & Related papers (2021-07-29T00:38:47Z) - CLASTER: Clustering with Reinforcement Learning for Zero-Shot Action
Recognition [52.66360172784038]
We propose a clustering-based model, which considers all training samples at once, instead of optimizing for each instance individually.
We call the proposed method CLASTER and observe that it consistently improves over the state-of-the-art in all standard datasets.
arXiv Detail & Related papers (2021-01-18T12:46:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.