Introspective Deep Metric Learning
- URL: http://arxiv.org/abs/2309.09982v1
- Date: Mon, 11 Sep 2023 16:21:13 GMT
- Title: Introspective Deep Metric Learning
- Authors: Chengkun Wang, Wenzhao Zheng, Zheng Zhu, Jie Zhou, Jiwen Lu
- Abstract summary: We propose an introspective deep metric learning framework for uncertainty-aware comparisons of images.
The proposed IDML framework improves the performance of deep metric learning through uncertainty modeling.
- Score: 91.47907685364036
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes an introspective deep metric learning (IDML) framework
for uncertainty-aware comparisons of images. Conventional deep metric learning
methods focus on learning a discriminative embedding to describe the semantic
features of images, ignoring the uncertainty present in each image that
results from noise or semantic ambiguity. Training without awareness of these
uncertainties causes the model to overfit the annotated labels during training
and produce unsatisfactory judgments during inference. Motivated by this, we
argue that a good similarity model should consider the semantic discrepancies
with awareness of the uncertainty to better deal with ambiguous images for more
robust training. To achieve this, we propose to represent an image using not
only a semantic embedding but also an accompanying uncertainty embedding, which
describe the semantic characteristics and the ambiguity of an image, respectively.
We further propose an introspective similarity metric to make similarity
judgments between images considering both their semantic differences and
ambiguities. The gradient analysis of the proposed metric shows that it enables
the model to learn at an adaptive and slower pace to deal with the uncertainty
during training. The proposed IDML framework improves the performance of deep
metric learning through uncertainty modeling and attains state-of-the-art
results on the widely used CUB-200-2011, Cars196, and Stanford Online Products
datasets for image retrieval and clustering. We further provide an in-depth
analysis of our framework to demonstrate the effectiveness and reliability of
IDML. Code: https://github.com/wzzheng/IDML.
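The abstract describes representing each image with two vectors, a semantic embedding and an uncertainty embedding, and combining them in an "introspective similarity metric" so that comparisons between ambiguous images are judged more cautiously. The sketch below shows one plausible form of such a metric, in which the uncertainty embeddings inflate the pairwise distance; this is an illustrative assumption, not the paper's exact formulation (see the linked code for that), and all function and variable names here are hypothetical.

```python
import numpy as np

def introspective_distance(sem_a, unc_a, sem_b, unc_b):
    """Hypothetical introspective distance between two images.

    Each image is represented by a semantic embedding (sem_*) and an
    uncertainty embedding (unc_*). The distance is the squared Euclidean
    distance between the semantic embeddings, inflated by the squared
    norms of both uncertainty embeddings, so ambiguous images yield
    larger (softer) distances.
    """
    semantic_term = np.sum((sem_a - sem_b) ** 2)
    uncertainty_term = np.sum(unc_a ** 2) + np.sum(unc_b ** 2)
    return semantic_term + uncertainty_term

# Illustrative comparison: the same semantic gap is judged as a larger
# distance when the images carry non-zero uncertainty.
sem_a, sem_b = np.zeros(4), np.ones(4)
no_unc = np.zeros(4)
some_unc = 0.5 * np.ones(4)
d_certain = introspective_distance(sem_a, no_unc, sem_b, no_unc)      # 4.0
d_ambiguous = introspective_distance(sem_a, some_unc, sem_b, some_unc)  # 6.0
```

Under a form like this, a margin-based loss applied to the inflated distance produces smaller gradients for highly uncertain pairs, which is consistent with the abstract's claim that the metric lets the model learn at an adaptive, slower pace on uncertain samples during training.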
Related papers
- Annotation Cost-Efficient Active Learning for Deep Metric Learning Driven Remote Sensing Image Retrieval [3.2109665109975696]
ANNEAL aims to create a small but informative training set made up of similar and dissimilar image pairs.
The informativeness of image pairs is evaluated by combining uncertainty and diversity criteria.
This way of annotating images significantly reduces the annotation cost compared to annotating images with land-use land-cover class labels.
arXiv Detail & Related papers (2024-06-14T15:08:04Z)
- Hyp-UML: Hyperbolic Image Retrieval with Uncertainty-aware Metric Learning [8.012146883983227]
Metric learning plays a critical role in training image retrieval and classification.
Hyperbolic embedding can be more effective in representing the hierarchical data structure.
We propose two types of uncertainty-aware metric learning, for the popular Contrastive learning and conventional margin-based metric learning.
arXiv Detail & Related papers (2023-10-12T15:00:06Z)
- Hierarchical Uncertainty Estimation for Medical Image Segmentation Networks [1.9564356751775307]
Uncertainty exists in both images (noise) and manual annotations (human errors and bias) used for model training.
We propose a simple yet effective method for estimating uncertainties at multiple levels.
We demonstrate that a deep learning segmentation network such as U-net can achieve high segmentation performance.
arXiv Detail & Related papers (2023-08-16T16:09:23Z)
- Introspective Deep Metric Learning for Image Retrieval [80.29866561553483]
We argue that a good similarity model should consider the semantic discrepancies with caution to better deal with ambiguous images for more robust training.
We propose to represent an image using not only a semantic embedding but also an accompanying uncertainty embedding, which describe the semantic characteristics and the ambiguity of an image, respectively.
The proposed IDML framework improves the performance of deep metric learning through uncertainty modeling and attains state-of-the-art results on the widely used CUB-200-2011, Cars196, and Stanford Online Products datasets.
arXiv Detail & Related papers (2022-05-09T17:51:44Z)
- Dense Contrastive Visual-Linguistic Pretraining [53.61233531733243]
Several multimodal representation learning approaches have been proposed that jointly represent image and text.
These approaches achieve superior performance by capturing high-level semantic information from large-scale multimodal pretraining.
We propose unbiased Dense Contrastive Visual-Linguistic Pretraining to replace the region regression and classification with cross-modality region contrastive learning.
arXiv Detail & Related papers (2021-09-24T07:20:13Z)
- Learning Contrastive Representation for Semantic Correspondence [150.29135856909477]
We propose a multi-level contrastive learning approach for semantic matching.
We show that image-level contrastive learning is a key component to encourage the convolutional features to find correspondence between similar objects.
arXiv Detail & Related papers (2021-09-22T18:34:14Z)
- DeepSim: Semantic similarity metrics for learned image registration [6.789370732159177]
We propose a semantic similarity metric for image registration.
Our approach learns dataset-specific features that drive the optimization of a learning-based registration model.
arXiv Detail & Related papers (2020-11-11T12:35:07Z)
- Uncertainty-Aware Few-Shot Image Classification [118.72423376789062]
Few-shot image classification learns to recognize new categories from limited labelled data.
We propose Uncertainty-Aware Few-Shot framework for image classification.
arXiv Detail & Related papers (2020-10-09T12:26:27Z)
- Distilling Localization for Self-Supervised Representation Learning [82.79808902674282]
Contrastive learning has revolutionized unsupervised representation learning.
Current contrastive models are ineffective at localizing the foreground object.
We propose a data-driven approach for learning invariance to backgrounds.
arXiv Detail & Related papers (2020-04-14T16:29:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information listed here and is not responsible for any consequences of its use.