Improving Deep Metric Learning by Divide and Conquer
- URL: http://arxiv.org/abs/2109.04003v1
- Date: Thu, 9 Sep 2021 02:57:34 GMT
- Title: Improving Deep Metric Learning by Divide and Conquer
- Authors: Artsiom Sanakoyeu, Pingchuan Ma, Vadim Tschernezki, Björn Ommer
- Abstract summary: Deep metric learning (DML) is a cornerstone of many computer vision applications.
It aims at learning a mapping from the input domain to an embedding space, where semantically similar objects are located nearby and dissimilar objects far from one another.
We propose to build a more expressive representation by splitting the embedding space and the data hierarchically into smaller sub-parts.
- Score: 11.380358587116683
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Deep metric learning (DML) is a cornerstone of many computer vision
applications. It aims at learning a mapping from the input domain to an
embedding space, where semantically similar objects are located nearby and
dissimilar objects far from one another. The target similarity on the training
data is defined by the user in the form of ground-truth class labels. However, while the
embedding space learns to mimic the user-provided similarity on the training
data, it should also generalize to novel categories not seen during training.
Besides the user-provided ground-truth training labels, many additional visual
factors (such as viewpoint changes or shape peculiarities) exist and imply
different notions of similarity between objects, affecting generalization to
images unseen during training. However, existing approaches usually learn a
single embedding space directly on all available training data; such a space
struggles to encode all the different types of relationships and does not
generalize well. We propose to build a more expressive representation by
jointly splitting the embedding space and the data hierarchically into smaller
sub-parts. We successively focus on smaller subsets of the training data,
reducing their variance and learning a different embedding subspace for each
data subset. Moreover, the subspaces are learned jointly to cover not only the
intricacies but also the breadth of the data. Only then do we build the
final embedding from the subspaces in the conquering stage. The proposed
algorithm acts as a transparent wrapper that can be placed around arbitrary
existing DML methods. Our approach significantly improves upon the
state-of-the-art on image retrieval, clustering, and re-identification tasks
evaluated using CUB200-2011, CARS196, Stanford Online Products, In-shop
Clothes, and PKU VehicleID datasets.
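
To make the divide-and-conquer recipe above concrete, here is a minimal PyTorch sketch of how such a wrapper could look. This is an illustrative assumption, not the authors' reference code: the names (DivideAndConquerEmbedder, assign_clusters, triplet_loss) and the choice of k-means for the splitting step are hypothetical, chosen to mirror the abstract's description of subspace learners trained on data clusters and concatenated in the conquering stage.

```python
# Hypothetical sketch of the divide-and-conquer idea; interfaces are
# assumptions, not the paper's reference implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F
from sklearn.cluster import KMeans


class DivideAndConquerEmbedder(nn.Module):
    """Splits a d-dimensional embedding into k subspace 'learners'."""

    def __init__(self, backbone: nn.Module, feat_dim: int, emb_dim: int, k: int):
        super().__init__()
        assert emb_dim % k == 0, "embedding dim must split evenly across learners"
        self.backbone = backbone  # any trunk, e.g. a ResNet without its final fc layer
        self.k = k
        self.sub_dim = emb_dim // k
        # one linear head per subspace; concatenated, they form the full embedding
        self.heads = nn.ModuleList(
            [nn.Linear(feat_dim, self.sub_dim) for _ in range(k)]
        )

    def forward(self, x, learner=None):
        feats = self.backbone(x)
        if learner is not None:
            # "divide" stage: embed with a single subspace head, trained
            # only on the data cluster assigned to that learner
            return F.normalize(self.heads[learner](feats), dim=1)
        # "conquer" stage: concatenate all subspaces into the final embedding
        return F.normalize(torch.cat([h(feats) for h in self.heads], dim=1), dim=1)


def assign_clusters(embeddings: torch.Tensor, k: int) -> torch.Tensor:
    """Partition the training set in embedding space; one cluster per learner."""
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(embeddings.cpu().numpy())
    return torch.as_tensor(labels)


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard DML objective: pull same-class pairs together, push others apart."""
    d_ap = (anchor - positive).pow(2).sum(1)
    d_an = (anchor - negative).pow(2).sum(1)
    return F.relu(d_ap - d_an + margin).mean()
```

In a training loop built on this sketch, each mini-batch would be drawn from a single cluster and fed through the matching head with any standard DML loss, so only that subspace (plus the shared backbone) receives gradients; cluster assignments would be re-computed periodically as the embedding improves. At test time, the concatenated, re-normalized embedding serves as the final representation, which is what lets the scheme act as a wrapper around an existing DML method without changing its loss.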
Related papers
- From Isolated Islands to Pangea: Unifying Semantic Space for Human Action Understanding [50.412121156940294]
Action understanding can be formulated as the mapping from the physical space to the semantic space.
We propose a novel model mapping from the physical space to the semantic space to make full use of Pangea.
arXiv Detail & Related papers (2023-04-02T15:04:43Z)
- DC-Former: Diverse and Compact Transformer for Person Re-Identification [38.12558570608426]
In the person re-identification (re-ID) task, it is still challenging to learn discriminative representations with deep learning due to limited data.
We propose a Diverse and Compact Transformer (DC-Former) that can achieve a similar effect by splitting the embedding space into multiple subspaces.
arXiv Detail & Related papers (2023-02-28T06:03:42Z)
- Self-Supervised Representation Learning With MUlti-Segmental Informational Coding (MUSIC) [6.693379403133435]
Self-supervised representation learning maps high-dimensional data into a meaningful embedding space.
We propose MUlti-Segmental Informational Coding (MUSIC) for self-supervised representation learning.
arXiv Detail & Related papers (2022-06-13T20:37:48Z)
- Revisiting Contrastive Methods for Unsupervised Learning of Visual Representations [78.12377360145078]
Contrastive self-supervised learning has outperformed supervised pretraining on many downstream tasks like segmentation and object detection.
In this paper, we first study how biases in the dataset affect existing methods.
We show that current contrastive approaches work surprisingly well across: (i) object- versus scene-centric, (ii) uniform versus long-tailed, and (iii) general versus domain-specific datasets.
arXiv Detail & Related papers (2021-06-10T17:59:13Z)
- Exploiting Shared Representations for Personalized Federated Learning [54.65133770989836]
We propose a novel federated learning framework and algorithm for learning a shared data representation across clients and unique local heads for each client.
Our algorithm harnesses the distributed computational power across clients to perform many local updates with respect to the low-dimensional local parameters for every update of the representation.
This result is of interest beyond federated learning, applying to a broad class of problems in which we aim to learn a shared low-dimensional representation among data distributions.
arXiv Detail & Related papers (2021-02-14T05:36:25Z)
- Flexible deep transfer learning by separate feature embeddings and manifold alignment [0.0]
Object recognition is a key enabler across industry and defense.
Unfortunately, algorithms trained on existing labeled datasets do not directly generalize to new data because the data distributions do not match.
We propose a novel deep learning framework that overcomes this limitation by learning separate feature extractions for each domain.
arXiv Detail & Related papers (2020-12-22T19:24:44Z)
- DomainMix: Learning Generalizable Person Re-Identification Without Human Annotations [89.78473564527688]
This paper shows how to use a labeled synthetic dataset and an unlabeled real-world dataset to train a universal model.
In this way, human annotations are no longer required, and the approach is scalable to large and diverse real-world datasets.
Experimental results show that the proposed annotation-free method is more or less comparable to the counterpart trained with full human annotations.
arXiv Detail & Related papers (2020-11-24T08:15:53Z)
- i-Mix: A Domain-Agnostic Strategy for Contrastive Representation Learning [117.63815437385321]
We propose i-Mix, a simple yet effective domain-agnostic regularization strategy for improving contrastive representation learning.
In experiments, we demonstrate that i-Mix consistently improves the quality of learned representations across domains.
arXiv Detail & Related papers (2020-10-17T23:32:26Z)
- Towards Universal Representation Learning for Deep Face Recognition [106.21744671876704]
We propose a universal representation learning framework that can deal with larger variation unseen in the given training data without leveraging target domain knowledge.
Experiments show that our method achieves top performance on general face recognition datasets such as LFW and MegaFace.
arXiv Detail & Related papers (2020-02-26T23:29:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.