Unicom: Universal and Compact Representation Learning for Image
Retrieval
- URL: http://arxiv.org/abs/2304.05884v1
- Date: Wed, 12 Apr 2023 14:25:52 GMT
- Title: Unicom: Universal and Compact Representation Learning for Image
Retrieval
- Authors: Xiang An, Jiankang Deng, Kaicheng Yang, Jiawei Li, Ziyong Feng, Jia
Guo, Jing Yang, Tongliang Liu
- Abstract summary: We cluster the large-scale LAION400M into one million pseudo classes based on the joint textual and visual features extracted by the CLIP model.
To alleviate such conflict, we randomly select partial inter-class prototypes to construct the margin-based softmax loss.
Our method significantly outperforms state-of-the-art unsupervised and supervised image retrieval approaches on multiple benchmarks.
- Score: 65.96296089560421
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern image retrieval methods typically rely on fine-tuning pre-trained
encoders to extract image-level descriptors. However, the most widely used
models are pre-trained on ImageNet-1K with limited classes. The pre-trained
feature representation is therefore not universal enough to generalize well to
the diverse open-world classes. In this paper, we first cluster the large-scale
LAION400M into one million pseudo classes based on the joint textual and visual
features extracted by the CLIP model. Due to the confusion of label
granularity, the automatically clustered dataset inevitably contains heavy
inter-class conflict. To alleviate such conflict, we randomly select partial
inter-class prototypes to construct the margin-based softmax loss. To further
enhance the low-dimensional feature representation, we randomly select partial
feature dimensions when calculating the similarities between embeddings and
class-wise prototypes. The dual random partial selections operate on the class
dimension and the feature dimension of the prototype matrix, making the
classification conflict-robust and the feature embedding compact. Our
method significantly outperforms state-of-the-art unsupervised and supervised
image retrieval approaches on multiple benchmarks. The code and pre-trained
models are released to facilitate future research at
https://github.com/deepglint/unicom.
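The dual random partial selection described in the abstract can be made concrete with a short sketch. The PyTorch-style code below is a minimal illustration under stated assumptions, not the authors' released implementation: the function name partial_margin_softmax_loss, the additive (CosFace-style) margin, and the class_ratio/dim_ratio hyperparameters are chosen here for exposition; see the repository above for the actual code.

```python
import torch
import torch.nn.functional as F

def partial_margin_softmax_loss(
    embeddings,       # (B, D) image embeddings from the encoder
    prototypes,       # (C, D) learnable class-prototype matrix
    labels,           # (B,) pseudo-class ids in [0, C)
    margin=0.3,       # additive margin on the target logit (assumed form)
    scale=32.0,       # logit scale (assumed value)
    class_ratio=0.1,  # fraction of negative prototypes sampled per step
    dim_ratio=0.5,    # fraction of feature dimensions sampled per step
):
    B, D = embeddings.shape
    C = prototypes.shape[0]
    device = embeddings.device

    # Random partial selection over the CLASS dimension: always keep the
    # positive prototypes and subsample the negatives, so a conflicting
    # (near-duplicate) pseudo class only occasionally appears as a negative.
    pos = torch.unique(labels)
    neg = torch.randperm(C, device=device)[: int(class_ratio * C)]
    keep_cls = torch.unique(torch.cat([pos, neg]))
    remap = torch.full((C,), -1, dtype=torch.long, device=device)
    remap[keep_cls] = torch.arange(keep_cls.numel(), device=device)
    sub_labels = remap[labels]

    # Random partial selection over the FEATURE dimension: compute the
    # similarities on a random subset of dimensions so that every
    # low-dimensional slice of the embedding stays discriminative.
    keep_dim = torch.randperm(D, device=device)[: int(dim_ratio * D)]
    e = F.normalize(embeddings[:, keep_dim], dim=1)
    p = F.normalize(prototypes[keep_cls][:, keep_dim], dim=1)

    # Margin-based softmax on the sub-sampled cosine logits.
    logits = e @ p.t()                                  # (B, C')
    logits[torch.arange(B, device=device), sub_labels] -= margin
    return F.cross_entropy(scale * logits, sub_labels)
```

Because the positive prototypes are always retained, every sample keeps a valid target after subsampling; only the negative set shrinks, which is what makes the loss tolerant of inter-class conflicts in the automatically clustered pseudo labels.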
Related papers
- FreeSeg-Diff: Training-Free Open-Vocabulary Segmentation with Diffusion Models [56.71672127740099]
We focus on the task of image segmentation, which is traditionally solved by training models on closed-vocabulary datasets.
We leverage different and relatively small-sized, open-source foundation models for zero-shot open-vocabulary segmentation.
Our approach (dubbed FreeSeg-Diff), which does not rely on any training, outperforms many training-based approaches on both Pascal VOC and COCO datasets.
arXiv Detail & Related papers (2024-03-29T10:38:25Z) - PDiscoNet: Semantically consistent part discovery for fine-grained
recognition [62.12602920807109]
We propose PDiscoNet to discover object parts using only image-level class labels, along with priors that constrain the discovered parts.
Our results on CUB, CelebA, and PartImageNet show that the proposed method provides substantially better part discovery performance than previous methods.
arXiv Detail & Related papers (2023-09-06T17:19:29Z) - Rethinking Semantic Segmentation: A Prototype View [126.59244185849838]
We present a nonparametric semantic segmentation model based on non-learnable prototypes.
Our framework yields compelling results over several datasets.
We expect this work will provoke a rethink of the current de facto semantic segmentation model design.
arXiv Detail & Related papers (2022-03-28T21:15:32Z) - CAR: Class-aware Regularizations for Semantic Segmentation [20.947897583427192]
We propose a universal Class-Aware Regularization (CAR) approach to optimize the intra-class variance and inter-class distance during feature learning.
Our method can be easily applied to most existing segmentation models during training, including OCR and CPNet.
arXiv Detail & Related papers (2022-03-14T15:02:48Z) - Dual Prototypical Contrastive Learning for Few-shot Semantic
Segmentation [55.339405417090084]
We propose a dual prototypical contrastive learning approach tailored to the few-shot semantic segmentation (FSS) task.
The main idea is to make the prototypes more discriminative by increasing inter-class distance while reducing intra-class distance in the prototype feature space.
We demonstrate that the proposed dual contrastive learning approach outperforms state-of-the-art FSS methods on PASCAL-5i and COCO-20i datasets.
arXiv Detail & Related papers (2021-11-09T08:14:50Z) - Multi-dataset Pretraining: A Unified Model for Semantic Segmentation [97.61605021985062]
We propose a unified framework, termed Multi-Dataset Pretraining, to take full advantage of the fragmented annotations of different datasets.
This is achieved by first pretraining the network via the proposed pixel-to-prototype contrastive loss over multiple datasets.
In order to better model the relationship among images and classes from different datasets, we extend the pixel level embeddings via cross dataset mixing.
arXiv Detail & Related papers (2021-06-08T06:13:11Z) - Improving Few-shot Learning with Weakly-supervised Object Localization [24.3569501375842]
We propose a novel framework that generates class representations by extracting features from class-relevant regions of the images.
Our method outperforms the baseline few-shot model on the miniImageNet and tieredImageNet benchmarks.
arXiv Detail & Related papers (2021-05-25T07:39:32Z) - One-Shot Image Classification by Learning to Restore Prototypes [11.448423413463916]
One-shot image classification aims to train image classifiers over the dataset with only one image per category.
For one-shot learning, existing metric learning approaches suffer from poor performance because the single training image may not be representative of the class.
We propose a simple yet effective regression model, denoted by RestoreNet, which learns a class transformation on the image feature to move the image closer to the class center in the feature space.
arXiv Detail & Related papers (2020-05-04T02:11:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.