Evaluating Contrastive Models for Instance-based Image Retrieval
- URL: http://arxiv.org/abs/2104.14939v1
- Date: Fri, 30 Apr 2021 12:05:23 GMT
- Title: Evaluating Contrastive Models for Instance-based Image Retrieval
- Authors: Tarun Krishna, Kevin McGuinness and Noel O'Connor
- Abstract summary: We evaluate contrastive models for the task of image retrieval.
We find that models trained using contrastive methods perform on par with (and outperform) a pre-trained baseline trained on the ImageNet labels.
- Score: 6.393147386784114
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we evaluate contrastive models for the task of image retrieval.
We hypothesise that models that learn to encode semantic similarity among
instances via discriminative learning should perform well on the task of image
retrieval, where relevancy is defined in terms of instances of the same object.
Through our extensive evaluation, we find that representations from models
trained using contrastive methods perform on par with (and outperform) a
pre-trained supervised baseline trained on the ImageNet labels in retrieval
tasks under various configurations. This is remarkable given that the
contrastive models require no explicit supervision. Thus, we conclude that
these models can be used to bootstrap base models to build more robust image
retrieval engines.
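The abstract does not spell out the evaluation protocol, so the following is only a minimal sketch of how instance-based retrieval is commonly scored, not the authors' pipeline. It assumes embeddings have already been extracted from some model, ranks database images by cosine similarity to each query, and reports mean average precision, where an image is relevant if it shares the query's instance ID. All function and parameter names (`cosine_similarity`, `average_precision`, `mean_average_precision`, `query_emb`, `db_ids`, and so on) are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def average_precision(ranked_relevance):
    """Average precision for one query, given relevance flags in ranked order."""
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(query_emb, query_ids, db_emb, db_ids):
    """Rank the database for each query by cosine similarity and average the APs.

    Assumption: a database image is relevant when its instance ID equals the query's.
    """
    aps = []
    for q_vec, q_id in zip(query_emb, query_ids):
        sims = [cosine_similarity(q_vec, d_vec) for d_vec in db_emb]
        order = sorted(range(len(db_emb)), key=lambda i: sims[i], reverse=True)
        aps.append(average_precision([db_ids[i] == q_id for i in order]))
    return sum(aps) / len(aps) if aps else 0.0
```

Under this setup, comparing a contrastive model with a supervised ImageNet baseline amounts to calling `mean_average_precision` once per model on the same queries and database, changing only the embeddings.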
Related papers
- Reinforcing Pre-trained Models Using Counterfactual Images [54.26310919385808]
This paper proposes a novel framework to reinforce classification models using language-guided generated counterfactual images.
We identify model weaknesses by testing the model using the counterfactual image dataset.
We employ the counterfactual images as an augmented dataset to fine-tune and reinforce the classification model.
arXiv Detail & Related papers (2024-06-19T08:07:14Z)
- Adapting Dual-encoder Vision-language Models for Paraphrased Retrieval [55.90407811819347]
We consider the task of paraphrased text-to-image retrieval where a model aims to return similar results given a pair of paraphrased queries.
We train a dual-encoder model starting from a language model pretrained on a large text corpus.
Compared to public dual-encoder models such as CLIP and OpenCLIP, the model trained with our best adaptation strategy achieves a significantly higher ranking similarity for paraphrased queries.
arXiv Detail & Related papers (2024-05-06T06:30:17Z)
- Measuring Style Similarity in Diffusion Models [118.22433042873136]
We present a framework for understanding and extracting style descriptors from images.
Our framework comprises a new dataset curated using the insight that style is a subjective property of an image.
We also propose a method to extract style descriptors that can be used to attribute the style of a generated image to the images used in the training dataset of a text-to-image model.
arXiv Detail & Related papers (2024-04-01T17:58:30Z)
- Image Similarity using An Ensemble of Context-Sensitive Models [2.9490616593440317]
We present a more intuitive approach to build and compare image similarity models based on labelled data.
We address the challenges of sparse sampling in the image space (R, A, B) and biases in the models trained with context-based data.
Our testing results show that the ensemble model constructed performs 5% better than the best individual context-sensitive models.
arXiv Detail & Related papers (2024-01-15T20:23:05Z)
- Has Your Pretrained Model Improved? A Multi-head Posterior Based Approach [25.927323251675386]
We leverage the meta-features associated with each entity as a source of worldly knowledge and employ entity representations from the models.
We propose using the consistency between these representations and the meta-features as a metric for evaluating pre-trained models.
Our method's effectiveness is demonstrated across various domains, including models with relational datasets, large language models and image models.
arXiv Detail & Related papers (2024-01-02T17:08:26Z)
- CorrEmbed: Evaluating Pre-trained Model Image Similarity Efficacy with a Novel Metric [6.904776368895614]
We evaluate the viability of the image embeddings from pre-trained computer vision models using a novel approach named CorrEmbed.
Our approach computes the correlation between distances in image embeddings and distances in human-generated tag vectors.
Our method also identifies deviations from this pattern, providing insights into how different models capture high-level image features.
arXiv Detail & Related papers (2023-08-30T16:23:07Z)
- Evaluating Data Attribution for Text-to-Image Models [62.844382063780365]
We evaluate attribution through "customization" methods, which tune an existing large-scale model toward a given exemplar object or style.
Our key insight is that this allows us to efficiently create synthetic images that are computationally influenced by the exemplar by construction.
By taking into account the inherent uncertainty of the problem, we can assign soft attribution scores over a set of training images.
arXiv Detail & Related papers (2023-06-15T17:59:51Z)
- Meta Internal Learning [88.68276505511922]
Internal learning for single-image generation is a framework in which a generator is trained to produce novel images based on a single image.
We propose a meta-learning approach that enables training over a collection of images, in order to model the internal statistics of the sample image more effectively.
Our results show that the models obtained are as suitable as single-image GANs for many common image applications.
arXiv Detail & Related papers (2021-10-06T16:27:38Z)
- Learning Contrastive Representation for Semantic Correspondence [150.29135856909477]
We propose a multi-level contrastive learning approach for semantic matching.
We show that image-level contrastive learning is a key component to encourage the convolutional features to find correspondence between similar objects.
arXiv Detail & Related papers (2021-09-22T18:34:14Z)
- An application of a pseudo-parabolic modeling to texture image recognition [0.0]
We present a novel methodology for texture image recognition using partial differential equation modeling.
We employ the pseudo-parabolic Buckley-Leverett equation to provide dynamics to the digital image representation and collect local descriptors from those images evolving in time.
arXiv Detail & Related papers (2021-02-09T18:08:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.