The Met Dataset: Instance-level Recognition for Artworks
- URL: http://arxiv.org/abs/2202.01747v1
- Date: Thu, 3 Feb 2022 18:13:30 GMT
- Title: The Met Dataset: Instance-level Recognition for Artworks
- Authors: Nikolaos-Antonios Ypsilantis, Noa Garcia, Guangxing Han, Sarah
Ibrahimi, Nanne Van Noord, Giorgos Tolias
- Abstract summary: This work introduces a dataset for large-scale instance-level recognition in the domain of artworks.
We rely on the open access collection of The Met museum to form a large training set of about 224k classes.
- Score: 19.43143591288768
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work introduces a dataset for large-scale instance-level recognition in
the domain of artworks. The proposed benchmark exhibits a number of different
challenges such as large inter-class similarity, long tail distribution, and
many classes. We rely on the open access collection of The Met museum to form a
large training set of about 224k classes, where each class corresponds to a
museum exhibit with photos taken under studio conditions. Testing is primarily
performed on photos taken by museum guests depicting exhibits, which introduces
a distribution shift between training and testing. Testing is additionally
performed on a set of images not related to Met exhibits making the task
resemble an out-of-distribution detection problem. The proposed benchmark
follows the paradigm of other recent datasets for instance-level recognition on
different domains to encourage research on domain independent approaches. A
number of suitable approaches are evaluated to offer a testbed for future
comparisons. Self-supervised and supervised contrastive learning are
effectively combined to train the backbone which is used for non-parametric
classification that is shown as a promising direction. Dataset webpage:
http://cmp.felk.cvut.cz/met/
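As a rough illustration of the non-parametric classification mentioned in the abstract, the sketch below performs a nearest-neighbour lookup over descriptors of the exhibit photos and rejects low-similarity queries as out-of-distribution (non-Met) images. It is a minimal sketch, not the paper's implementation: the `embed` function (standing in for the contrastively trained backbone) and the `ood_threshold` value are hypothetical placeholders.

```python
import numpy as np

def build_index(train_images, train_labels, embed):
    """Embed all training (exhibit) photos; `embed` is a hypothetical function
    wrapping a backbone trained with self-supervised + supervised contrastive
    learning, returning an L2-normalised descriptor per image."""
    descriptors = np.stack([embed(img) for img in train_images])  # (N, D)
    return descriptors, np.asarray(train_labels)

def classify(query_image, descriptors, labels, embed, ood_threshold=0.5):
    """1-NN classification with a similarity threshold for out-of-distribution queries.
    The threshold value is an assumption for illustration, not taken from the paper."""
    q = embed(query_image)            # (D,) normalised query descriptor
    sims = descriptors @ q            # cosine similarities to all training descriptors
    best = int(np.argmax(sims))
    if sims[best] < ood_threshold:
        return None                   # treated as a distractor / non-Met image
    return labels[best]
```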
Related papers
- Masked Image Modeling: A Survey [73.21154550957898]
Masked image modeling emerged as a powerful self-supervised learning technique in computer vision.
We construct a taxonomy and review the most prominent papers in recent years.
We aggregate the performance results of various masked image modeling methods on the most popular datasets.
arXiv Detail & Related papers (2024-08-13T07:27:02Z)
- Multi-level Relation Learning for Cross-domain Few-shot Hyperspectral Image Classification [8.78907921615878]
Cross-domain few-shot hyperspectral image classification focuses on learning prior knowledge from a large number of labeled samples from source domains.
This paper proposes to learn sample relations at different levels and incorporate them into the model learning process.
arXiv Detail & Related papers (2023-11-02T13:06:03Z)
- A Semi-Paired Approach For Label-to-Image Translation [6.888253564585197]
We introduce the first semi-supervised (semi-paired) framework for label-to-image translation.
In the semi-paired setting, the model has access to a small set of paired data and a larger set of unpaired images and labels.
We propose a training algorithm for this shared network, and we present a rare classes sampling algorithm to focus on under-represented classes.
arXiv Detail & Related papers (2023-06-23T16:13:43Z)
- Neural Congealing: Aligning Images to a Joint Semantic Atlas [14.348512536556413]
We present a zero-shot self-supervised framework for aligning semantically-common content across a set of images.
Our approach harnesses the power of pre-trained DINO-ViT features to learn a joint semantic atlas and dense mappings from the atlas to each input image.
We show that our method performs favorably compared to a state-of-the-art method that requires extensive training on large-scale datasets.
arXiv Detail & Related papers (2023-02-08T09:26:22Z)
- LEAD: Self-Supervised Landmark Estimation by Aligning Distributions of Feature Similarity [49.84167231111667]
Existing works in self-supervised landmark detection are based on learning dense (pixel-level) feature representations from an image.
We introduce an approach to enhance the learning of dense equivariant representations in a self-supervised fashion.
We show that having such a prior in the feature extractor helps in landmark detection, even under a drastically limited number of annotations.
arXiv Detail & Related papers (2022-04-06T17:48:18Z)
- Learning Contrastive Representation for Semantic Correspondence [150.29135856909477]
We propose a multi-level contrastive learning approach for semantic matching.
We show that image-level contrastive learning is a key component to encourage the convolutional features to find correspondence between similar objects.
arXiv Detail & Related papers (2021-09-22T18:34:14Z)
- Free Lunch for Co-Saliency Detection: Context Adjustment [14.688461235328306]
We propose a "cost-free" group-cut-paste (GCP) procedure to leverage images from off-the-shelf saliency detection datasets and synthesize new samples.
We collect a novel dataset called Context Adjustment Training. The two variants of our dataset, i.e., CAT and CAT+, consist of 16,750 and 33,500 images, respectively.
arXiv Detail & Related papers (2021-08-04T14:51:37Z)
- Rectifying the Shortcut Learning of Background: Shared Object Concentration for Few-Shot Image Recognition [101.59989523028264]
Few-shot image classification aims to utilize pretrained knowledge learned from a large-scale dataset to tackle a series of downstream classification tasks.
We propose COSOC, a novel few-shot learning framework, to automatically figure out foreground objects at both the pretraining and evaluation stages.
arXiv Detail & Related papers (2021-07-16T07:46:41Z)
- Instance Localization for Self-supervised Detection Pretraining [68.24102560821623]
We propose a new self-supervised pretext task, called instance localization.
We show that integration of bounding boxes into pretraining promotes better task alignment and architecture alignment for transfer learning.
Experimental results demonstrate that our approach yields state-of-the-art transfer learning results for object detection.
arXiv Detail & Related papers (2021-02-16T17:58:57Z)
- Distilling Localization for Self-Supervised Representation Learning [82.79808902674282]
Contrastive learning has revolutionized unsupervised representation learning.
Current contrastive models are ineffective at localizing the foreground object.
We propose a data-driven approach for learning invariance to backgrounds.
arXiv Detail & Related papers (2020-04-14T16:29:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.