Google Landmarks Dataset v2 -- A Large-Scale Benchmark for
Instance-Level Recognition and Retrieval
- URL: http://arxiv.org/abs/2004.01804v2
- Date: Mon, 2 Nov 2020 18:30:45 GMT
- Title: Google Landmarks Dataset v2 -- A Large-Scale Benchmark for
Instance-Level Recognition and Retrieval
- Authors: Tobias Weyand, Andre Araujo, Bingyi Cao, Jack Sim
- Abstract summary: We introduce the Google Landmarks dataset v2 (GLDv2), a new benchmark for large-scale, fine-grained instance recognition and image retrieval.
GLDv2 is the largest such dataset to date by a large margin, including over 5M images and 200k distinct instance labels.
The dataset is sourced from Wikimedia Commons, the world's largest crowdsourced collection of landmark photos.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: While image retrieval and instance recognition techniques are progressing
rapidly, there is a need for challenging datasets to accurately measure their
performance -- while posing novel challenges that are relevant for practical
applications. We introduce the Google Landmarks Dataset v2 (GLDv2), a new
benchmark for large-scale, fine-grained instance recognition and image
retrieval in the domain of human-made and natural landmarks. GLDv2 is the
largest such dataset to date by a large margin, including over 5M images and
200k distinct instance labels. Its test set consists of 118k images with ground
truth annotations for both the retrieval and recognition tasks. The ground
truth construction involved over 800 hours of human annotator work. Our new
dataset has several challenging properties inspired by real world applications
that previous datasets did not consider: An extremely long-tailed class
distribution, a large fraction of out-of-domain test photos and large
intra-class variability. The dataset is sourced from Wikimedia Commons, the
world's largest crowdsourced collection of landmark photos. We provide baseline
results for both recognition and retrieval tasks based on state-of-the-art
methods as well as competitive results from a public challenge. We further
demonstrate the suitability of the dataset for transfer learning by showing
that image embeddings trained on it achieve competitive retrieval performance
on independent datasets. The dataset images, ground-truth and metric scoring
code are available at https://github.com/cvdfoundation/google-landmark.
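The repository above distributes the images alongside metadata files. As a minimal sketch of how the long-tailed class distribution could be inspected (assuming the training metadata is a CSV with `id`, `url`, and `landmark_id` columns, as in the public release; the tiny inline sample below is synthetic):

```python
import csv
import io
from collections import Counter

def class_distribution(csv_text: str) -> Counter:
    """Count images per landmark_id in a GLDv2-style metadata CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["landmark_id"] for row in reader)

# Tiny synthetic sample in the same shape as the train metadata
# (the real file has millions of rows).
sample = """id,url,landmark_id
a1,http://example.com/a1.jpg,100
a2,http://example.com/a2.jpg,100
a3,http://example.com/a3.jpg,100
b1,http://example.com/b1.jpg,200
c1,http://example.com/c1.jpg,300
"""

counts = class_distribution(sample)
# Sorted by frequency; in a long-tailed dataset most classes sit at the tail
# with only one or a few images each.
print(counts.most_common())
```

On the real metadata this kind of per-class count makes the extreme long tail described in the abstract directly visible.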
Related papers
- FORB: A Flat Object Retrieval Benchmark for Universal Image Embedding (2023-09-28)
  We introduce a new dataset for benchmarking visual search methods on flat images with diverse patterns. Our flat object retrieval benchmark (FORB) supplements the commonly adopted 3D object domain and serves as a testbed for assessing image embedding quality on out-of-distribution domains.
- Are Local Features All You Need for Cross-Domain Visual Place Recognition? (2023-04-12)
  Visual Place Recognition aims to predict the coordinates of an image based solely on visual clues. Despite recent advances, recognizing the same place when the query comes from a significantly different distribution remains a major hurdle for state-of-the-art retrieval methods. In this work we explore whether re-ranking methods based on spatial verification can tackle these challenges.
- High-Quality Entity Segmentation (2022-11-10)
  CropFormer is designed to tackle the intractability of instance-level segmentation on high-resolution images. It improves mask prediction by fusing high-resolution image crops, which provide more fine-grained detail, with the full image. With CropFormer, we achieve a significant AP gain of 1.9 on the challenging entity segmentation task.
- Scrape, Cut, Paste and Learn: Automated Dataset Generation Applied to Parcel Logistics (2022-10-18)
  We present a fully automated pipeline to generate a synthetic dataset for instance segmentation in four steps. We first scrape images of the objects of interest from popular image search engines, then compare three methods for image selection: object-agnostic pre-processing, manual image selection, and CNN-based image selection.
- The Met Dataset: Instance-level Recognition for Artworks (2022-02-03)
  This work introduces a dataset for large-scale instance-level recognition in the domain of artworks. We rely on the open-access collection of The Met museum to form a large training set of about 224k classes.
- Learning Co-segmentation by Segment Swapping for Retrieval and Discovery (2021-10-29)
  The goal of this work is to efficiently identify visually similar patterns in a pair of images. We generate synthetic training pairs by selecting object segments in one image and copy-pasting them into another. Our approach yields clear improvements for artwork-detail retrieval on the Brueghel dataset.
- Text-Based Person Search with Limited Data (2021-10-20)
  Text-based person search (TBPS) aims to retrieve a target person from an image gallery using a descriptive text query. We present a framework with two novel components to handle the problems brought by limited data.
- A Strong Baseline for the VIPriors Data-Efficient Image Classification Challenge (2021-09-28)
  We present a strong baseline for data-efficient image classification on the VIPriors challenge dataset. Our baseline achieves 69.7% accuracy and outperforms 50% of submissions to the VIPriors 2021 challenge.
- Large-scale Unsupervised Semantic Segmentation (2021-06-06)
  We propose the new problem of large-scale unsupervised semantic segmentation (LUSS), together with a newly created benchmark dataset to track research progress. Based on the ImageNet dataset, we propose the ImageNet-S dataset with 1.2 million training images and 40k high-quality semantic segmentation annotations for evaluation.
- FAIR1M: A Benchmark Dataset for Fine-grained Object Recognition in High-Resolution Remote Sensing Imagery (2021-03-09)
  We propose a novel benchmark dataset with more than 1 million instances and more than 15,000 images for fine-grained object recognition in high-resolution remote sensing imagery. All objects in the FAIR1M dataset are annotated with respect to 5 categories and 37 sub-categories by oriented bounding boxes.
- On Creating Benchmark Dataset for Aerial Image Interpretation: Reviews, Guidances and Million-AID (2020-06-22)
  This article discusses how to efficiently prepare a suitable benchmark dataset for remote sensing (RS) image interpretation. We first analyze the current challenges of developing intelligent algorithms for RS image interpretation through bibliometric investigations. Following the presented guidance, we also provide an example of building an RS image dataset, Million-AID, a new large-scale benchmark.
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.