On Creating Benchmark Dataset for Aerial Image Interpretation: Reviews,
Guidances and Million-AID
- URL: http://arxiv.org/abs/2006.12485v2
- Date: Tue, 30 Mar 2021 10:53:03 GMT
- Title: On Creating Benchmark Dataset for Aerial Image Interpretation: Reviews,
Guidances and Million-AID
- Authors: Yang Long, Gui-Song Xia, Shengyang Li, Wen Yang, Michael Ying Yang,
Xiao Xiang Zhu, Liangpei Zhang, Deren Li
- Abstract summary: This article discusses how to efficiently prepare a suitable benchmark dataset for RS image interpretation.
We first analyze the current challenges of developing intelligent algorithms for RS image interpretation through bibliometric investigations.
Following the presented guidance, we also provide an example of building an RS image dataset, i.e., Million-AID, a new large-scale benchmark dataset.
- Score: 57.71601467271486
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The past years have witnessed great progress in remote sensing (RS) image
interpretation and its wide applications. With RS images becoming more
accessible than ever before, there is an increasing demand for the automatic
interpretation of these images. In this context, benchmark datasets serve
as essential prerequisites for developing and testing intelligent
interpretation algorithms. After reviewing existing benchmark datasets in the
RS image interpretation research community, this article discusses
how to efficiently prepare a suitable benchmark dataset for RS image
interpretation. Specifically, we first analyze the current challenges of
developing intelligent algorithms for RS image interpretation through
bibliometric investigations. We then present general guidance on creating
benchmark datasets efficiently. Following this guidance, we also provide an
example of building an RS image dataset, i.e., Million-AID, a new large-scale
benchmark dataset containing a million instances for RS image scene
classification. Several challenges and perspectives in RS image annotation are
finally discussed to facilitate research on benchmark dataset construction.
We hope this paper will provide the RS community with an overall perspective on
constructing large-scale, practical image datasets for further research,
especially data-driven research.
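To make the scene-classification use case concrete, below is a minimal sketch of loading such a benchmark for training with PyTorch/torchvision. The `Million-AID/train` root and the folder-per-class layout are illustrative assumptions, not the official distribution format.

```python
# Minimal sketch: loading a scene-classification benchmark such as Million-AID,
# assuming a folder-per-class layout (data_root/<scene_class>/<image>.jpg).
# The path and layout are illustrative assumptions, not the official format.
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

data_root = "Million-AID/train"  # hypothetical location of the training split

transform = transforms.Compose([
    transforms.Resize((224, 224)),  # RS scene patches come in varying sizes
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

dataset = datasets.ImageFolder(data_root, transform=transform)
loader = DataLoader(dataset, batch_size=64, shuffle=True, num_workers=4)

print(f"{len(dataset)} images across {len(dataset.classes)} scene classes")
for images, labels in loader:
    # images: (64, 3, 224, 224) tensors; labels: (64,) scene-class indices
    break
```

A standard classifier head sized to `len(dataset.classes)` could then be trained on these batches.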
Related papers
- Rethinking Image Super-Resolution from Training Data Perspectives [54.28824316574355]
We investigate the understudied effect of the training data used for image super-resolution (SR).
With this, we propose an automated image evaluation pipeline.
We find that datasets with (i) low compression artifacts, (ii) high within-image diversity as judged by the number of different objects, and (iii) a large number of images from ImageNet or PASS all positively affect SR performance.
arXiv Detail & Related papers (2024-09-01T16:25:04Z)
- RSGPT: A Remote Sensing Vision Language Model and Benchmark [7.279747655485913]
We build a high-quality Remote Sensing Image Captioning dataset (RSICap).
This dataset comprises 2,585 human-annotated captions with rich and high-quality information.
We also provide a benchmark evaluation dataset called RSIEval.
arXiv Detail & Related papers (2023-07-28T02:23:35Z)
- JourneyDB: A Benchmark for Generative Image Understanding [89.02046606392382]
We introduce a comprehensive dataset, referred to as JourneyDB, that caters to the domain of generative images.
Our meticulously curated dataset comprises 4 million distinct and high-quality generated images.
On our dataset, we devise four benchmarks to assess the comprehension of generated images.
arXiv Detail & Related papers (2023-07-03T02:39:08Z)
- RRSIS: Referring Remote Sensing Image Segmentation [25.538406069768662]
Localizing desired objects from remote sensing images is of great use in practical applications.
Referring image segmentation, which aims at segmenting out the objects to which a given expression refers, has been extensively studied in natural images.
We introduce referring remote sensing image segmentation (RRSIS) to fill this gap and make some insightful explorations.
arXiv Detail & Related papers (2023-06-14T16:40:19Z)
- Zero-shot Composed Text-Image Retrieval [72.43790281036584]
We consider the problem of composed image retrieval (CIR).
It aims to train a model that can fuse multi-modal information, e.g., text and images, to accurately retrieve images that match the query, extending the user's ability to express their intent.
arXiv Detail & Related papers (2023-06-12T17:56:01Z)
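As a concrete illustration of the CIR setup described in the entry above, the sketch below fuses a reference-image embedding with a modification-text embedding and ranks a gallery by cosine similarity. The random vectors stand in for a pre-trained encoder's outputs, and the additive fusion is a generic baseline for illustration, not the method of this paper or the next.

```python
# Minimal sketch of composed image retrieval (CIR) via additive late fusion:
# query = reference-image embedding + modification-text embedding.
# Random vectors stand in for pre-computed encoder outputs (an assumption);
# this generic baseline is illustrative, not either paper's method.
import numpy as np

def l2_normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

rng = np.random.default_rng(0)
gallery = l2_normalize(rng.normal(size=(1000, 512)))  # candidate image embeddings
img_emb = l2_normalize(rng.normal(size=512))          # reference image embedding
txt_emb = l2_normalize(rng.normal(size=512))          # modification-text embedding

query = l2_normalize(img_emb + txt_emb)  # fuse the two modalities
scores = gallery @ query                 # cosine similarity against the gallery
top_10 = np.argsort(-scores)[:10]        # indices of the best-matching images
print(top_10)
```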
- Data Roaming and Quality Assessment for Composed Image Retrieval [25.452015862927766]
Composed Image Retrieval (CoIR) involves queries that combine image and text modalities, allowing users to express their intent more effectively.
We introduce the Large Scale Composed Image Retrieval (LaSCo) dataset, a new CoIR dataset which is ten times larger than existing ones.
We also introduce a new CoIR baseline, the Cross-Attention driven Shift (CASE).
arXiv Detail & Related papers (2023-03-16T16:02:24Z)
- Large-scale Unsupervised Semantic Segmentation [163.3568726730319]
We propose a new problem of large-scale unsupervised semantic segmentation (LUSS) with a newly created benchmark dataset to track the research progress.
Based on the ImageNet dataset, we propose the ImageNet-S dataset with 1.2 million training images and 40k high-quality semantic segmentation annotations for evaluation.
arXiv Detail & Related papers (2021-06-06T15:02:11Z)
- Using Text to Teach Image Retrieval [47.72498265721957]
We build on the concept of image manifold to represent the feature space of images, learned via neural networks, as a graph.
We augment the manifold samples with geometrically aligned text, thereby using a plethora of sentences to teach us about images.
The experimental results show that the joint embedding manifold is a robust representation, making it a better basis for image retrieval.
arXiv Detail & Related papers (2020-11-19T16:09:14Z)
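To illustrate the image-manifold idea in the entry above, the sketch below represents a feature space as a k-nearest-neighbor graph over image embeddings. The random features and the choice of k are illustrative assumptions; the paper's actual graph construction and text augmentation are not reproduced here.

```python
# Minimal sketch: representing an image feature space as a k-NN graph,
# in the spirit of the image-manifold idea above. Random vectors stand in
# for learned features (an assumption); the paper's construction may differ.
import numpy as np

rng = np.random.default_rng(1)
feats = rng.normal(size=(500, 256))                # placeholder image features
feats /= np.linalg.norm(feats, axis=1, keepdims=True)

k = 8                                              # arbitrary neighborhood size
sim = feats @ feats.T                              # pairwise cosine similarity
np.fill_diagonal(sim, -np.inf)                     # exclude self-edges
neighbors = np.argsort(-sim, axis=1)[:, :k]        # k most similar images each

# Adjacency list: node i is connected to its k nearest neighbors on the manifold.
graph = {i: neighbors[i].tolist() for i in range(feats.shape[0])}
print(graph[0])
```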