AiRound and CV-BrCT: Novel Multi-View Datasets for Scene Classification
- URL: http://arxiv.org/abs/2008.01133v1
- Date: Mon, 3 Aug 2020 18:55:46 GMT
- Title: AiRound and CV-BrCT: Novel Multi-View Datasets for Scene Classification
- Authors: Gabriel Machado, Edemir Ferreira, Keiller Nogueira, Hugo Oliveira,
Pedro Gama and Jefersson A. dos Santos
- Abstract summary: We present two new publicly available datasets named AiRound and CV-BrCT.
The first one contains triplets of images from the same geographic coordinate with different perspectives of view extracted from various places around the world.
The second dataset contains pairs of aerial and street-level images extracted from southeast Brazil.
- Score: 2.931113769364182
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is undeniable that aerial/satellite images can provide useful information
for a large variety of tasks. But, since these images are always looking from
above, some applications can benefit from complementary information provided by
other perspective views of the scene, such as ground-level images. Despite a
large number of public repositories for both georeferenced photographs and
aerial images, there is a lack of benchmark datasets that allow the development
of approaches that exploit the benefits and complementarity of aerial/ground
imagery. In this paper, we present two new publicly available datasets named
AiRound and CV-BrCT. The first one contains triplets of images from the
same geographic coordinate with different perspectives of view extracted from
various places around the world. Each triplet is composed of an aerial RGB
image, a ground-level perspective image, and a Sentinel-2 sample. The second
dataset contains pairs of aerial and street-level images extracted from
southeast Brazil. We design an extensive set of experiments concerning
multi-view scene classification, using early and late fusion. Such experiments
were conducted to show that image classification can be enhanced using
multi-view data.
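Below is a minimal, hypothetical sketch of the early- and late-fusion schemes mentioned in the abstract above, written in PyTorch. It is not the authors' implementation: the ResNet-18 encoders, the 256-dimensional features, the 11-class output, and the class names EarlyFusion and LateFusion are illustrative assumptions. Early fusion concatenates the per-view features before a single classifier head; late fusion classifies each view separately and averages the predictions.

```python
# Illustrative sketch only (not the paper's code): multi-view scene
# classification with early vs. late fusion over three views
# (aerial RGB, ground-level, Sentinel-2), all assumed here to be 3-channel.
import torch
import torch.nn as nn
import torchvision.models as models


def make_encoder(out_dim: int = 256) -> nn.Module:
    """Randomly initialized ResNet-18 with a projection head (assumption)."""
    net = models.resnet18()
    net.fc = nn.Linear(net.fc.in_features, out_dim)
    return net


class EarlyFusion(nn.Module):
    """Concatenate per-view features, then classify with one shared head."""

    def __init__(self, num_views: int = 3, num_classes: int = 11):
        super().__init__()
        self.encoders = nn.ModuleList([make_encoder() for _ in range(num_views)])
        self.head = nn.Linear(256 * num_views, num_classes)

    def forward(self, views):  # views: list of (B, 3, H, W) tensors
        feats = [enc(v) for enc, v in zip(self.encoders, views)]
        return self.head(torch.cat(feats, dim=1))


class LateFusion(nn.Module):
    """Classify each view independently, then average the class scores."""

    def __init__(self, num_views: int = 3, num_classes: int = 11):
        super().__init__()
        self.encoders = nn.ModuleList([make_encoder() for _ in range(num_views)])
        self.heads = nn.ModuleList(
            [nn.Linear(256, num_classes) for _ in range(num_views)]
        )

    def forward(self, views):
        logits = [h(e(v)) for e, h, v in zip(self.encoders, self.heads, views)]
        return torch.stack(logits, dim=0).mean(dim=0)


if __name__ == "__main__":
    triplet = [torch.randn(2, 3, 224, 224) for _ in range(3)]  # dummy views
    print(EarlyFusion()(triplet).shape)  # torch.Size([2, 11])
    print(LateFusion()(triplet).shape)   # torch.Size([2, 11])
```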
Related papers
- Game4Loc: A UAV Geo-Localization Benchmark from Game Data [0.0]
We introduce a more practical UAV geo-localization task including partial matches of cross-view paired data.
Experiments demonstrate the effectiveness of our data and training method for UAV geo-localization.
arXiv Detail & Related papers (2024-09-25T13:33:28Z)
- Cross-view image geo-localization with Panorama-BEV Co-Retrieval Network [12.692812966686066]
Cross-view geolocalization identifies the geographic location of street view images by matching them with a georeferenced satellite database.
We propose a new approach for cross-view image geo-localization, i.e., the Panorama-BEV Co-Retrieval Network.
arXiv Detail & Related papers (2024-08-10T08:03:58Z)
- Classifying geospatial objects from multiview aerial imagery using semantic meshes [2.116528763953217]
We propose a new method to predict tree species based on aerial images of forests in the U.S.
We show that our proposed multiview method improves classification accuracy from 53% to 75% relative to an orthomosaic baseline on a challenging cross-site tree classification task.
arXiv Detail & Related papers (2024-05-15T17:56:49Z)
- AG-ReID.v2: Bridging Aerial and Ground Views for Person Re-identification [39.58286453178339]
Aerial-ground person re-identification (Re-ID) presents unique challenges in computer vision.
We introduce AG-ReID.v2, a dataset specifically designed for person Re-ID in mixed aerial and ground scenarios.
This dataset comprises 100,502 images of 1,615 unique individuals, each annotated with matching IDs and 15 soft attribute labels.
arXiv Detail & Related papers (2024-01-05T04:53:33Z)
- Multiview Aerial Visual Recognition (MAVREC): Can Multi-view Improve Aerial Visual Perception? [57.77643186237265]
We present Multiview Aerial Visual RECognition or MAVREC, a video dataset where we record synchronized scenes from different perspectives.
MAVREC consists of around 2.5 hours of industry-standard 2.7K resolution video sequences, more than 0.5 million frames, and 1.1 million annotated bounding boxes.
This makes MAVREC the largest ground and aerial-view dataset, and the fourth largest among all drone-based datasets.
arXiv Detail & Related papers (2023-12-07T18:59:14Z)
- CSP: Self-Supervised Contrastive Spatial Pre-Training for Geospatial-Visual Representations [90.50864830038202]
We present Contrastive Spatial Pre-Training (CSP), a self-supervised learning framework for geo-tagged images.
We use a dual-encoder to separately encode the images and their corresponding geo-locations, and use contrastive objectives to learn effective location representations from images.
CSP significantly boosts model performance, with 10-34% relative improvement across various labeled training data sampling ratios.
arXiv Detail & Related papers (2023-05-01T23:11:18Z)
- Where We Are and What We're Looking At: Query Based Worldwide Image Geo-localization Using Hierarchies and Scenes [53.53712888703834]
We introduce an end-to-end transformer-based architecture that exploits the relationship between different geographic levels.
We achieve state-of-the-art street-level accuracy on 4 standard geo-localization datasets.
arXiv Detail & Related papers (2023-03-07T21:47:58Z)
- GAMa: Cross-view Video Geo-localization [68.33955764543465]
We focus on ground videos instead of images, which provide contextual cues.
At clip level, a short video clip is matched with the corresponding aerial image and is later used to obtain video-level geo-localization of a long video.
Our proposed method achieves a Top-1 recall rate of 19.4% and 45.1% @1.0 mile.
arXiv Detail & Related papers (2022-07-06T04:25:51Z)
- Campus3D: A Photogrammetry Point Cloud Benchmark for Hierarchical Understanding of Outdoor Scene [76.4183572058063]
We present a richly-annotated 3D point cloud dataset for multiple outdoor scene understanding tasks.
The dataset has been point-wisely annotated with both hierarchical and instance-based labels.
We formulate a hierarchical learning problem for 3D point cloud segmentation and propose a measurement evaluating consistency across various hierarchies.
arXiv Detail & Related papers (2020-08-11T19:10:32Z)
- Multiview Detection with Feature Perspective Transformation [59.34619548026885]
We propose a novel multiview detection system, MVDet.
We take an anchor-free approach to aggregate multiview information by projecting feature maps onto the ground plane.
Our entire model is end-to-end learnable and achieves 88.2% MODA on the standard Wildtrack dataset.
arXiv Detail & Related papers (2020-07-14T17:58:30Z)
- Cross-View Image Retrieval -- Ground to Aerial Image Retrieval through Deep Learning [3.326320568999945]
We present a novel cross-modal retrieval method specifically for multi-view images, called Cross-view Image Retrieval (CVIR).
Our approach aims to find a feature space as well as an embedding space in which samples from street-view images are compared directly to satellite-view images.
For this comparison, a novel deep metric learning-based solution, "DeepCVIR", has been proposed.
arXiv Detail & Related papers (2020-05-02T06:52:16Z)