Enhancing Ground-to-Aerial Image Matching for Visual Misinformation Detection Using Semantic Segmentation
- URL: http://arxiv.org/abs/2502.06288v2
- Date: Tue, 11 Feb 2025 11:25:19 GMT
- Title: Enhancing Ground-to-Aerial Image Matching for Visual Misinformation Detection Using Semantic Segmentation
- Authors: Emanuele Mule, Matteo Pannacci, Ali Ghasemi Goudarzi, Francesco Pro, Lorenzo Papa, Luca Maiano, Irene Amerini
- Abstract summary: Recent advancements in generative AI techniques have raised serious concerns about the credibility of digital media available on the Internet.
To address these concerns, the ability to geolocate a non-geo-tagged ground-view image without external information, such as GPS coordinates, has become increasingly critical.
This study tackles the challenge of linking a ground-view image, potentially exhibiting varying fields of view (FoV), to its corresponding satellite image without the aid of GPS data.
- Score: 1.9055921262476347
- Abstract: The recent advancements in generative AI techniques, which have significantly increased the online dissemination of altered images and videos, have raised serious concerns about the credibility of digital media available on the Internet and distributed through information channels and social networks. This issue particularly affects domains that rely heavily on trustworthy data, such as journalism, forensic analysis, and Earth observation. To address these concerns, the ability to geolocate a non-geo-tagged ground-view image without external information, such as GPS coordinates, has become increasingly critical. This study tackles the challenge of linking a ground-view image, potentially exhibiting varying fields of view (FoV), to its corresponding satellite image without the aid of GPS data. To achieve this, we propose a novel four-stream Siamese-like architecture, the Quadruple Semantic Align Net (SAN-QUAD), which extends previous state-of-the-art (SOTA) approaches by leveraging semantic segmentation applied to both ground and satellite imagery. Experimental results on a subset of the CVUSA dataset demonstrate significant improvements of up to 9.8% over prior methods across various FoV settings.
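As a rough illustration of the retrieval setting the abstract describes, the sketch below ranks satellite candidates against one ground-view query using features from four streams (ground RGB, ground segmentation, satellite RGB, satellite segmentation). It is not the SAN-QUAD implementation: the toy feature vectors, the concatenate-then-normalize fusion in `fuse`, and the cosine-similarity ranking in `retrieve` are all assumptions made for this example, and the abstract does not specify how the four streams are actually combined.

```python
import math
from typing import List, Sequence

Vector = List[float]


def l2_normalize(v: Sequence[float]) -> Vector:
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def fuse(rgb_feat: Sequence[float], seg_feat: Sequence[float]) -> Vector:
    """Hypothetical fusion step: concatenate the RGB-stream and
    segmentation-stream features of one view, then L2-normalize."""
    return l2_normalize(list(rgb_feat) + list(seg_feat))


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product; equals cosine similarity for unit-length inputs."""
    return sum(x * y for x, y in zip(a, b))


def retrieve(query: Vector, gallery: List[Vector]) -> List[int]:
    """Return gallery indices sorted from most to least similar to the query."""
    scores = [cosine(query, g) for g in gallery]
    return sorted(range(len(gallery)), key=lambda i: scores[i], reverse=True)


# Toy features standing in for the outputs of the four streams (assumed values).
ground_rgb = [0.9, 0.1]
ground_seg = [0.2, 0.8]
sat_rgb = [[0.1, 0.9], [0.8, 0.2], [0.5, 0.5]]
sat_seg = [[0.9, 0.1], [0.3, 0.7], [0.5, 0.5]]

query = fuse(ground_rgb, ground_seg)
gallery = [fuse(r, s) for r, s in zip(sat_rgb, sat_seg)]
ranking = retrieve(query, gallery)
print(ranking)  # [1, 2, 0]: satellite candidate 1 is the best match
```

In a real system the toy vectors would be replaced by learned embeddings, and retrieval quality would be measured over many queries rather than a single one.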
Related papers
- Game4Loc: A UAV Geo-Localization Benchmark from Game Data [0.0]
We introduce a more practical UAV geo-localization task including partial matches of cross-view paired data.
Experiments demonstrate the effectiveness of our data and training method for UAV geo-localization.
arXiv Detail & Related papers (2024-09-25T13:33:28Z)
- Weakly-supervised Camera Localization by Ground-to-satellite Image Registration [52.54992898069471]
We propose a weakly supervised learning strategy for ground-to-satellite image registration.
It derives positive and negative satellite images for each ground image.
We also propose a self-supervision strategy for cross-view image relative rotation estimation.
arXiv Detail & Related papers (2024-09-10T12:57:16Z)
- Geospecific View Generation -- Geometry-Context Aware High-resolution Ground View Inference from Satellite Views [5.146618378243241]
We propose a novel pipeline to generate geospecific views that maximally respect the weak geometry and texture from multi-view satellite images.
Our method directly predicts ground-view images at geolocation by using a comprehensive set of information from the satellite image.
We demonstrate our pipeline is the first to generate close-to-real and geospecific ground views merely based on satellite images.
arXiv Detail & Related papers (2024-07-10T21:51:50Z)
- Robust Disaster Assessment from Aerial Imagery Using Text-to-Image Synthetic Data [66.49494950674402]
We leverage emerging text-to-image generative models in creating large-scale synthetic supervision for the task of damage assessment from aerial images.
We build an efficient and easily scalable pipeline to generate thousands of post-disaster images from low-resource domains.
We validate the strength of our proposed framework under cross-geography domain transfer setting from xBD and SKAI images in both single-source and multi-source settings.
arXiv Detail & Related papers (2024-05-22T16:07:05Z)
- A Semantic Segmentation-guided Approach for Ground-to-Aerial Image Matching [30.324252605889356]
This work addresses the problem of matching a query ground-view image with the corresponding satellite image without GPS data.
This is done by comparing the features from a ground-view image and a satellite image, leveraging the latter's segmentation mask through a three-stream Siamese-like network.
The novelty lies in the fusion of satellite images in combination with their semantic segmentation masks, aimed at ensuring that the model can extract useful features and focus on the significant parts of the images.
arXiv Detail & Related papers (2024-04-17T12:13:18Z)
- SatSynth: Augmenting Image-Mask Pairs through Diffusion Models for Aerial Semantic Segmentation [69.42764583465508]
We explore the potential of generative image diffusion to address the scarcity of annotated data in earth observation tasks.
To the best of our knowledge, we are the first to generate both images and corresponding masks for satellite segmentation.
arXiv Detail & Related papers (2024-03-25T10:30:22Z)
- Multiview Aerial Visual Recognition (MAVREC): Can Multi-view Improve Aerial Visual Perception? [57.77643186237265]
We present Multiview Aerial Visual RECognition or MAVREC, a video dataset where we record synchronized scenes from different perspectives.
MAVREC consists of around 2.5 hours of industry-standard 2.7K resolution video sequences, more than 0.5 million frames, and 1.1 million annotated bounding boxes.
This makes MAVREC the largest ground and aerial-view dataset, and the fourth largest among all drone-based datasets.
arXiv Detail & Related papers (2023-12-07T18:59:14Z)
- Orientation-Guided Contrastive Learning for UAV-View Geo-Localisation [0.0]
We present an orientation-guided training framework for UAV-view geo-localisation.
We experimentally demonstrate that this prediction supports the training and outperforms previous approaches.
We achieve state-of-the-art results on both the University-1652 and University-160k datasets.
arXiv Detail & Related papers (2023-08-02T07:32:32Z)
- Semantic Segmentation of Vegetation in Remote Sensing Imagery Using Deep Learning [77.34726150561087]
We propose an approach for creating a multi-modal and large-temporal dataset comprised of publicly available Remote Sensing data.
We use Convolutional Neural Networks (CNN) models that are capable of separating different classes of vegetation.
arXiv Detail & Related papers (2022-09-28T18:51:59Z)
- Geo-Localization via Ground-to-Satellite Cross-View Image Retrieval [25.93015219830576]
Given a ground-view image of a landmark, we aim to achieve cross-view geo-localization by searching out its corresponding satellite-view images.
We take advantage of drone-view information as a bridge between ground-view and satellite-view domains.
arXiv Detail & Related papers (2022-05-22T17:35:13Z)
- Accurate 3-DoF Camera Geo-Localization via Ground-to-Satellite Image Matching [102.39635336450262]
We address the problem of ground-to-satellite image geo-localization by matching a query image captured at the ground level against a large-scale database with geotagged satellite images.
Our new method is able to achieve the fine-grained location of a query image, up to pixel size precision of the satellite image.
arXiv Detail & Related papers (2022-03-26T20:10:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.