CroCo: Cross-Modal Contrastive learning for localization of Earth
Observation data
- URL: http://arxiv.org/abs/2204.07052v1
- Date: Thu, 14 Apr 2022 15:55:00 GMT
- Title: CroCo: Cross-Modal Contrastive learning for localization of Earth
Observation data
- Authors: Wei-Hsin Tseng, Hoàng-Ân Lê, Alexandre Boulch, Sébastien
Lefèvre, Dirk Tiede
- Abstract summary: It is of interest to localize a ground-based LiDAR point cloud on remote sensing imagery.
We propose a contrastive learning-based method that trains on DEM and high-resolution optical imagery.
In the best scenario, a Top-1 score of 0.71 and a Top-5 score of 0.81 are obtained.
- Score: 62.96337162094726
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: It is of interest to localize a ground-based LiDAR point cloud on remote
sensing imagery. In this work, we tackle a subtask of this problem, i.e. mapping
a digital elevation model (DEM) rasterized from an aerial LiDAR point cloud onto
the aerial imagery. We propose a contrastive learning-based method that trains on
DEM and high-resolution optical imagery, and we experiment with the framework
under different data sampling strategies and hyperparameters. In the best
scenario, a Top-1 score of 0.71 and a Top-5 score of 0.81 are obtained. The
proposed method is promising for learning features from RGB and DEM for
localization and is potentially applicable to other data sources as well. Source
code will be released at https://github.com/wtseng530/AVLocalization.
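The abstract does not specify the training objective or evaluation procedure; cross-modal contrastive methods of this kind typically pair a DEM encoder with an RGB encoder and pull embeddings of co-located tiles together in a shared space, then report Top-k retrieval scores. A minimal NumPy sketch of a symmetric InfoNCE-style loss and a Top-k score, assuming precomputed embeddings (all names are hypothetical; the paper's actual method may differ):

```python
import numpy as np

def info_nce(rgb_emb, dem_emb, temperature=0.07):
    """Symmetric InfoNCE-style loss over a batch of paired embeddings.

    rgb_emb, dem_emb: (N, D) arrays; row i of each encodes a matching tile pair.
    """
    # L2-normalize so the dot product is cosine similarity
    rgb = rgb_emb / np.linalg.norm(rgb_emb, axis=1, keepdims=True)
    dem = dem_emb / np.linalg.norm(dem_emb, axis=1, keepdims=True)
    logits = rgb @ dem.T / temperature          # (N, N) similarity matrix

    def xent(l):
        # cross-entropy with the diagonal (the matching pair) as the target
        l = l - l.max(axis=1, keepdims=True)    # numerical stability
        log_prob = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_prob))

    # average over both retrieval directions (RGB->DEM and DEM->RGB)
    return 0.5 * (xent(logits) + xent(logits.T))

def top_k_score(rgb_emb, dem_emb, k=1):
    """Fraction of RGB queries whose true DEM partner ranks in the top k."""
    rgb = rgb_emb / np.linalg.norm(rgb_emb, axis=1, keepdims=True)
    dem = dem_emb / np.linalg.norm(dem_emb, axis=1, keepdims=True)
    sim = rgb @ dem.T
    ranks = np.argsort(-sim, axis=1)[:, :k]     # indices of the k most similar DEM tiles
    hits = [i in ranks[i] for i in range(len(rgb))]
    return float(np.mean(hits))
```

Under this reading, Top-1/Top-5 scores like those reported above would correspond to `top_k_score(rgb_emb, dem_emb, k=1)` and `k=5` evaluated on held-out tile pairs.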
Related papers
- Radio-astronomical Image Reconstruction with Conditional Denoising Diffusion Model [5.673449249014537]
Reconstructing sky models from dirty radio images is crucial for studying galaxy evolution at high redshift.
Current techniques, such as CLEAN and PyBDSF, often fail to detect faint sources.
This study proposes using neural networks to rebuild sky models directly from dirty images.
arXiv Detail & Related papers (2024-02-15T18:57:24Z)
- Radio Map Estimation -- An Open Dataset with Directive Transmitter Antennas and Initial Experiments [49.61405888107356]
We release a dataset of simulated path loss radio maps together with realistic city maps from real-world locations and aerial images from open datasources.
Initial experiments regarding model architectures, input feature design and estimation of radio maps from aerial images are presented.
arXiv Detail & Related papers (2024-01-12T14:56:45Z)
- Point-SLAM: Dense Neural Point Cloud-based SLAM [61.96492935210654]
We propose a dense neural simultaneous localization and mapping (SLAM) approach for monocular RGBD input.
We demonstrate that both tracking and mapping can be performed with the same point-based neural scene representation.
arXiv Detail & Related papers (2023-04-09T16:48:26Z)
- HPointLoc: Point-based Indoor Place Recognition using Synthetic RGB-D Images [58.720142291102135]
We present a novel dataset named as HPointLoc, specially designed for exploring capabilities of visual place recognition in indoor environment.
The dataset is based on the popular Habitat simulator, in which indoor scenes can be generated using both its own sensor data and open datasets.
arXiv Detail & Related papers (2022-12-30T12:20:56Z)
- Refer-it-in-RGBD: A Bottom-up Approach for 3D Visual Grounding in RGBD Images [69.5662419067878]
Grounding referring expressions in RGBD images is an emerging field.
We present a novel task of 3D visual grounding in single-view RGBD image where the referred objects are often only partially scanned due to occlusion.
Our approach first fuses the language and the visual features at the bottom level to generate a heatmap that localizes the relevant regions in the RGBD image.
Then our approach conducts an adaptive feature learning based on the heatmap and performs the object-level matching with another visio-linguistic fusion to finally ground the referred object.
arXiv Detail & Related papers (2021-03-14T11:18:50Z)
- Rover Relocalization for Mars Sample Return by Virtual Template Synthesis and Matching [48.0956967976633]
We consider the problem of rover relocalization in the context of the notional Mars Sample Return campaign.
In this campaign, a rover (R1) needs to be capable of autonomously navigating and localizing itself within an area of approximately 50 x 50 m.
We propose a visual localizer that exhibits robustness to the relatively barren terrain that we expect to find in relevant areas.
arXiv Detail & Related papers (2021-03-05T00:18:33Z)
- RGB2LIDAR: Towards Solving Large-Scale Cross-Modal Visual Localization [20.350871370274238]
We study an important, yet largely unexplored problem of large-scale cross-modal visual localization.
We introduce a new dataset containing over 550K pairs of RGB and aerial LIDAR depth images.
We propose a novel joint embedding based method that effectively combines the appearance and semantic cues from both modalities.
arXiv Detail & Related papers (2020-09-12T01:18:45Z)
- A Nearest Neighbor Network to Extract Digital Terrain Models from 3D Point Clouds [1.6249267147413524]
We present an algorithm that operates on 3D-point clouds and estimates the underlying DTM for the scene using an end-to-end approach.
Our model learns neighborhood information and seamlessly integrates this with point-wise and block-wise global features.
arXiv Detail & Related papers (2020-05-21T15:54:55Z)
- Evaluation of Cross-View Matching to Improve Ground Vehicle Localization with Aerial Perception [17.349420462716886]
Cross-view matching refers to the problem of finding the closest match for a given query ground view image to one from a database of aerial images.
In this paper, we evaluate cross-view matching for the task of localizing a ground vehicle over a longer trajectory.
arXiv Detail & Related papers (2020-03-13T23:59:07Z)
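At query time, cross-view matching as described above reduces to a nearest-neighbour search: the ground-view query is compared against a database of aerial-image descriptors and the closest entry gives the estimated location. A minimal sketch assuming precomputed descriptors (names and the brute-force search are illustrative; real systems use learned descriptors and approximate search):

```python
import numpy as np

def localize(query, aerial_db):
    """Index of the aerial descriptor most similar (cosine) to the ground-view query.

    query: (D,) descriptor of the ground view; aerial_db: (M, D) descriptor database.
    """
    q = query / np.linalg.norm(query)
    db = aerial_db / np.linalg.norm(aerial_db, axis=1, keepdims=True)
    return int(np.argmax(db @ q))   # best-matching database entry
```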
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.