Sat2Density: Faithful Density Learning from Satellite-Ground Image Pairs
- URL: http://arxiv.org/abs/2303.14672v2
- Date: Tue, 29 Aug 2023 09:33:59 GMT
- Title: Sat2Density: Faithful Density Learning from Satellite-Ground Image Pairs
- Authors: Ming Qian, Jincheng Xiong, Gui-Song Xia, Nan Xue
- Abstract summary: This paper aims to develop an accurate 3D geometry representation of satellite images using satellite-ground image pairs.
We draw inspiration from the density field representation used in volumetric neural rendering and propose a new approach, called Sat2Density.
Our method utilizes the properties of ground-view panoramas for the sky and non-sky regions to learn faithful density fields of 3D scenes in a geometric perspective.
- Score: 32.4349978810128
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper aims to develop an accurate 3D geometry representation of
satellite images using satellite-ground image pairs. Our focus is on the
challenging problem of 3D-aware ground-view synthesis from a satellite image.
We draw inspiration from the density field representation used in volumetric
neural rendering and propose a new approach, called Sat2Density. Our method
utilizes the properties of ground-view panoramas for the sky and non-sky
regions to learn faithful density fields of 3D scenes in a geometric
perspective. Unlike other methods that require extra depth information during
training, our Sat2Density can automatically learn accurate and faithful 3D
geometry via density representation without depth supervision. This advancement
significantly improves the ground-view panorama synthesis task. Additionally,
our study provides a new geometric perspective to understand the relationship
between satellite and ground-view images in 3D space.
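For readers unfamiliar with density fields, the sketch below illustrates the volumetric-rendering idea the abstract refers to: per-sample densities along a camera ray are composited into a color, an accumulated opacity, and an expected depth, so geometry emerges from the density weights without explicit depth supervision, and rays with near-zero opacity fall through to a sky color (mirroring the sky/non-sky split mentioned above). This is a minimal, hypothetical sketch, not the authors' implementation; all names are illustrative.

```python
import numpy as np

def composite_ray(sigmas, colors, deltas, sky_color):
    """Composite one camera ray from per-sample densities (illustrative only).

    sigmas    : (N,)   non-negative density at each sample along the ray
    colors    : (N, 3) RGB predicted at each sample
    deltas    : (N,)   spacing between consecutive samples
    sky_color : (3,)   color used where the ray never hits geometry (sky)
    """
    alphas = 1.0 - np.exp(-sigmas * deltas)                          # per-sample opacity
    trans = np.cumprod(np.concatenate(([1.0], 1.0 - alphas[:-1])))   # light surviving to each sample
    weights = trans * alphas                                         # contribution of each sample

    opacity = weights.sum()                                          # ~1 for solid geometry, ~0 for sky
    rgb = (weights[:, None] * colors).sum(axis=0)                    # composited color
    depths = np.cumsum(deltas)                                       # approximate depth of each sample
    exp_depth = (weights * depths).sum() / max(opacity, 1e-8)        # expected depth, a free by-product

    # Sky handling: low-opacity rays fall back to a sky color, so supervising the
    # panorama's sky / non-sky regions implicitly tells the density where to be empty.
    rgb = rgb + (1.0 - opacity) * np.asarray(sky_color)
    return rgb, exp_depth, opacity
```

Because the expected depth is a by-product of the same weights that render the color, photometric supervision from ground-view panoramas alone can, in principle, constrain the 3D density.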
Related papers
- Advancing Applications of Satellite Photogrammetry: Novel Approaches for Built-up Area Modeling and Natural Environment Monitoring using Stereo/Multi-view Satellite Image-derived 3D Data [0.0]
This dissertation explores several novel approaches based on stereo and multi-view satellite image-derived 3D geospatial data.
It introduces four novel approaches that address the spatial and temporal challenges of satellite-derived 3D data.
Overall, this dissertation demonstrates the extensive potential of satellite photogrammetry applications in addressing urban and environmental challenges.
arXiv Detail & Related papers (2024-04-18T20:02:52Z)
- HUGS: Holistic Urban 3D Scene Understanding via Gaussian Splatting [53.6394928681237]
Holistic understanding of urban scenes based on RGB images is a challenging yet important problem.
Our main idea involves the joint optimization of geometry, appearance, semantics, and motion using a combination of static and dynamic 3D Gaussians.
Our approach offers the ability to render new viewpoints in real-time, yielding 2D and 3D semantic information with high accuracy.
arXiv Detail & Related papers (2024-03-19T13:39:05Z)
- Sat2Scene: 3D Urban Scene Generation from Satellite Images with Diffusion [77.34078223594686]
We propose a novel architecture for direct 3D scene generation by introducing diffusion models into 3D sparse representations and combining them with neural rendering techniques.
Specifically, our approach first generates texture colors at the point level for a given geometry using a 3D diffusion model; these are then transformed into a scene representation in a feed-forward manner.
Experiments on two city-scale datasets show that our model demonstrates proficiency in generating photo-realistic street-view image sequences and cross-view urban scenes from satellite imagery.
arXiv Detail & Related papers (2024-01-19T16:15:37Z)
- Weakly-Supervised 3D Visual Grounding based on Visual Linguistic Alignment [26.858034573776198]
We propose a weakly supervised approach for 3D visual grounding based on Visual Linguistic Alignment.
Our 3D-VLA exploits the superior ability of current large-scale vision-language models to align the semantics between text and 2D images.
During the inference stage, the learned text-3D correspondence will help us ground the text queries to the 3D target objects even without 2D images.
arXiv Detail & Related papers (2023-12-15T09:08:14Z)
- NeurOCS: Neural NOCS Supervision for Monocular 3D Object Localization [80.3424839706698]
We present NeurOCS, a framework that uses instance masks and 3D boxes as input to learn 3D object shapes by means of differentiable rendering.
Our approach rests on insights in learning a category-level shape prior directly from real driving scenes.
We make critical design choices to learn object coordinates more effectively from an object-centric view.
arXiv Detail & Related papers (2023-05-28T16:18:41Z)
- SAT: 2D Semantics Assisted Training for 3D Visual Grounding [95.84637054325039]
3D visual grounding aims at grounding a natural language description about a 3D scene, usually represented in the form of 3D point clouds, to the targeted object region.
Point clouds are sparse, noisy, and contain limited semantic information compared with 2D images.
We propose 2D Semantics Assisted Training (SAT) that utilizes 2D image semantics in the training stage to ease point-cloud-language joint representation learning.
arXiv Detail & Related papers (2021-05-24T17:58:36Z)
- Geometry-Guided Street-View Panorama Synthesis from Satellite Imagery [80.6282101835164]
We present a new approach for synthesizing a novel street-view panorama given an overhead satellite image.
Our method generates a Google-style omnidirectional street-view panorama, as if it were captured from the geographical location at the center of the satellite patch.
arXiv Detail & Related papers (2021-03-02T10:27:05Z)
- Learning Depth With Very Sparse Supervision [57.911425589947314]
This paper explores the idea that perception gets coupled to 3D properties of the world via interaction with the environment.
We train a specialized global-local network architecture with what would be available to a robot interacting with the environment.
Experiments on several datasets show that, when ground-truth depth is available for even a single image pixel, the proposed network can learn monocular dense depth estimation up to 22.5% more accurately than state-of-the-art approaches (a sketch of such a sparse-supervision loss follows this list).
arXiv Detail & Related papers (2020-03-02T10:44:13Z)
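The figure above refers to supervision from extremely sparse depth. As a rough illustration of how a loss can be restricted to the handful of supervised pixels, here is a hypothetical masked-loss sketch; the function name and the L1 choice are assumptions, not taken from the paper.

```python
import numpy as np

def sparse_depth_loss(pred_depth, gt_depth, valid_mask):
    """L1 depth loss computed only where sparse ground truth exists.

    pred_depth : (H, W) dense predicted depth
    gt_depth   : (H, W) ground-truth depth, meaningful only where valid_mask is True
    valid_mask : (H, W) boolean mask of supervised pixels (may contain a single pixel)
    """
    if not valid_mask.any():
        return 0.0                      # no supervised pixels in this image
    err = np.abs(pred_depth[valid_mask] - gt_depth[valid_mask])
    return err.mean()                   # average over supervised pixels only
```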