Aerial Diffusion: Text Guided Ground-to-Aerial View Translation from a
Single Image using Diffusion Models
- URL: http://arxiv.org/abs/2303.11444v2
- Date: Fri, 8 Sep 2023 00:36:02 GMT
- Title: Aerial Diffusion: Text Guided Ground-to-Aerial View Translation from a
Single Image using Diffusion Models
- Authors: Divya Kothandaraman, Tianyi Zhou, Ming Lin, Dinesh Manocha
- Abstract summary: We present a novel method, Aerial Diffusion, for generating aerial views from a single ground-view image using text guidance.
We address two main challenges corresponding to the domain gap between the ground view and the aerial view.
Aerial Diffusion is the first approach that performs ground-to-aerial translation in an unsupervised manner.
- Score: 72.76182801289497
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a novel method, Aerial Diffusion, for generating aerial views from
a single ground-view image using text guidance. Aerial Diffusion leverages a
pretrained text-image diffusion model for prior knowledge. We address two main
challenges: the domain gap between the ground view and the aerial view, and
the two views being far apart in the text-image embedding manifold.
Our approach uses a homography inspired by inverse perspective mapping prior to
finetuning the pretrained diffusion model. Additionally, using the text
corresponding to the ground-view to finetune the model helps us capture the
details in the ground-view image at a relatively low bias towards the
ground-view image. Aerial Diffusion uses an alternating sampling strategy to
compute the optimal solution on a complex high-dimensional manifold and generate
a high-fidelity (w.r.t. ground view) aerial image. We demonstrate the quality
and versatility of Aerial Diffusion on a plethora of images from various
domains including nature, human actions, indoor scenes, etc. We qualitatively
show the effectiveness of our method with extensive ablations and comparisons.
To the best of our knowledge, Aerial Diffusion is the first approach that
performs ground-to-aerial translation in an unsupervised manner.
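
The homography step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the trapezoid corners, and the nearest-neighbor sampling are all assumptions chosen for illustration. It estimates a 3x3 homography from four point correspondences and uses it to stretch a trapezoidal ground region onto a full rectangle, which is the basic idea behind inverse perspective mapping.

```python
import numpy as np

def homography_from_points(src, dst):
    """Estimate H (3x3) with dst ~ H @ src from four (x, y) pairs via DLT."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.asarray(rows, dtype=float)
    # The last right-singular vector spans the null space of A.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]

def apply_homography(H, pts):
    """Map an (N, 2) array of (x, y) points through H."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return homog[:, :2] / homog[:, 2:3]

def warp_image(image, H, out_shape):
    """Inverse warp: each output pixel samples the source at H^-1 (nearest neighbor)."""
    h_out, w_out = out_shape
    ys, xs = np.mgrid[0:h_out, 0:w_out]
    grid = np.stack([xs.ravel(), ys.ravel()], axis=1)
    src = apply_homography(np.linalg.inv(H), grid)
    sx = np.rint(src[:, 0]).astype(int)
    sy = np.rint(src[:, 1]).astype(int)
    valid = (sx >= 0) & (sx < image.shape[1]) & (sy >= 0) & (sy < image.shape[0])
    out = np.zeros((h_out * w_out,) + image.shape[2:], dtype=image.dtype)
    out[valid] = image[sy[valid], sx[valid]]
    return out.reshape((h_out, w_out) + image.shape[2:])

# Hypothetical corners: a trapezoid (narrow at the top) mapped to the full frame.
h, w = 64, 64
src_corners = [(0.25 * (w - 1), 0), (0.75 * (w - 1), 0), (w - 1, h - 1), (0, h - 1)]
dst_corners = [(0, 0), (w - 1, 0), (w - 1, h - 1), (0, h - 1)]
H = homography_from_points(src_corners, dst_corners)

ground = np.random.default_rng(0).random((h, w, 3))  # stand-in for a ground-view image
aerial_like = warp_image(ground, H, (h, w))
```

In the paper's pipeline such a warp is applied before finetuning the diffusion model; the warp alone only produces a crude pseudo top-down view and is not a substitute for the generative step.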
Related papers
- Text-to-Image Diffusion Models are Great Sketch-Photo Matchmakers [120.49126407479717]
This paper explores text-to-image diffusion models for Zero-Shot Sketch-based Image Retrieval (ZS-SBIR).
We highlight a pivotal discovery: the capacity of text-to-image diffusion models to seamlessly bridge the gap between sketches and photos.
arXiv Detail & Related papers (2024-03-12T00:02:03Z)
- HawkI: Homography & Mutual Information Guidance for 3D-free Single Image to Aerial View [67.8213192993001]
We present HawkI, for synthesizing aerial-view images from text and an exemplar image.
HawkI blends the visual features from the input image within a pretrained text-to-2D-image stable diffusion model.
At inference, HawkI employs a unique mutual information guidance formulation to steer the generated image towards faithfully replicating the semantic details of the input image.
arXiv Detail & Related papers (2023-11-27T01:41:25Z)
- Deceptive-NeRF/3DGS: Diffusion-Generated Pseudo-Observations for High-Quality Sparse-View Reconstruction [60.52716381465063]
We introduce Deceptive-NeRF/3DGS to enhance sparse-view reconstruction with only a limited set of input images.
Specifically, we propose a deceptive diffusion model turning noisy images rendered from few-view reconstructions into high-quality pseudo-observations.
Our system progressively incorporates diffusion-generated pseudo-observations into the training image sets, ultimately densifying the sparse input observations by 5 to 10 times.
arXiv Detail & Related papers (2023-05-24T14:00:32Z)
- ADIR: Adaptive Diffusion for Image Reconstruction [46.838084286784195]
We propose a conditional sampling scheme that exploits the prior learned by diffusion models.
We then combine it with a novel approach for adapting pretrained diffusion denoising networks to their input.
We show that our proposed 'adaptive diffusion for image reconstruction' approach achieves a significant improvement in the super-resolution, deblurring, and text-based editing tasks.
arXiv Detail & Related papers (2022-12-06T18:39:58Z)
- SinDiffusion: Learning a Diffusion Model from a Single Natural Image [159.4285444680301]
We present SinDiffusion, leveraging denoising diffusion models to capture the internal distribution of patches from a single natural image.
It is based on two core designs. First, SinDiffusion is trained with a single model at a single scale instead of multiple models with progressive growing of scales.
Second, we identify that a patch-level receptive field of the diffusion network is crucial and effective for capturing the image's patch statistics.
arXiv Detail & Related papers (2022-11-22T18:00:03Z)
- Non-Homogeneous Haze Removal via Artificial Scene Prior and Bidimensional Graph Reasoning [52.07698484363237]
We propose a Non-Homogeneous Haze Removal Network (NHRN) via artificial scene prior and bidimensional graph reasoning.
Our method achieves superior performance over many state-of-the-art algorithms for both the single image dehazing and hazy image understanding tasks.
arXiv Detail & Related papers (2021-04-05T13:04:44Z)
- Leveraging Photogrammetric Mesh Models for Aerial-Ground Feature Point Matching Toward Integrated 3D Reconstruction [19.551088857830944]
Integration of aerial and ground images has proven to be an efficient approach to enhance surface reconstruction in urban environments.
Previous studies based on geometry-aware image rectification have alleviated this problem.
We propose a novel approach: leveraging photogrammetric mesh models for aerial-ground image matching.
arXiv Detail & Related papers (2020-02-21T01:47:59Z)