GECCO: Geometrically-Conditioned Point Diffusion Models
- URL: http://arxiv.org/abs/2303.05916v2
- Date: Mon, 25 Sep 2023 14:28:21 GMT
- Title: GECCO: Geometrically-Conditioned Point Diffusion Models
- Authors: Michał J. Tyszkiewicz, Pascal Fua, Eduard Trulls
- Abstract summary: Diffusion models generating images conditionally on text have recently made a splash far beyond the computer vision community.
Here, we tackle the related problem of generating point clouds, both unconditionally, and conditionally with images.
For the latter, we introduce a novel geometrically-motivated conditioning scheme based on projecting sparse image features into the point cloud.
- Score: 60.28388617034254
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion models generating images conditionally on text, such as Dall-E 2
and Stable Diffusion, have recently made a splash far beyond the computer
vision community. Here, we tackle the related problem of generating point
clouds, both unconditionally, and conditionally with images. For the latter, we
introduce a novel geometrically-motivated conditioning scheme based on
projecting sparse image features into the point cloud and attaching them to
each individual point, at every step in the denoising process. This approach
improves geometric consistency and yields greater fidelity than current methods
relying on unstructured, global latent codes. Additionally, we show how to
apply recent continuous-time diffusion schemes. Our method performs on par or
above the state of the art on conditional and unconditional experiments on
synthetic data, while being faster, lighter, and delivering tractable
likelihoods. We show it can also scale to diverse indoor scenes.
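The conditioning idea in the abstract (project each point into the image, look up the image feature at that pixel, attach it to the point) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the pinhole intrinsics matrix `K`, the nearest-neighbour lookup, and the assumption that points are already in the camera frame with positive depth are all illustrative choices. According to the abstract, the method applies this lookup at every denoising step, so here it would be called on the current noisy point cloud.

```python
import numpy as np

def project_points(points_cam, K):
    """Project (N, 3) camera-frame points to (N, 2) pixel coordinates.

    Assumes a pinhole camera with 3x3 intrinsics K and positive depth (z > 0).
    """
    uvw = points_cam @ K.T
    return uvw[:, :2] / uvw[:, 2:3]

def sample_features(feature_map, uv):
    """Nearest-neighbour lookup of (H, W, C) features at (N, 2) pixel coords.

    uv[:, 0] is the horizontal (column) coordinate, uv[:, 1] the vertical (row).
    Points that project outside the image are clamped to the border.
    """
    h, w, _ = feature_map.shape
    cols = np.clip(np.rint(uv[:, 0]).astype(int), 0, w - 1)
    rows = np.clip(np.rint(uv[:, 1]).astype(int), 0, h - 1)
    return feature_map[rows, cols]

def condition_points(points_cam, feature_map, K):
    """Attach a per-point image feature to each 3D point: (N, 3) -> (N, 3 + C)."""
    uv = project_points(points_cam, K)
    feats = sample_features(feature_map, uv)
    return np.concatenate([points_cam, feats], axis=1)
```

In a diffusion sampler, the output of the hypothetical `condition_points` would be fed to the denoising network in place of the bare coordinates, so each point sees the image evidence at its own projected location rather than a single global latent code.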
Related papers
- MultiDiff: Consistent Novel View Synthesis from a Single Image [60.04215655745264]
MultiDiff is a novel approach for consistent novel view synthesis of scenes from a single RGB image.
Our results demonstrate that MultiDiff outperforms state-of-the-art methods on the challenging, real-world datasets RealEstate10K and ScanNet.
arXiv Detail & Related papers (2024-06-26T17:53:51Z)
- Binary Opacity Grids: Capturing Fine Geometric Detail for Mesh-Based View Synthesis [70.40950409274312]
We modify density fields to encourage them to converge towards surfaces, without compromising their ability to reconstruct thin structures.
We also develop a fusion-based meshing strategy followed by mesh simplification and appearance model fitting.
The compact meshes produced by our model can be rendered in real-time on mobile devices.
arXiv Detail & Related papers (2024-02-19T18:59:41Z)
- Improving Diffusion-Based Image Synthesis with Context Prediction [49.186366441954846]
Existing diffusion models mainly try to reconstruct the input image from a corrupted one with a pixel-wise or feature-wise constraint along spatial axes.
We propose ConPreDiff to improve diffusion-based image synthesis with context prediction.
Our ConPreDiff consistently outperforms previous methods and achieves new SOTA text-to-image generation results on MS-COCO, with a zero-shot FID score of 6.21.
arXiv Detail & Related papers (2024-01-04T01:10:56Z)
- BuilDiff: 3D Building Shape Generation using Single-Image Conditional Point Cloud Diffusion Models [15.953480573461519]
We propose a novel 3D building shape generation method exploiting point cloud diffusion models with image conditioning schemes.
We validate our framework on two newly built datasets and extensive experiments show that our method outperforms previous works in terms of building generation quality.
arXiv Detail & Related papers (2023-08-31T22:17:48Z)
- PoseDiffusion: Solving Pose Estimation via Diffusion-aided Bundle Adjustment [21.98302129015761]
We propose to formulate the Structure from Motion (SfM) problem inside a probabilistic diffusion framework.
We show that our method PoseDiffusion significantly improves over the classic SfM pipelines.
It is observed that our method can generalize across datasets without further training.
arXiv Detail & Related papers (2023-06-27T17:59:07Z)
- $PC^2$: Projection-Conditioned Point Cloud Diffusion for Single-Image 3D Reconstruction [97.06927852165464]
Reconstructing the 3D shape of an object from a single RGB image is a long-standing and highly challenging problem in computer vision.
We propose a novel method for single-image 3D reconstruction which generates a sparse point cloud via a conditional denoising diffusion process.
arXiv Detail & Related papers (2023-02-21T13:37:07Z)
- Geometry of Score Based Generative Models [2.4078030278859113]
We look at Score-based generative models (also called diffusion generative models) from a geometric perspective.
We prove that both the forward and backward processes of adding noise and generating from noise are Wasserstein gradient flows in the space of probability measures.
arXiv Detail & Related papers (2023-02-09T02:39:11Z)
- Deep Equilibrium Approaches to Diffusion Models [1.4275201654498746]
Diffusion-based generative models are extremely effective in generating high-quality images.
These models typically require long sampling chains to produce high-fidelity images.
We look at diffusion models through a different perspective, that of a (deep) equilibrium (DEQ) fixed point model.
arXiv Detail & Related papers (2022-10-23T22:02:19Z)
- Non-Homogeneous Haze Removal via Artificial Scene Prior and Bidimensional Graph Reasoning [52.07698484363237]
We propose a Non-Homogeneous Haze Removal Network (NHRN) via artificial scene prior and bidimensional graph reasoning.
Our method achieves superior performance over many state-of-the-art algorithms for both the single image dehazing and hazy image understanding tasks.
arXiv Detail & Related papers (2021-04-05T13:04:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.