OrthoPlanes: A Novel Representation for Better 3D-Awareness of GANs
- URL: http://arxiv.org/abs/2309.15830v1
- Date: Wed, 27 Sep 2023 17:52:39 GMT
- Title: OrthoPlanes: A Novel Representation for Better 3D-Awareness of GANs
- Authors: Honglin He, Zhuoqian Yang, Shikai Li, Bo Dai, Wayne Wu
- Abstract summary: We present a new method for generating realistic and view-consistent images with fine geometry from 2D image collections.
Our method proposes a hybrid explicit-implicit representation called OrthoPlanes, which encodes fine-grained 3D information in feature maps.
- Score: 34.00559090962427
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a new method for generating realistic and view-consistent images
with fine geometry from 2D image collections. Our method proposes a hybrid
explicit-implicit representation called \textbf{OrthoPlanes}, which encodes
fine-grained 3D information in feature maps that can be efficiently generated
by modifying 2D StyleGANs. Compared to previous representations, our method has
better scalability and expressiveness with clear and explicit information. As a
result, our method can handle more challenging view angles and synthesize
articulated objects with a high spatial degree of freedom. Experiments
demonstrate that our method achieves state-of-the-art results on FFHQ and SHHQ
datasets, both quantitatively and qualitatively. Project page:
\url{https://orthoplanes.github.io/}.
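To make the representation concrete, below is a minimal sketch of how orthoplane-style feature sampling could work, assuming the idea generalizes tri-plane feature maps to several axis-aligned planes per axis that are decoded by a small implicit MLP. The class name, plane count, point-to-plane assignment rule, and aggregation are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of orthoplane feature sampling (assumed design, not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class OrthoPlaneSampler(nn.Module):
    def __init__(self, feat_dim=32, planes_per_axis=4, resolution=64, hidden=64):
        super().__init__()
        # K feature planes per axis; in the full method these maps would be
        # produced by a modified 2D StyleGAN generator, not free parameters.
        self.planes = nn.Parameter(
            torch.randn(3, planes_per_axis, feat_dim, resolution, resolution) * 0.01
        )
        self.k = planes_per_axis
        # Small implicit decoder mapping aggregated plane features to RGB + density.
        self.decoder = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(inplace=True), nn.Linear(hidden, 4)
        )

    def forward(self, pts):
        """pts: (N, 3) points in [-1, 1]^3 -> (N, 4) color/density predictions."""
        n = pts.shape[0]
        feats = 0.0
        for axis in range(3):
            # 2D coordinates on the planes orthogonal to `axis`.
            uv = pts[:, [a for a in range(3) if a != axis]]           # (N, 2)
            # Assign each point to its nearest plane along `axis`
            # (the paper's scheme may interpolate instead; this is an assumption).
            slot = ((pts[:, axis] + 1) / 2 * (self.k - 1)).round().long().clamp(0, self.k - 1)
            grid = uv.view(1, n, 1, 2)                                 # (1, N, 1, 2)
            for k in range(self.k):
                mask = (slot == k)
                if mask.any():
                    sampled = F.grid_sample(
                        self.planes[axis, k].unsqueeze(0), grid[:, mask],
                        mode="bilinear", align_corners=True,
                    )                                                  # (1, C, M, 1)
                    out = torch.zeros(n, sampled.shape[1], device=pts.device)
                    out[mask] = sampled.squeeze(0).squeeze(-1).t()
                    feats = feats + out
        return self.decoder(feats)

# Usage: query 1024 random points in the unit cube.
sampler = OrthoPlaneSampler()
print(sampler(torch.rand(1024, 3) * 2 - 1).shape)  # torch.Size([1024, 4])
```

In the full pipeline, the plane feature maps would be generated per-sample by the modified 2D StyleGAN conditioned on a latent code and rendered with volume rendering; the sketch above only illustrates the sampling and decoding side of such a representation.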
Related papers
- 2D Triangle Splatting for Direct Differentiable Mesh Training [4.161453036693641]
2D Triangle Splatting (2DTS) is a novel method that replaces 3D Gaussian primitives with 2D triangle facelets. By incorporating a compactness parameter into the triangle primitives, we enable direct training of photorealistic meshes. Our approach produces reconstructed meshes with superior visual quality compared to existing mesh reconstruction methods.
arXiv Detail & Related papers (2025-06-23T12:26:47Z)
- PartGS: Learning Part-aware 3D Representations by Fusing 2D Gaussians and Superquadrics [16.446659867133977]
Low-level 3D representations, such as point clouds, meshes, NeRFs, and 3D Gaussians, are commonly used to represent 3D objects or scenes.
We introduce PartGS, Part-aware 3D reconstruction by a hybrid representation of 2D Gaussians and Superquadrics.
arXiv Detail & Related papers (2024-08-20T12:30:37Z)
- Open-Vocabulary 3D Semantic Segmentation with Text-to-Image Diffusion Models [57.37244894146089]
We propose Diff2Scene, which leverages frozen representations from text-image generative models, along with salient-aware and geometric-aware masks, for open-vocabulary 3D semantic segmentation and visual grounding tasks.
We show that it outperforms competitive baselines and achieves significant improvements over state-of-the-art methods.
arXiv Detail & Related papers (2024-07-18T16:20:56Z)
- GSD: View-Guided Gaussian Splatting Diffusion for 3D Reconstruction [52.04103235260539]
We present a diffusion model approach based on Gaussian Splatting representation for 3D object reconstruction from a single view.
The model learns to generate 3D objects represented by sets of GS ellipsoids.
The final reconstructed objects explicitly come with high-quality 3D structure and texture, and can be efficiently rendered in arbitrary views.
arXiv Detail & Related papers (2024-07-05T03:43:08Z)
- ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models [65.22994156658918]
We present a method that learns to generate multi-view images in a single denoising process from real-world data.
We design an autoregressive generation scheme that renders more 3D-consistent images at any viewpoint.
arXiv Detail & Related papers (2024-03-04T07:57:05Z)
- Sketch2NeRF: Multi-view Sketch-guided Text-to-3D Generation [37.93542778715304]
We present a sketch-guided text-to-3D generation framework (namely, Sketch2NeRF) to add sketch control to 3D generation.
Our method achieves state-of-the-art performance in terms of sketch similarity and text alignment.
arXiv Detail & Related papers (2024-01-25T15:49:12Z)
- 3DStyle-Diffusion: Pursuing Fine-grained Text-driven 3D Stylization with 2D Diffusion Models [102.75875255071246]
3D content creation via text-driven stylization poses a fundamental challenge for the multimedia and graphics community.
We propose a new 3DStyle-Diffusion model that triggers fine-grained stylization of 3D meshes with additional controllable appearance and geometric guidance from 2D Diffusion models.
arXiv Detail & Related papers (2023-11-09T15:51:27Z)
- Guide3D: Create 3D Avatars from Text and Image Guidance [55.71306021041785]
Guide3D is a text-and-image-guided generative model for 3D avatar generation based on diffusion models.
Our framework produces topologically and structurally correct geometry and high-resolution textures.
arXiv Detail & Related papers (2023-08-18T17:55:47Z)
- Magic123: One Image to High-Quality 3D Object Generation Using Both 2D and 3D Diffusion Priors [104.79392615848109]
We present Magic123, a two-stage coarse-to-fine approach for generating high-quality, textured 3D meshes from a single unposed image.
In the first stage, we optimize a neural radiance field to produce a coarse geometry.
In the second stage, we adopt a memory-efficient differentiable mesh representation to yield a high-resolution mesh with a visually appealing texture.
arXiv Detail & Related papers (2023-06-30T17:59:08Z)
- Improved Modeling of 3D Shapes with Multi-view Depth Maps [48.8309897766904]
We present a general-purpose framework for modeling 3D shapes using CNNs.
Using just a single depth image of the object, we can output a dense multi-view depth map representation of 3D objects.
arXiv Detail & Related papers (2020-09-07T17:58:27Z)
- Convolutional Generation of Textured 3D Meshes [34.20939983046376]
We propose a framework that can generate triangle meshes and associated high-resolution texture maps, using only 2D supervision from single-view natural images.
A key contribution of our work is the encoding of the mesh and texture as 2D representations, which are semantically aligned and can be easily modeled by a 2D convolutional GAN.
We demonstrate the efficacy of our method on Pascal3D+ Cars and CUB, both in an unconditional setting and in settings where the model is conditioned on class labels, attributes, and text.
arXiv Detail & Related papers (2020-06-13T15:23:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.