Attention-based 3D Object Reconstruction from a Single Image
- URL: http://arxiv.org/abs/2008.04738v1
- Date: Tue, 11 Aug 2020 14:51:18 GMT
- Title: Attention-based 3D Object Reconstruction from a Single Image
- Authors: Andrey Salvi and Nathan Gavenski and Eduardo Pooch and Felipe
Tasoniero and Rodrigo Barros
- Abstract summary: We propose to substantially improve Occupancy Networks, a state-of-the-art method for 3D object reconstruction.
We apply the concept of self-attention within the network's encoder in order to leverage complementary input features.
We improve on the original work by 5.05% in mesh IoU, 0.83% in Normal Consistency, and more than 10x in Chamfer-L1 distance.
- Score: 0.2519906683279153
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, learning-based approaches for 3D reconstruction from 2D
images have gained popularity due to their modern applications, e.g., 3D
printing, autonomous robots, self-driving cars, virtual reality, and augmented
reality. The computer vision community has invested great effort in developing
methods to reconstruct the full 3D geometry of objects and scenes. However, to
extract image features, these methods rely on convolutional neural networks,
which are ineffective at capturing long-range dependencies. In this paper, we
propose to substantially improve Occupancy Networks, a state-of-the-art method
for 3D object reconstruction. To this end, we apply self-attention within the
network's encoder to leverage complementary input features rather than only
those based on local regions, helping the encoder extract global information.
With our approach, we improve on the original work by 5.05% in mesh IoU, 0.83%
in Normal Consistency, and more than 10x in Chamfer-L1 distance. We also
perform a qualitative study showing that our approach generates much more
consistent meshes, confirming its increased generalization power over the
current state-of-the-art.
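The self-attention idea described in the abstract can be sketched concretely. Below is a minimal, hypothetical PyTorch example of a SAGAN-style self-attention block that could be inserted between the convolutional stages of an image encoder such as the ResNet used by Occupancy Networks; the block design, placement, and hyperparameters here are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn

class SelfAttention2d(nn.Module):
    """Self-attention over all spatial positions of a CNN feature map,
    letting each location draw on features from every other location."""

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        # 1x1 convolutions project the feature map into query/key/value spaces.
        self.query = nn.Conv2d(channels, channels // reduction, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // reduction, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learned residual weight

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)  # (b, h*w, c/r)
        k = self.key(x).flatten(2)                    # (b, c/r, h*w)
        v = self.value(x).flatten(2)                  # (b, c, h*w)
        attn = torch.softmax(q @ k, dim=-1)           # (b, h*w, h*w) weights
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        # gamma starts at zero, so the block is initially an identity mapping.
        return x + self.gamma * out
```

In an Occupancy Networks-style pipeline, a block like this would sit after one or more residual stages of the encoder, so that the global image code conditioning the occupancy decoder mixes long-range context in addition to local convolutional features.

For reference, the Chamfer-L1 distance reported in the results is commonly computed as the symmetric mean nearest-neighbor distance between point sets sampled from the predicted and ground-truth surfaces; the exact evaluation protocol is an assumption here, following the Occupancy Networks benchmark:

$$\mathrm{Chamfer\text{-}}L_1(\mathcal{A}, \mathcal{B}) = \frac{1}{2|\mathcal{A}|} \sum_{a \in \mathcal{A}} \min_{b \in \mathcal{B}} \lVert a - b \rVert + \frac{1}{2|\mathcal{B}|} \sum_{b \in \mathcal{B}} \min_{a \in \mathcal{A}} \lVert a - b \rVert$$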
Related papers
- AutoDecoding Latent 3D Diffusion Models [95.7279510847827]
We present a novel approach to the generation of static and articulated 3D assets that has a 3D autodecoder at its core.
The 3D autodecoder framework embeds properties learned from the target dataset in the latent space.
We then identify the appropriate intermediate volumetric latent space, and introduce robust normalization and de-normalization operations.
arXiv Detail & Related papers (2023-07-07T17:59:14Z)
- Farm3D: Learning Articulated 3D Animals by Distilling 2D Diffusion [67.71624118802411]
We present Farm3D, a method for learning category-specific 3D reconstructors for articulated objects.
We propose a framework that uses an image generator, such as Stable Diffusion, to generate synthetic training data.
Our network can be used for analysis, including monocular reconstruction, or for synthesis, generating articulated assets for real-time applications such as video games.
arXiv Detail & Related papers (2023-04-20T17:59:34Z)
- Multiview Compressive Coding for 3D Reconstruction [77.95706553743626]
We introduce a simple framework that operates on 3D points of single objects or whole scenes.
Our model, Multiview Compressive Coding, learns to compress the input appearance and geometry to predict the 3D structure.
arXiv Detail & Related papers (2023-01-19T18:59:52Z)
- Visual Reinforcement Learning with Self-Supervised 3D Representations [15.991546692872841]
We present a unified framework for self-supervised learning of 3D representations for motor control.
Our method enjoys improved sample efficiency in simulated manipulation tasks compared to 2D representation learning methods.
arXiv Detail & Related papers (2022-10-13T17:59:55Z)
- Simple and Effective Synthesis of Indoor 3D Scenes [78.95697556834536]
We study the problem of synthesizing immersive 3D indoor scenes from one or more images.
Our aim is to generate high-resolution images and videos from novel viewpoints.
We propose an image-to-image GAN that maps directly from reprojections of incomplete point clouds to full high-resolution RGB-D images.
arXiv Detail & Related papers (2022-04-06T17:54:46Z)
- Efficient Geometry-aware 3D Generative Adversarial Networks [50.68436093869381]
Existing 3D GANs are either compute-intensive or make approximations that are not 3D-consistent.
In this work, we improve the computational efficiency and image quality of 3D GANs without overly relying on these approximations.
We introduce an expressive hybrid explicit-implicit network architecture that synthesizes not only high-resolution multi-view-consistent images in real time but also produces high-quality 3D geometry.
arXiv Detail & Related papers (2021-12-15T08:01:43Z)
- D-OccNet: Detailed 3D Reconstruction Using Cross-Domain Learning [0.0]
We extend the work on Occupancy Networks by exploiting cross-domain learning of image and point cloud domains.
Our network, the Double Occupancy Network (D-OccNet), outperforms Occupancy Networks in terms of visual quality and the details captured in the 3D reconstruction.
arXiv Detail & Related papers (2021-04-28T16:00:54Z)
- Improved Modeling of 3D Shapes with Multi-view Depth Maps [48.8309897766904]
We present a general-purpose framework for modeling 3D shapes using CNNs.
Using just a single depth image of the object, we can output a dense multi-view depth map representation of the object.
arXiv Detail & Related papers (2020-09-07T17:58:27Z)
- PerMO: Perceiving More at Once from a Single Image for Autonomous Driving [76.35684439949094]
We present a novel approach to detect, segment, and reconstruct complete textured 3D models of vehicles from a single image.
Our approach combines the strengths of deep learning and the elegance of traditional techniques.
We have integrated these algorithms with an autonomous driving system.
arXiv Detail & Related papers (2020-07-16T05:02:45Z)