SE-MD: A Single-encoder multiple-decoder deep network for point cloud generation from 2D images
- URL: http://arxiv.org/abs/2106.15325v1
- Date: Thu, 17 Jun 2021 10:48:46 GMT
- Title: SE-MD: A Single-encoder multiple-decoder deep network for point cloud generation from 2D images
- Authors: Abdul Mueed Hafiz, Rouf Ul Alam Bhat, Shabir Ahmad Parah, M. Hassaballah
- Abstract summary: 3D model generation from single 2D RGB images is a challenging and actively researched computer vision task.
Existing techniques suffer from issues such as inefficient 3D representation formats, weak 3D model generation backbones, and an inability to generate dense point clouds.
A novel 2D RGB image to point cloud conversion technique is proposed, which improves the state of the art in the field.
- Score: 2.4087148947930634
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D model generation from single 2D RGB images is a challenging and actively researched computer vision task. Various techniques using conventional network architectures have been proposed for this task. However, the body of research is limited, and existing methods suffer from several issues: inefficient 3D representation formats, weak 3D model generation backbones, an inability to generate dense point clouds, dependence on post-processing to densify the generated point clouds, and dependence on silhouettes in the RGB images. In this paper, a novel 2D RGB image to point cloud conversion technique is proposed that improves the state of the art in the field through an efficient, robust, and simple model exploiting parallelization in the network architecture. It not only uses the efficient and rich 3D representation of point clouds, but also employs a novel and robust point cloud generation backbone to address the prevalent issues. The method uses a single-encoder multiple-decoder deep network in which each decoder generates the point cloud for a certain fixed viewpoint; the viewpoint point clouds are then fused to produce a dense point cloud. Various experiments are conducted, and comparison with other state-of-the-art techniques demonstrates impressive performance gains. Code is available at
https://github.com/mueedhafiz1982/
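To make the architecture concrete, here is a minimal PyTorch sketch of the single-encoder multiple-decoder idea described in the abstract. It is an illustration, not the authors' implementation: the channel widths, the number of viewpoints (8), the points per viewpoint (1024), and the simple concatenation used for fusion are all assumptions.

```python
# Minimal sketch of a single-encoder multiple-decoder (SE-MD) network.
# NOT the authors' implementation: layer sizes, the number of decoders,
# points per viewpoint, and concatenation-based fusion are assumptions.
import torch
import torch.nn as nn

class SEMD(nn.Module):
    def __init__(self, num_views: int = 8, pts_per_view: int = 1024, latent: int = 512):
        super().__init__()
        self.pts_per_view = pts_per_view
        # Shared image encoder: RGB image -> latent code.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, latent), nn.ReLU(),
        )
        # One decoder per fixed viewpoint; each regresses an (N, 3) set of points.
        self.decoders = nn.ModuleList(
            nn.Sequential(
                nn.Linear(latent, 1024), nn.ReLU(),
                nn.Linear(1024, pts_per_view * 3),
            )
            for _ in range(num_views)
        )

    def forward(self, img: torch.Tensor) -> torch.Tensor:
        z = self.encoder(img)  # (B, latent)
        views = [d(z).view(-1, self.pts_per_view, 3) for d in self.decoders]
        # Fusion: concatenate the per-viewpoint clouds into one dense cloud.
        return torch.cat(views, dim=1)  # (B, num_views * pts_per_view, 3)

model = SEMD()
cloud = model(torch.randn(2, 3, 128, 128))  # -> torch.Size([2, 8192, 3])
```

Because the decoders branch independently from the shared latent code, they can run in parallel, which matches the parallelization the abstract refers to. A permutation-invariant objective such as Chamfer distance would typically be used to supervise the fused cloud against the ground truth.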
Related papers
- GaussianPU: A Hybrid 2D-3D Upsampling Framework for Enhancing Color Point Clouds via 3D Gaussian Splatting [11.60605616190011]
We propose a novel 2D-3D hybrid colored point cloud upsampling framework (GaussianPU) based on 3D Gaussian Splatting (3DGS) for robotic perception.
A dual scale rendered image restoration network transforms sparse point cloud renderings into dense representations.
We have made a series of enhancements to the vanilla 3DGS, enabling precise control over the number of points.
arXiv Detail & Related papers (2024-09-03T03:35:04Z)
- Point Cloud Self-supervised Learning via 3D to Multi-view Masked Autoencoder [21.73287941143304]
Multi-Modality Masked AutoEncoders (MAE) methods leverage both 2D images and 3D point clouds for pre-training.
We introduce a novel approach employing a 3D to multi-view masked autoencoder to fully harness the multi-modal attributes of 3D point clouds.
Our method outperforms state-of-the-art counterparts by a large margin in a variety of downstream tasks.
arXiv Detail & Related papers (2023-11-17T22:10:03Z)
- Pyramid Deep Fusion Network for Two-Hand Reconstruction from RGB-D Images [11.100398985633754]
We propose an end-to-end framework for recovering dense meshes for both hands.
Our framework employs ResNet50 and PointNet++ to derive features from the RGB image and the point cloud, respectively.
We also introduce a novel pyramid deep fusion network (PDFNet) to aggregate features at different scales.
arXiv Detail & Related papers (2023-07-12T09:33:21Z)
- TriVol: Point Cloud Rendering via Triple Volumes [57.305748806545026]
We present a dense yet lightweight 3D representation, named TriVol, that can be combined with NeRF to render photo-realistic images from point clouds.
Our framework has excellent generalization ability to render a category of scenes/objects without fine-tuning.
arXiv Detail & Related papers (2023-03-29T06:34:12Z)
- Ponder: Point Cloud Pre-training via Neural Rendering [93.34522605321514]
We propose a novel approach to self-supervised learning of point cloud representations via differentiable neural rendering.
The learned point-cloud representations can be easily integrated into various downstream tasks, including not only high-level tasks like 3D detection and segmentation, but also low-level tasks like 3D reconstruction and image rendering.
arXiv Detail & Related papers (2022-12-31T08:58:39Z)
- PointMCD: Boosting Deep Point Cloud Encoders via Multi-view Cross-modal Distillation for 3D Shape Recognition [55.38462937452363]
We propose a unified multi-view cross-modal distillation architecture, including a pretrained deep image encoder as the teacher and a deep point encoder as the student.
By pair-wise aligning multi-view visual and geometric descriptors, we can obtain more powerful deep point encoders without exhaustive and complicated network modifications.
arXiv Detail & Related papers (2022-07-07T07:23:20Z)
- Voint Cloud: Multi-View Point Cloud Representation for 3D Understanding [80.04281842702294]
We introduce the concept of the multi-view point cloud (Voint cloud), representing each 3D point as a set of features extracted from several viewpoints.
This novel 3D Voint cloud representation combines the compactness of 3D point cloud representation with the natural view-awareness of multi-view representation.
We deploy a Voint neural network (VointNet) with a theoretically established functional form to learn representations in the Voint space.
arXiv Detail & Related papers (2021-11-30T13:08:19Z)
- TreeGCN-ED: Encoding Point Cloud using a Tree-Structured Graph Network [24.299931323012757]
This work proposes an autoencoder-based framework to generate robust embeddings for point clouds.
We demonstrate the applicability of the proposed framework in applications such as 3D point cloud completion and single-image-based 3D reconstruction.
arXiv Detail & Related papers (2021-10-07T03:52:56Z)
- ParaNet: Deep Regular Representation for 3D Point Clouds [62.81379889095186]
ParaNet is a novel end-to-end deep learning framework for representing 3D point clouds.
It converts an irregular 3D point cloud into a regular 2D color image, named the point geometry image (PGI).
In contrast to conventional regular representation modalities based on multi-view projection and voxelization, the proposed representation is differentiable and reversible.
arXiv Detail & Related papers (2020-12-05T13:19:55Z)
- ImVoteNet: Boosting 3D Object Detection in Point Clouds with Image Votes [93.82668222075128]
We propose a 3D detection architecture called ImVoteNet for RGB-D scenes.
ImVoteNet is based on fusing 2D votes in images and 3D votes in point clouds.
We validate our model on the challenging SUN RGB-D dataset.
arXiv Detail & Related papers (2020-01-29T05:09:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.