ShapeClipper: Scalable 3D Shape Learning from Single-View Images via
Geometric and CLIP-based Consistency
- URL: http://arxiv.org/abs/2304.06247v1
- Date: Thu, 13 Apr 2023 03:53:12 GMT
- Title: ShapeClipper: Scalable 3D Shape Learning from Single-View Images via
Geometric and CLIP-based Consistency
- Authors: Zixuan Huang, Varun Jampani, Anh Thai, Yuanzhen Li, Stefan Stojanov,
James M. Rehg
- Abstract summary: We present ShapeClipper, a novel method that reconstructs 3D object shapes from real-world single-view RGB images.
ShapeClipper learns shape reconstruction from a set of single-view segmented images.
We evaluate our method over three challenging real-world datasets.
- Score: 39.7058456335011
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present ShapeClipper, a novel method that reconstructs 3D object shapes
from real-world single-view RGB images. Instead of relying on laborious 3D,
multi-view or camera pose annotation, ShapeClipper learns shape reconstruction
from a set of single-view segmented images. The key idea is to facilitate shape
learning via CLIP-based shape consistency, where we encourage objects with
similar CLIP encodings to share similar shapes. We also leverage off-the-shelf
normals as an additional geometric constraint so the model can learn better
bottom-up reasoning of detailed surface geometry. These two novel consistency
constraints, when used to regularize our model, improve its ability to learn
both global shape structure and local geometric details. We evaluate our method
over three challenging real-world datasets, Pix3D, Pascal3D+, and OpenImages,
where we achieve superior performance over state-of-the-art methods.
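To make the two consistency constraints concrete, here is a minimal PyTorch-style sketch of how such losses could look. This is an illustration of the abstract's description only, not the authors' implementation: the tensor names (shape_codes, clip_embeds, pred_normals, est_normals, mask) and the nearest-neighbor matching scheme are assumptions.

```python
# Hedged sketch of the two consistency losses described in the abstract.
# All names and the neighbor-matching scheme are illustrative assumptions.
import torch
import torch.nn.functional as F

def clip_shape_consistency(shape_codes, clip_embeds):
    """Encourage objects with similar CLIP encodings to share similar shapes.

    shape_codes: (B, Ds) latent shape codes from the reconstruction network.
    clip_embeds: (B, Dc) CLIP image embeddings of the same batch.
    """
    clip_embeds = F.normalize(clip_embeds, dim=-1)
    sim = clip_embeds @ clip_embeds.t()        # (B, B) cosine similarities
    sim.fill_diagonal_(-float("inf"))          # ignore self-similarity
    nn_idx = sim.argmax(dim=-1)                # nearest CLIP neighbor per object
    # Pull each shape code toward its CLIP neighbor's (stop-gradient target).
    return F.mse_loss(shape_codes, shape_codes[nn_idx].detach())

def normal_consistency(pred_normals, est_normals, mask):
    """Align rendered normals with off-the-shelf estimated normals.

    pred_normals, est_normals: (B, 3, H, W) unit-length normal maps.
    mask: (B, 1, H, W) foreground segmentation mask.
    """
    cos = (pred_normals * est_normals).sum(dim=1, keepdim=True)  # per-pixel cosine
    return ((1.0 - cos) * mask).sum() / mask.sum().clamp(min=1.0)
```

In a full training loop, such terms would typically be added to a standard reconstruction loss with scalar weights, e.g. loss = recon + lambda_clip * clip_shape_consistency(...) + lambda_n * normal_consistency(...); the weighting scheme here is hypothetical.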
Related papers
- DECOLLAGE: 3D Detailization by Controllable, Localized, and Learned Geometry Enhancement [38.719572669042925]
We present a 3D modeling method which enables end-users to refine or detailize 3D shapes using machine learning.
We show that our ability to localize details enables novel interactive creative applications.
arXiv Detail & Related papers (2024-09-10T00:51:49Z)
- ShaDDR: Interactive Example-Based Geometry and Texture Generation via 3D Shape Detailization and Differentiable Rendering [24.622120688131616]
ShaDDR is an example-based deep generative neural network which produces a high-resolution textured 3D shape.
Our method learns to detailize the geometry via multi-resolution voxel upsampling and generate textures on voxel surfaces.
The generated shape preserves the overall structure of the input coarse voxel model.
arXiv Detail & Related papers (2023-06-08T02:35:30Z)
- Geo-SIC: Learning Deformable Geometric Shapes in Deep Image Classifiers [8.781861951759948]
This paper presents Geo-SIC, the first deep learning model to learn deformable shapes in a deformation space for improved image classification performance.
We introduce a newly designed framework that simultaneously derives features from both image and latent shape spaces with large intra-class variations.
We develop a boosted classification network, equipped with an unsupervised learning of geometric shape representations.
arXiv Detail & Related papers (2022-10-25T01:55:17Z)
- ISS: Image as Stepping Stone for Text-Guided 3D Shape Generation [91.37036638939622]
This paper presents a new framework called Image as Stepping Stone (ISS) for text-guided 3D shape generation, introducing 2D images as a stepping stone to connect the text and shape modalities.
Our key contribution is a two-stage feature-space-alignment approach that maps CLIP features to shapes.
We formulate a text-guided shape stylization module to dress up the output shapes with novel textures.
arXiv Detail & Related papers (2022-09-09T06:54:21Z)
- Share With Thy Neighbors: Single-View Reconstruction by Cross-Instance Consistency [59.427074701985795]
Single-view reconstruction methods typically rely on viewpoint annotations, silhouettes, the absence of background, multiple views of the same instance, a template shape, or symmetry.
We avoid all of these supervisions and hypotheses by leveraging explicitly the consistency between images of different object instances.
Our main contributions are two approaches to leverage cross-instance consistency: (i) progressive conditioning, a training strategy that gradually specializes the model from category to instance level in a curriculum learning fashion; (ii) swap reconstruction, a loss enforcing consistency between instances having similar shape or texture (a minimal sketch of this idea follows the list below).
arXiv Detail & Related papers (2022-04-21T17:47:35Z)
- Learning Canonical 3D Object Representation for Fine-Grained Recognition [77.33501114409036]
We propose a novel framework for fine-grained object recognition that learns to recover object variation in 3D space from a single image.
We represent an object as a composition of 3D shape and its appearance, while eliminating the effect of camera viewpoint.
By incorporating 3D shape and appearance jointly in a deep representation, our method learns the discriminative representation of the object.
arXiv Detail & Related papers (2021-08-10T12:19:34Z)
- Discovering 3D Parts from Image Collections [98.16987919686709]
We tackle the problem of 3D part discovery from only 2D image collections.
Instead of relying on manually annotated parts for supervision, we propose a self-supervised approach.
Our key insight is to learn a novel part shape prior that allows each part to fit an object shape faithfully while constrained to have simple geometry.
arXiv Detail & Related papers (2021-07-28T20:29:16Z)
- Self-Supervised 2D Image to 3D Shape Translation with Disentangled Representations [92.89846887298852]
We present a framework to translate between 2D image views and 3D object shapes.
We propose SIST, a Self-supervised Image to Shape Translation framework.
arXiv Detail & Related papers (2020-03-22T22:44:02Z)
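The swap-reconstruction idea referenced in the Share With Thy Neighbors entry above can be sketched in a few lines. Everything below (the encoder/decoder interfaces and the nearest-neighbor selection) is a hypothetical illustration of that one-sentence description, not the paper's code.

```python
# Hypothetical sketch of swap reconstruction: two instances whose shape
# codes are close exchange those codes and must still reconstruct well.
import torch
import torch.nn.functional as F

def swap_reconstruction_loss(images, encoder, decoder):
    """images: (B, 3, H, W). encoder/decoder are stand-in modules:
    encoder(images) -> (shape_codes, texture_codes); decoder renders images."""
    shape, texture = encoder(images)
    dist = torch.cdist(shape, shape)        # (B, B) pairwise shape-code distances
    dist.fill_diagonal_(float("inf"))       # exclude self-matches
    nn_idx = dist.argmin(dim=-1)            # most similar other instance
    # Swap shape codes with the nearest neighbor, keep each instance's texture.
    recon = decoder(shape[nn_idx], texture)
    return F.l1_loss(recon, images)
```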