SDFit: 3D Object Pose and Shape by Fitting a Morphable SDF to a Single Image
- URL: http://arxiv.org/abs/2409.16178v1
- Date: Tue, 24 Sep 2024 15:22:04 GMT
- Title: SDFit: 3D Object Pose and Shape by Fitting a Morphable SDF to a Single Image
- Authors: Dimitrije Antić, Sai Kumar Dwivedi, Shashank Tripathi, Theo Gevers, Dimitrios Tzionas
- Abstract summary: We focus on recovering 3D object pose and shape from single images.
Recent work relies mostly on learning from finite datasets, so it struggles to generalize.
We tackle these limitations with a novel framework, called SDFit.
- Score: 19.704369289729897
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We focus on recovering 3D object pose and shape from single images. This is highly challenging due to strong (self-)occlusions, depth ambiguities, the enormous shape variance, and the lack of 3D ground truth for natural images. Recent work relies mostly on learning from finite datasets, so it struggles to generalize, while it focuses mostly on the shape itself, largely ignoring the alignment with pixels. Moreover, it performs feed-forward inference, so it cannot refine its estimates. We tackle these limitations with a novel framework, called SDFit. To this end, we make three key observations: (1) learned signed-distance-function (SDF) models act as a strong morphable shape prior; (2) foundational models embed 2D images and 3D shapes in a joint space; and (3) such models also infer rich features from images. SDFit exploits these as follows. First, it uses a category-level morphable SDF (mSDF) model, called DIT, to generate 3D shape hypotheses. This mSDF is initialized by querying OpenShape's latent space, conditioned on the input image. Then, it computes 2D-to-3D correspondences by extracting and matching features from the image and the mSDF. Last, it fits the mSDF to the image in a render-and-compare fashion, iteratively refining its estimates. We evaluate SDFit on the Pix3D and Pascal3D+ datasets of real-world images. SDFit performs roughly on par with state-of-the-art learned methods but, uniquely, requires no re-training. Thus, SDFit is promising for generalizing in the wild, paving the way for future research. Code will be released.
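The abstract describes iterative refinement rather than one-shot feed-forward prediction. Below is a minimal, hypothetical sketch of such a fitting loop in PyTorch: a toy analytic sphere SDF stands in for the morphable DIT model, the latent starts at zero instead of being initialized via OpenShape, and in place of a full differentiable renderer the "compare" step penalizes the absolute SDF values of lifted 2D-to-3D correspondence points, which should lie on the zero level set of the posed shape. Every name and the whole setup here are illustrative assumptions, not the authors' code.

```python
# Minimal sketch of an SDFit-style refinement loop (illustrative only).
# Assumptions: `msdf` is a stand-in for the DIT morphable SDF, the latent is
# initialized to zero instead of via OpenShape, and `target_points` plays the
# role of lifted 2D-to-3D correspondences from foundation-model features.
import torch

def msdf(points: torch.Tensor, latent: torch.Tensor) -> torch.Tensor:
    """Toy morphable SDF: a sphere whose radius is driven by the latent."""
    radius = 0.5 + 0.1 * torch.tanh(latent)
    return points.norm(dim=-1) - radius  # signed distance to the surface

def fit(target_points: torch.Tensor, steps: int = 200):
    # Estimates refined iteratively: translation, log-scale, shape latent.
    t = torch.zeros(3, requires_grad=True)
    log_s = torch.zeros(1, requires_grad=True)
    z = torch.zeros(1, requires_grad=True)
    opt = torch.optim.Adam([t, log_s, z], lr=0.05)
    for _ in range(steps):
        opt.zero_grad()
        # "Compare": correspondence points should lie on the zero level set
        # of the posed mSDF, so penalize their absolute SDF values.
        local = (target_points - t) / log_s.exp()
        loss = msdf(local, z).abs().mean()
        loss.backward()
        opt.step()
    return t.detach(), log_s.exp().detach(), z.detach()

# Toy usage: points on a sphere of radius 0.55, shifted by (0.1, 0, 0).
pts = torch.randn(512, 3)
pts = 0.55 * pts / pts.norm(dim=-1, keepdim=True)
pts = pts + torch.tensor([0.1, 0.0, 0.0])
print(fit(pts))
```

Per the abstract, the actual method additionally compares renderings of the mSDF against the image, so the point-based loss above is only a stand-in for that render-and-compare objective.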
Related papers
- HOISDF: Constraining 3D Hand-Object Pose Estimation with Global Signed Distance Fields [96.04424738803667]
HOISDF is an SDF-guided hand-object pose estimation network.
It exploits hand and object SDFs to provide a global, implicit representation over the complete reconstruction volume.
We show that HOISDF achieves state-of-the-art results on hand-object pose estimation benchmarks.
arXiv Detail & Related papers (2024-02-26T22:48:37Z)
- WildFusion: Learning 3D-Aware Latent Diffusion Models in View Space [77.92350895927922]
We propose WildFusion, a new approach to 3D-aware image synthesis based on latent diffusion models (LDMs).
Our 3D-aware LDM is trained without any direct supervision from multiview images or 3D geometry.
This opens up promising research avenues for scalable 3D-aware image synthesis and 3D content creation from in-the-wild image data.
arXiv Detail & Related papers (2023-11-22T18:25:51Z)
- DDF-HO: Hand-Held Object Reconstruction via Conditional Directed Distance Field [82.81337273685176]
DDF-HO is a novel approach leveraging Directed Distance Field (DDF) as the shape representation.
We randomly sample multiple rays and collect local-to-global geometric features for them by introducing a novel 2D ray-based feature aggregation scheme.
Experiments on synthetic and real-world datasets demonstrate that DDF-HO consistently outperforms all baseline methods by a large margin.
arXiv Detail & Related papers (2023-08-16T09:06:32Z)
- ARTIC3D: Learning Robust Articulated 3D Shapes from Noisy Web Image Collections [71.46546520120162]
Estimating 3D articulated shapes like animal bodies from monocular images is inherently challenging.
We propose ARTIC3D, a self-supervised framework that reconstructs per-instance 3D shapes from sparse in-the-wild image collections.
We produce realistic animations by fine-tuning the rendered shape and texture under rigid part transformations.
arXiv Detail & Related papers (2023-06-07T17:47:50Z)
- SDF-3DGAN: A 3D Object Generative Method Based on Implicit Signed Distance Function [10.199463450025391]
We develop a new method, SDF-3DGAN, for 3D object generation and 3D-aware image tasks.
We apply SDFs for a higher-quality representation of 3D objects in space and design a new SDF neural renderer, which has higher efficiency and higher accuracy.
arXiv Detail & Related papers (2023-03-13T02:48:54Z)
- RAFaRe: Learning Robust and Accurate Non-parametric 3D Face Reconstruction from Pseudo 2D&3D Pairs [13.11105614044699]
We propose a robust and accurate non-parametric method for single-view 3D face reconstruction (SVFR).
A large-scale pseudo 2D&3D dataset is created by first rendering detailed 3D faces, then swapping the faces in in-the-wild images with the rendered faces.
Our model outperforms previous methods on FaceScape-wild/lab and MICC benchmarks.
arXiv Detail & Related papers (2023-02-10T19:40:26Z)
- Diffusion-SDF: Text-to-Shape via Voxelized Diffusion [90.85011923436593]
We propose a new generative 3D modeling framework called Diffusion-SDF for the challenging task of text-to-shape synthesis.
We show that Diffusion-SDF generates both higher-quality and more diverse 3D shapes that conform well to the given text descriptions.
arXiv Detail & Related papers (2022-12-06T19:46:47Z)
- NeuralODF: Learning Omnidirectional Distance Fields for 3D Shape Representation [7.208066405543874]
In visual computing, 3D geometry is represented in many different forms, including meshes, point clouds, voxel grids, level sets, and depth images.
We propose Omnidirectional Distance Fields (ODFs), a new 3D shape representation that encodes geometry by storing the depth to the object's surface from any 3D position in any viewing direction.
arXiv Detail & Related papers (2022-06-12T20:59:26Z)
- FIRe: Fast Inverse Rendering using Directional and Signed Distance Functions [97.5540646069663]
We introduce a novel neural scene representation that we call the directional distance function (DDF).
Our DDF is defined on the unit sphere and predicts the distance to the surface along any given direction (a toy comparison of these distance-field flavors follows this list).
Based on our DDF, we present a novel fast algorithm (FIRe) to reconstruct 3D shapes given a posed depth map.
arXiv Detail & Related papers (2022-03-30T13:24:04Z)
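Several entries above revolve around learned distance fields: signed distance functions (SDFit, HOISDF, SDF-3DGAN, Diffusion-SDF), ray-based directed/directional distance fields (DDF-HO, FIRe), and omnidirectional distance fields queried from any position and direction (NeuralODF). As a purely illustrative reference, the toy closed-form example below contrasts the two query types on a unit sphere; the papers learn these fields with neural networks, and all names here are ours.

```python
# Toy closed-form distance-field queries for a unit sphere (illustrative
# only; the papers above learn these fields with neural networks).
import math

def sdf(p):
    """Signed distance: negative inside the shape, zero on the surface."""
    return math.dist(p, (0.0, 0.0, 0.0)) - 1.0

def directional_distance(p, d):
    """Depth to the surface from point p along unit direction d, or None
    if the ray misses the sphere (the DDF/ODF-style query)."""
    # Smallest non-negative root of |p + t*d|^2 = 1, i.e. t^2 + 2bt + c = 0.
    b = sum(pi * di for pi, di in zip(p, d))
    c = sum(pi * pi for pi in p) - 1.0
    disc = b * b - c
    if disc < 0.0:
        return None  # ray misses the surface entirely
    t = -b - math.sqrt(disc)
    if t < 0.0:
        t = -b + math.sqrt(disc)
    return t if t >= 0.0 else None

print(sdf((2.0, 0.0, 0.0)))                                      # 1.0: one unit outside
print(directional_distance((2.0, 0.0, 0.0), (-1.0, 0.0, 0.0)))   # 1.0: ray hits at x=1
```

The ray-based query returns a depth only when the ray actually hits the surface (signaled by None here), a case any directional-field representation has to handle, whereas the signed distance is defined everywhere in space.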
This list is automatically generated from the titles and abstracts of the papers on this site.