TextDeformer: Geometry Manipulation using Text Guidance
- URL: http://arxiv.org/abs/2304.13348v1
- Date: Wed, 26 Apr 2023 07:38:41 GMT
- Title: TextDeformer: Geometry Manipulation using Text Guidance
- Authors: William Gao, Noam Aigerman, Thibault Groueix, Vladimir G. Kim, Rana Hanocka
- Abstract summary: We present a technique for producing a deformation of an input triangle mesh guided solely by a text prompt.
Our framework relies on differentiable rendering to connect geometry to powerful pre-trained image encoders, such as CLIP and DINO.
Since gradients from these encoders are noisy and inconsistent, we opt to represent our mesh deformation through Jacobians, which update deformations in a global, smooth manner.
- Score: 37.02412892926677
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a technique for automatically producing a deformation of an input
triangle mesh, guided solely by a text prompt. Our framework is capable of
deformations that produce both large, low-frequency shape changes, and small
high-frequency details. Our framework relies on differentiable rendering to
connect geometry to powerful pre-trained image encoders, such as CLIP and DINO.
Notably, updating mesh geometry by taking gradient steps through differentiable
rendering is notoriously challenging, commonly resulting in deformed meshes
with significant artifacts. These difficulties are amplified by noisy and
inconsistent gradients from CLIP. To overcome this limitation, we opt to
represent our mesh deformation through Jacobians, which update deformations in
a global, smooth manner (rather than through local, sub-optimal steps). Our key
observation is that Jacobians are a representation that favors smoother, large
deformations, leading to a global relation between vertices and pixels, and
avoiding localized noisy gradients. Additionally, to ensure the resulting shape
is coherent from all 3D viewpoints, we encourage the deep features computed on
the 2D encoding of the rendering to be consistent for a given vertex from all
viewpoints. We demonstrate that our method is capable of smoothly deforming a
wide variety of source meshes under many target text prompts, achieving both
large modifications, e.g., to the body proportions of animals, and the
addition of fine semantic details, such as shoe laces on an army boot and the
fine details of a face.
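
The Jacobian representation described above can be made concrete: instead of
optimizing vertex positions directly, one optimizes a 3x3 linear map per face
and recovers the vertices through a differentiable Poisson-style least-squares
solve, so every vertex update is globally coupled. The sketch below is a
minimal PyTorch illustration under stated assumptions: the toy quad mesh, the
dense normal-equations solve, and the stretch objective are stand-ins (the
paper instead scores differentiable renders with CLIP/DINO features), not the
authors' implementation.

```python
import torch

# Toy mesh: a unit quad split into two triangles (stand-in for a real mesh).
V0 = torch.tensor([[0., 0., 0.],
                   [1., 0., 0.],
                   [1., 1., 0.],
                   [0., 1., 0.]])
F = torch.tensor([[0, 1, 2], [0, 2, 3]])
nV, nF = V0.shape[0], F.shape[0]

# Learnable per-face Jacobians, initialized to the identity (no deformation).
J = torch.eye(3).repeat(nF, 1, 1).requires_grad_(True)

# Fixed edge-difference operator D: for each face, two rows selecting
# (v1 - v0) and (v2 - v0). Dense for clarity; real meshes need a sparse solve.
rows = []
for f in F:
    for k in (1, 2):
        r = torch.zeros(nV)
        r[f[k]], r[f[0]] = 1.0, -1.0
        rows.append(r)
D = torch.stack(rows)             # (2*nF, nV)
E0 = (D @ V0).view(nF, 2, 3)      # rest-pose edge vectors, per face

def poisson_solve(J):
    """Recover vertices whose edges best match the Jacobian-mapped rest edges."""
    target = torch.einsum('fij,fkj->fki', J, E0).reshape(2 * nF, 3)
    # Normal equations with a tiny ridge that pins the translational null space.
    A = D.T @ D + 1e-6 * torch.eye(nV)
    return torch.linalg.solve(A, D.T @ target)

# Hypothetical stand-in objective: stretch the mesh to width 2 along x.
# TextDeformer would instead render the mesh differentiably and compare the
# images against the text prompt with a pre-trained encoder.
opt = torch.optim.Adam([J], lr=1e-2)
for step in range(200):
    V = poisson_solve(J)
    loss = (V[:, 0].max() - V[:, 0].min() - 2.0) ** 2
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Because the loss reaches the Jacobians only through the global solve, a pixel
gradient in one region nudges the whole deformation smoothly instead of
spiking individual vertices, which is the behavior the abstract attributes to
this representation.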
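
The multi-view consistency term can be sketched in the same spirit: sample
each view's deep feature map at every vertex's projected image location and
penalize how much that feature varies across views. The helper below is a
hypothetical illustration; the name, tensor shapes, and the omission of
visibility masking are assumptions, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def view_consistency_loss(feats: torch.Tensor, uv: torch.Tensor) -> torch.Tensor:
    """feats: (n_views, C, H, W) encoder features of each rendered view;
    uv: (n_views, n_verts, 2) projected vertex positions in [-1, 1] coords.
    Returns the mean per-vertex feature variance across views. A full
    implementation would also mask vertices that are occluded in a view."""
    n_views, n_verts, _ = uv.shape
    grid = uv.view(n_views, n_verts, 1, 2)                      # sampling grid
    sampled = F.grid_sample(feats, grid, align_corners=False)   # (n_views, C, n_verts, 1)
    sampled = sampled.squeeze(-1).permute(0, 2, 1)              # (n_views, n_verts, C)
    return sampled.var(dim=0).mean()
```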
Related papers
- ShapeFusion: A 3D diffusion model for localized shape editing [37.82690898932135] (2024-03-28)
We propose an effective diffusion masking training strategy that, by design, facilitates localized manipulation of any shape region.
Our method leads to more interpretable shape manipulations than current state-of-the-art methods that rely on latent-code manipulation.
- T-Pixel2Mesh: Combining Global and Local Transformer for 3D Mesh Generation from a Single Image [84.08705684778666] (2024-03-20)
We propose a novel Transformer-boosted architecture, named T-Pixel2Mesh, inspired by the coarse-to-fine approach of P2M.
Specifically, we use a global Transformer to control the holistic shape and a local Transformer to refine the local geometry details.
Our experiments on ShapeNet demonstrate state-of-the-art performance, while results on real-world data show its generalization capability.
- HeadEvolver: Text to Head Avatars via Expressive and Attribute-Preserving Mesh Deformation [17.590555698266346] (2024-03-14)
We present HeadEvolver, a novel framework to generate stylized head avatars from text guidance.
HeadEvolver uses locally learnable mesh deformation from a template head mesh, producing high-quality digital assets for detail-preserving editing and animation.
- Robust 3D Tracking with Quality-Aware Shape Completion [67.9748164949519] (2023-12-17)
We propose a synthetic target representation for robust 3D tracking: dense, complete point clouds, obtained via shape completion, that precisely depict the target shape.
Specifically, we design a voxelized 3D tracking framework with shape completion, in which a quality-aware completion mechanism alleviates the adverse effect of noisy historical predictions.
- Deformation-Guided Unsupervised Non-Rigid Shape Matching [7.327850781641328] (2023-11-27)
We present an unsupervised, data-driven approach for non-rigid shape matching.
Our approach is particularly robust when matching shapes digitized with 3D scanners.
- DragD3D: Realistic Mesh Editing with Rigidity Control Driven by 2D Diffusion Priors [10.355568895429588] (2023-10-06)
Direct mesh editing and deformation are key components of the geometric modeling and animation pipeline, yet classical regularizers are not aware of the global context and semantics of the object.
We show that our deformations can be controlled to yield realistic shape changes that are aware of the global context.
- 3Deformer: A Common Framework for Image-Guided Mesh Deformation [27.732389685912214] (2023-07-19)
Given a source 3D mesh with semantic materials and a user-specified semantic image, 3Deformer can accurately edit the source mesh.
Our 3Deformer produces impressive results and reaches the state-of-the-art level.
- Self-Supervised Geometry-Aware Encoder for Style-Based 3D GAN Inversion [115.82306502822412] (2022-12-14)
StyleGAN has achieved great progress in 2D face reconstruction and semantic editing via image inversion and latent editing, but a corresponding generic 3D GAN inversion framework is still missing, limiting 3D face reconstruction and semantic editing applications.
We study the challenging problem of 3D GAN inversion, where a latent code is predicted from a single face image to faithfully recover its 3D shape and detailed textures.
- Pop-Out Motion: 3D-Aware Image Deformation via Learning the Shape Laplacian [58.704089101826774] (2022-03-29)
We present a 3D-aware image deformation method with minimal restrictions on shape category and deformation type.
We take a supervised learning-based approach to predict the shape Laplacian of the underlying volume of a 3D reconstruction represented as a point cloud.
In our experiments, we deform 2D character and clothed-human images.
- Learning Skeletal Articulations with Neural Blend Shapes [57.879030623284216] (2021-05-06)
We develop a neural technique for articulating 3D characters using enveloping with a pre-defined skeletal structure.
Our framework learns to rig and skin characters with the same articulation structure, and we propose neural blend shapes that improve deformation quality in joint regions.
This list is automatically generated from the titles and abstracts of the papers in this site.