InstructHumans: Editing Animated 3D Human Textures with Instructions
- URL: http://arxiv.org/abs/2404.04037v1
- Date: Fri, 5 Apr 2024 11:45:03 GMT
- Title: InstructHumans: Editing Animated 3D Human Textures with Instructions
- Authors: Jiayin Zhu, Linlin Yang, Angela Yao
- Abstract summary: We present InstructHumans, a novel framework for instruction-driven 3D human texture editing.
InstructHumans significantly outperforms existing 3D editing methods while remaining consistent with the initial avatar.
- Score: 40.012406098563204
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present InstructHumans, a novel framework for instruction-driven 3D human texture editing. Existing text-based editing methods use Score Distillation Sampling (SDS) to distill guidance from generative models. This work shows that naively using such scores is harmful to editing as they destroy consistency with the source avatar. Instead, we propose an alternate SDS for Editing (SDS-E) that selectively incorporates subterms of SDS across diffusion timesteps. We further enhance SDS-E with spatial smoothness regularization and gradient-based viewpoint sampling to achieve high-quality edits with sharp and high-fidelity detailing. InstructHumans significantly outperforms existing 3D editing methods, remaining consistent with the initial avatar while staying faithful to the textual instructions. Project page: https://jyzhu.top/instruct-humans.
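For readers unfamiliar with score distillation, the sketch below illustrates the general shape of an SDS-style gradient with a timestep-dependent choice of subterms, in the spirit of SDS-E. It is a minimal illustration under stated assumptions, not the paper's implementation: the function name `sds_like_grad`, the classifier-free-guidance split into a guidance term and a denoising term, the threshold `t_switch`, and all constants are illustrative choices; the actual SDS-E selection rule is defined in the paper.

```python
# Minimal sketch (not the authors' code) of an SDS-style residual with
# timestep-dependent term selection, in the spirit of SDS-E.
import torch


def sds_like_grad(eps_cond, eps_uncond, eps_gt, t, w_t,
                  cfg_scale=7.5, t_switch=0.6):
    """Compose a score-distillation residual from its subterms.

    eps_cond / eps_uncond: diffusion-model noise predictions with and without
    the text condition; eps_gt: the noise actually added to the rendering;
    t: normalized timestep in [0, 1]; w_t: SDS weighting for this timestep.
    """
    # Under classifier-free guidance, the SDS residual splits into a
    # text-guidance term and a denoising (reconstruction) term.
    guidance_term = cfg_scale * (eps_cond - eps_uncond)
    denoise_term = eps_uncond - eps_gt

    # Illustrative gating only: keep coarse text guidance at large timesteps,
    # add the detail-preserving denoising term at small timesteps.
    # (The actual SDS-E selection rule is given in the paper.)
    if t > t_switch:
        residual = guidance_term
    else:
        residual = guidance_term + denoise_term
    return w_t * residual


# Toy usage with random tensors standing in for latent noise predictions.
if __name__ == "__main__":
    shape = (1, 4, 64, 64)
    eps_c, eps_u, eps = (torch.randn(shape) for _ in range(3))
    g = sds_like_grad(eps_c, eps_u, eps, t=0.8, w_t=1.0)
    print(g.shape)
```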
Related papers
- Semantic Score Distillation Sampling for Compositional Text-to-3D Generation [28.88237230872795]
Generating high-quality 3D assets from textual descriptions remains a pivotal challenge in computer graphics and vision research.
We introduce a novel SDS approach, designed to improve the expressiveness and accuracy of compositional text-to-3D generation.
Our approach integrates new semantic embeddings that maintain consistency across different rendering views.
By leveraging explicit semantic guidance, our method unlocks the compositional capabilities of existing pre-trained diffusion models.
arXiv Detail & Related papers (2024-10-11T17:26:00Z) - Revealing Directions for Text-guided 3D Face Editing [52.85632020601518]
3D face editing is a significant task in multimedia, aimed at the manipulation of 3D face models across various control signals.
We present Face Clan, a text-general approach for generating and manipulating 3D faces based on arbitrary attribute descriptions.
Our method offers a precisely controllable manipulation method, allowing users to intuitively customize regions of interest with the text description.
arXiv Detail & Related papers (2024-10-07T12:04:39Z) - Preserving Identity with Variational Score for General-purpose 3D Editing [48.314327790451856]
Piva is a novel optimization-based method, built on diffusion models, for editing images and 3D models.
We pinpoint limitations of existing 2D and 3D editing approaches, which cause detail loss and oversaturation.
We propose an additional score distillation term that enforces identity preservation.
arXiv Detail & Related papers (2024-06-13T09:32:40Z) - GSEdit: Efficient Text-Guided Editing of 3D Objects via Gaussian Splatting [10.527349772993796]
We present GSEdit, a pipeline for text-guided 3D object editing based on Gaussian Splatting models.
Our method enables the editing of the style and appearance of 3D objects without altering their main details, all in a matter of minutes on consumer hardware.
arXiv Detail & Related papers (2024-03-08T08:42:23Z) - PaintHuman: Towards High-fidelity Text-to-3D Human Texturing via Denoised Score Distillation [89.09455618184239]
Recent advances in text-to-3D human generation have been groundbreaking.
We propose a model called PaintHuman to address the challenges from two aspects.
We use the depth map as guidance to ensure realistic, semantically aligned textures.
arXiv Detail & Related papers (2023-10-14T00:37:16Z) - Directional Texture Editing for 3D Models [51.31499400557996]
ITEM3D is designed for automatic 3D object editing according to text instructions.
Leveraging diffusion models and differentiable rendering, ITEM3D takes rendered images as the bridge between text and the 3D representation.
arXiv Detail & Related papers (2023-09-26T12:01:13Z) - Delta Denoising Score [51.98288453616375]
We introduce Delta Denoising Score (DDS), a novel scoring function for text-based image editing.
It guides minimal modifications of an input image towards the content described in a target prompt.
arXiv Detail & Related papers (2023-04-14T12:22:41Z)
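As a companion note to the Delta Denoising Score entry above, the following is a minimal sketch of its core idea, not the DDS authors' code: the editing direction is the difference between two noise-prediction residuals, one for the noised edited image under the target prompt and one for the noised source image under its reference prompt, so that prompt-independent components cancel and only the edit-relevant signal remains. The function name and weighting argument are illustrative assumptions.

```python
# Illustrative sketch of the Delta Denoising Score (DDS) direction.
import torch


def dds_residual(eps_edit_target: torch.Tensor,
                 eps_source_ref: torch.Tensor,
                 w_t: float = 1.0) -> torch.Tensor:
    """eps_edit_target: noise prediction on the noised edited image,
    conditioned on the target prompt. eps_source_ref: noise prediction on
    the noised source image, conditioned on the source (reference) prompt,
    at the same timestep with the same sampled noise."""
    # Subtracting the reference branch cancels the prompt-independent
    # component that would otherwise blur or over-saturate the edit.
    return w_t * (eps_edit_target - eps_source_ref)


if __name__ == "__main__":
    shape = (1, 4, 64, 64)
    delta = dds_residual(torch.randn(shape), torch.randn(shape))
    print(delta.shape)
```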
This list is automatically generated from the titles and abstracts of the papers in this site.