PartStickers: Generating Parts of Objects for Rapid Prototyping
- URL: http://arxiv.org/abs/2504.05508v1
- Date: Mon, 07 Apr 2025 21:07:17 GMT
- Title: PartStickers: Generating Parts of Objects for Rapid Prototyping
- Authors: Mo Zhou, Josh Myers-Dean, Danna Gurari
- Abstract summary: Prototyping often requires specific parts of objects, such as when constructing a novel creature for a video game. Existing text-to-image methods tend to only generate entire objects. We propose a novel task and method of "part sticker generation", which entails generating an isolated part of an object on a neutral background.
- Score: 25.550688383520807
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Design prototyping involves creating mockups of products or concepts to gather feedback and iterate on ideas. While prototyping often requires specific parts of objects, such as when constructing a novel creature for a video game, existing text-to-image methods tend to only generate entire objects. To address this, we propose a novel task and method of "part sticker generation", which entails generating an isolated part of an object on a neutral background. Experiments demonstrate our method outperforms state-of-the-art baselines with respect to realism and text alignment, while preserving object-level generation capabilities. We publicly share our code and models to encourage community-wide progress on this new task: https://partsticker.github.io.
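To make the task definition concrete, here is a minimal sketch of the naive baseline the abstract alludes to: prompting an off-the-shelf text-to-image diffusion model for an isolated part on a neutral background. This is not the authors' released model; the checkpoint name, prompt wording, and sampling settings are illustrative assumptions.

```python
# Minimal sketch (not the PartStickers method): ask an off-the-shelf
# Stable Diffusion checkpoint for a "part sticker", i.e. an isolated
# part of an object on a neutral background. Checkpoint, prompt, and
# sampling settings are illustrative assumptions.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # assumed publicly available checkpoint
    torch_dtype=torch.float16,
).to("cuda")

# Request a single isolated part rather than the whole object.
prompt = "the wing of a dragon, isolated on a plain neutral gray background"
image = pipe(prompt, num_inference_steps=30, guidance_scale=7.5).images[0]
image.save("dragon_wing_sticker.png")
```

In practice, prompts like this tend to yield the entire object (here, the whole dragon) rather than the isolated part; the part-specific generation evaluated in the paper comes from the authors' task-specific method and models, available at https://partsticker.github.io.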
Related papers
- SINGAPO: Single Image Controlled Generation of Articulated Parts in Objects [20.978091381109294]
We propose a method to generate articulated objects from a single image.
Our method generates an articulated object that is visually consistent with the input image.
Our experiments show that our method outperforms the state-of-the-art in articulated object creation.
arXiv Detail & Related papers (2024-10-21T20:41:32Z) - PartCraft: Crafting Creative Objects by Parts [128.30514851911218]
This paper propels creative control in generative visual AI by allowing users to "select" the visual concepts used for generation.
For the first time, users can choose visual concepts by parts for their creative endeavors.
This enables fine-grained generation that precisely captures the selected visual concepts.
arXiv Detail & Related papers (2024-07-05T15:53:04Z) - FaithFill: Faithful Inpainting for Object Completion Using a Single Reference Image [6.742568054626032]
FaithFill is a diffusion-based inpainting approach for realistic generation of missing object parts.
We demonstrate that FaithFill produces faithful generation of the object's missing parts, together with background/scene preservation, from a single reference image.
arXiv Detail & Related papers (2024-06-12T04:45:33Z) - Customizing Text-to-Image Diffusion with Object Viewpoint Control [53.621518249820745]
We introduce a new task -- enabling explicit control of the object viewpoint in the customization of text-to-image diffusion models.
This allows us to modify the custom object's properties and generate it in various background scenes via text prompts.
We propose to condition the diffusion process on the 3D object features rendered from the target viewpoint.
arXiv Detail & Related papers (2024-04-18T16:59:51Z) - Object-Driven One-Shot Fine-tuning of Text-to-Image Diffusion with Prototypical Embedding [7.893308498886083]
Our proposed method aims to address the challenges of generalizability and fidelity in an object-driven way.
A prototypical embedding is initialized from the object's appearance and its class before the diffusion model is fine-tuned.
Our method outperforms several existing works.
arXiv Detail & Related papers (2024-01-28T17:11:42Z) - LLM Blueprint: Enabling Text-to-Image Generation with Complex and Detailed Prompts [60.54912319612113]
Diffusion-based generative models have significantly advanced text-to-image generation but encounter challenges when processing lengthy and intricate text prompts.
We present a novel approach leveraging Large Language Models (LLMs) to extract critical components from text prompts.
Our evaluation on complex prompts featuring multiple objects demonstrates a substantial improvement in recall compared to baseline diffusion models.
arXiv Detail & Related papers (2023-10-16T17:57:37Z) - Multi-object Video Generation from Single Frame Layouts [84.55806837855846]
We propose a video generative framework capable of synthesizing global scenes with local objects.
Our framework is a non-trivial adaptation of image generation methods and is new to this field.
Our model has been evaluated on two widely-used video recognition benchmarks.
arXiv Detail & Related papers (2023-05-06T09:07:01Z) - Taming Encoder for Zero Fine-tuning Image Customization with Text-to-Image Diffusion Models [55.04969603431266]
This paper proposes a method for generating images of customized objects specified by users.
The method is based on a general framework that bypasses the lengthy optimization required by previous approaches.
We demonstrate through experiments that our proposed method is able to synthesize images with compelling output quality, appearance diversity, and object fidelity.
arXiv Detail & Related papers (2023-04-05T17:59:32Z) - Part-aware Prototype Network for Few-shot Semantic Segmentation [50.581647306020095]
We propose a novel few-shot semantic segmentation framework based on the prototype representation.
Our key idea is to decompose the holistic class representation into a set of part-aware prototypes.
We develop a novel graph neural network model to generate and enhance the proposed part-aware prototypes.
arXiv Detail & Related papers (2020-07-13T11:03:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.