Text-guided Image-and-Shape Editing and Generation: A Short Survey
- URL: http://arxiv.org/abs/2304.09244v1
- Date: Tue, 18 Apr 2023 19:11:36 GMT
- Title: Text-guided Image-and-Shape Editing and Generation: A Short Survey
- Authors: Cheng-Kang Ted Chao and Yotam Gingold
- Abstract summary: With recent advances in machine learning, artists' editing intents can even be driven by text.
In this short survey, we provide an overview of over 50 papers on state-of-the-art (text-guided) image-and-shape generation techniques.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image and shape editing are ubiquitous in digital artwork. Graphics
algorithms help artists and designers achieve their desired editing intents
without tedious manual retouching. With recent advances in machine learning,
artists' editing intents can even be driven by text, using a variety of
well-trained neural networks. These networks have seen extensive success in
tasks such as generating photorealistic images, artworks, and human poses,
stylizing meshes from text, and auto-completion given image and shape priors.
In this short survey, we provide an overview of over 50 papers on
state-of-the-art (text-guided) image-and-shape generation techniques. We begin
with an overview of recent editing algorithms in the introduction. Then, we
provide a comprehensive review of text-guided editing techniques for 2D and 3D
independently, where each sub-section begins with a brief background
introduction. We also contextualize editing algorithms under recent implicit
neural representations. Finally, we conclude the survey with a discussion of
existing methods and potential research ideas.
Related papers
- A Survey of Multimodal-Guided Image Editing with Text-to-Image Diffusion Models [117.77807994397784]
Image editing aims to modify a given synthetic or real image to meet specific user requirements.
Significant recent advances in this field are based on the development of text-to-image (T2I) diffusion models.
T2I-based image editing methods significantly enhance editing performance and offer a user-friendly interface for modifying content guided by multimodal inputs.
arXiv Detail & Related papers (2024-06-20T17:58:52Z)
- Fashion Style Editing with Generative Human Prior [9.854813629782681]
In this work, we aim to manipulate the fashion style of human imagery using text descriptions.
Specifically, we leverage a generative human prior and achieve fashion style editing by navigating its learned latent space.
Our framework successfully projects abstract fashion concepts onto human images and introduces exciting new applications to the field.
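The entry above describes editing by navigating the learned latent space of a generative human prior under text guidance. As a hedged, StyleCLIP-style illustration of that general idea (not this paper's implementation), the sketch below optimizes a latent code against a CLIP text-similarity loss; the generator is a toy stub standing in for a pretrained prior, and all names here are assumptions.

```python
# A rough sketch of text-guided latent-space navigation (in the spirit of the
# entry above, not its actual method). The generator is a placeholder stub for
# a pretrained generative human prior.
import torch
import clip  # OpenAI CLIP: https://github.com/openai/CLIP

device = "cuda" if torch.cuda.is_available() else "cpu"
clip_model, _ = clip.load("ViT-B/32", device=device)

class StubGenerator(torch.nn.Module):
    """Placeholder prior: maps a latent code to a 3x224x224 image in [0, 1]."""
    def __init__(self, latent_dim=512):
        super().__init__()
        self.fc = torch.nn.Linear(latent_dim, 3 * 224 * 224)

    def forward(self, w):
        return torch.sigmoid(self.fc(w)).view(-1, 3, 224, 224)

G = StubGenerator().to(device)

# CLIP input normalization constants.
mean = torch.tensor([0.48145466, 0.4578275, 0.40821073], device=device).view(1, 3, 1, 1)
std = torch.tensor([0.26862954, 0.26130258, 0.27577711], device=device).view(1, 3, 1, 1)

tokens = clip.tokenize(["a person wearing a leather jacket"]).to(device)
with torch.no_grad():
    text_feat = clip_model.encode_text(tokens).float()
    text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)

# Start from some latent (e.g., the inversion of a real photo) and nudge it so
# the generated image matches the text while staying close to the start.
w_init = torch.randn(1, 512, device=device)
w = w_init.clone().requires_grad_(True)
opt = torch.optim.Adam([w], lr=0.05)

for _ in range(100):
    img = G(w)
    img_feat = clip_model.encode_image((img - mean) / std).float()
    img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
    clip_loss = 1.0 - (img_feat * text_feat).sum(dim=-1).mean()  # maximize similarity
    loss = clip_loss + 0.1 * (w - w_init).pow(2).mean()          # stay near the original
    opt.zero_grad()
    loss.backward()
    opt.step()
```

With a real prior, the optimized latent is decoded back into an edited image; in practice an identity- or structure-preservation term would replace the simple L2 regularizer used here.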
arXiv Detail & Related papers (2024-04-02T14:22:04Z)
- Text-Driven Image Editing via Learnable Regions [74.45313434129005]
We introduce a method for region-based image editing driven by textual prompts, without the need for user-provided masks or sketches.
We show that this simple approach enables flexible editing that is compatible with current image generation models.
Experiments demonstrate the competitive performance of our method in manipulating images with high fidelity and realism corresponding to the provided language descriptions.
arXiv Detail & Related papers (2023-11-28T02:27:31Z)
- Editing 3D Scenes via Text Prompts without Retraining [80.57814031701744]
DN2N is a text-driven editing method that allows for the direct acquisition of a NeRF model with universal editing capabilities.
Our method employs off-the-shelf text-based 2D image editing models to modify images of the 3D scene.
Our method achieves multiple editing types, including but not limited to appearance editing, weather transition, material changing, and style transfer.
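The entry above reuses off-the-shelf text-based 2D editors on images of the scene to obtain an edited NeRF. As a rough, hedged sketch of that general pattern (not DN2N's actual pipeline), the snippet below uses InstructPix2Pix from Hugging Face diffusers as the 2D editor; render_views and fit_nerf are hypothetical placeholders for an existing NeRF codebase, and the cross-view consistency machinery such methods depend on is omitted.

```python
# A hedged sketch of the "edit 2D renderings, then fit a 3D model" pattern
# described above -- not DN2N's actual method. The NeRF-side functions are
# hypothetical placeholders.
import torch
from diffusers import StableDiffusionInstructPix2PixPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"
editor = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    "timbrooks/instruct-pix2pix",
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,
).to(device)

def render_views(nerf, poses):
    """Hypothetical: render the scene's NeRF from the given camera poses (PIL images)."""
    raise NotImplementedError

def fit_nerf(images, poses):
    """Hypothetical: fit (or fine-tune) a NeRF on the edited images."""
    raise NotImplementedError

def edit_scene(nerf, poses, instruction="make it snow"):
    # 1. Render the original scene from each training viewpoint.
    views = render_views(nerf, poses)
    # 2. Apply the off-the-shelf text-based 2D editor to every rendering.
    edited = [
        editor(prompt=instruction, image=v,
               num_inference_steps=20, image_guidance_scale=1.5).images[0]
        for v in views
    ]
    # 3. Obtain the edited 3D scene by fitting a NeRF to the edited renderings.
    #    (Methods in this family add terms that keep per-view edits consistent.)
    return fit_nerf(edited, poses)
```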
arXiv Detail & Related papers (2023-09-10T02:31:50Z)
- SKED: Sketch-guided Text-based 3D Editing [49.019881133348775]
We present SKED, a technique for editing 3D shapes represented by NeRFs.
Our technique utilizes as few as two guiding sketches from different views to alter an existing neural field.
We propose novel loss functions to generate the desired edits while preserving the density and radiance of the base instance.
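The entry mentions loss functions that realize the desired edit while preserving the density and radiance of the base instance. As a generic, hedged illustration of a preservation term only (not SKED's actual losses), the snippet below penalizes an edited neural field for deviating from a frozen base field outside a hypothetical edit region.

```python
# Illustrative only: a generic "preserve the base field" regularizer in the
# spirit of the entry above, NOT SKED's actual loss functions.
import torch

class TinyField(torch.nn.Module):
    """Toy neural field: points (N, 3) -> density (N, 1) and rgb (N, 3)."""
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(3, 64), torch.nn.ReLU(), torch.nn.Linear(64, 4)
        )

    def forward(self, pts):
        out = self.net(pts)
        return torch.relu(out[:, :1]), torch.sigmoid(out[:, 1:])

base = TinyField().eval()    # frozen base instance
edited = TinyField()         # field being optimized to realize the edit
for p in base.parameters():
    p.requires_grad_(False)

def in_edit_region(pts):
    """Hypothetical mask: 1 where the sketch-driven edit applies, 0 elsewhere."""
    return (pts.norm(dim=-1) < 0.3).float()

def preservation_loss(pts):
    """Penalize density/radiance changes outside the edited region."""
    with torch.no_grad():
        d0, c0 = base(pts)
    d1, c1 = edited(pts)
    keep = 1.0 - in_edit_region(pts)
    return (keep * ((d1 - d0).pow(2).sum(-1) + (c1 - c0).pow(2).sum(-1))).mean()

pts = torch.rand(4096, 3) * 2 - 1     # random points in the scene bounds
preservation_loss(pts).backward()     # combined with an edit-driving loss in practice
```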
arXiv Detail & Related papers (2023-03-19T18:40:44Z)
- Zero-shot Image-to-Image Translation [57.46189236379433]
We propose pix2pix-zero, an image-to-image translation method that can preserve the content of the original image without manual prompting.
We propose cross-attention guidance, which aims to retain the cross-attention maps of the input image throughout the diffusion process.
Our method does not need additional training for these edits and can directly use the existing text-to-image diffusion model.
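The cross-attention guidance described above can be pictured as a gradient nudge on the latent at each denoising step. The sketch below is a hedged illustration of one such step, not the authors' code or a diffusers API; unet_with_attention is a hypothetical denoiser that also exposes its text cross-attention maps.

```python
# A hedged sketch of one cross-attention guidance step in the spirit of the
# entry above; NOT the authors' implementation.
import torch

def unet_with_attention(latents, t, text_emb):
    """Hypothetical stand-in: returns (predicted_noise, list_of_cross_attention_maps)."""
    raise NotImplementedError

def guided_step(latents, t, edit_text_emb, reference_attn, scale=0.1):
    """Nudge the latent so its cross-attention maps stay close to the maps
    recorded while reconstructing the original image."""
    latents = latents.detach().requires_grad_(True)
    _, attn = unet_with_attention(latents, t, edit_text_emb)
    attn_loss = sum(((a - r) ** 2).mean() for a, r in zip(attn, reference_attn))
    grad = torch.autograd.grad(attn_loss, latents)[0]
    latents = (latents - scale * grad).detach()        # structure-preserving correction
    with torch.no_grad():                              # re-predict noise at the nudged latent
        noise_pred, _ = unet_with_attention(latents, t, edit_text_emb)
    return latents, noise_pred                         # the scheduler update follows
```

As the entry notes, the reference maps come from the input image's own diffusion trajectory, so no additional training is needed for these edits.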
arXiv Detail & Related papers (2023-02-06T18:59:51Z)
- Exploring Stroke-Level Modifications for Scene Text Editing [86.33216648792964]
Scene text editing (STE) aims to replace the text in an image with desired new text while preserving the background and the style of the original text.
Previous methods that edit the whole image must learn different translation rules for background and text regions simultaneously.
We propose a novel network that MOdifies Scene Text images at the strokE Level (MOSTEL).
arXiv Detail & Related papers (2022-12-05T02:10:59Z)
- A Taxonomy of Prompt Modifiers for Text-To-Image Generation [6.903929927172919]
This paper identifies six types of prompt modifiers used by practitioners in the online community, based on a three-month ethnographic study.
The novel taxonomy of prompt modifiers provides researchers with a conceptual starting point for investigating the practice of text-to-image generation.
We discuss research opportunities for this novel creative practice in the field of Human-Computer Interaction.
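As a small, hedged aside to the taxonomy above, the example below shows the mechanical side of the practice: composing a prompt from modifier "slots". The category names are illustrative placeholders, not the paper's six types.

```python
# Illustrative only: composing a text-to-image prompt from modifier slots.
# These categories are generic examples, not the paper's taxonomy.
subject = "a lighthouse on a cliff at dusk"
style_modifiers = ["oil painting", "impressionist"]
quality_boosters = ["highly detailed", "4k"]

prompt = ", ".join([subject, *style_modifiers, *quality_boosters])
print(prompt)
# -> a lighthouse on a cliff at dusk, oil painting, impressionist, highly detailed, 4k
```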
arXiv Detail & Related papers (2022-04-20T06:15:50Z)