Integrating View Conditions for Image Synthesis
- URL: http://arxiv.org/abs/2310.16002v3
- Date: Wed, 8 May 2024 08:25:50 GMT
- Title: Integrating View Conditions for Image Synthesis
- Authors: Jinbin Bai, Zhen Dong, Aosong Feng, Xiao Zhang, Tian Ye, Kaicheng Zhou,
- Abstract summary: This paper introduces a pioneering framework that integrates viewpoint information to enhance the control of image editing tasks.
We distill three essential criteria -- consistency, controllability, and harmony -- that should be met for an image editing method.
- Score: 14.738884513493227
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the field of image processing, applying intricate semantic modifications within existing images remains an enduring challenge. This paper introduces a pioneering framework that integrates viewpoint information to enhance the control of image editing tasks, especially for interior design scenes. By surveying existing object editing methodologies, we distill three essential criteria -- consistency, controllability, and harmony -- that should be met for an image editing method. In contrast to previous approaches, our framework takes the lead in satisfying all three requirements for addressing the challenge of image synthesis. Through comprehensive experiments, encompassing both quantitative assessments and qualitative comparisons with contemporary state-of-the-art methods, we present compelling evidence of our framework's superior performance across multiple dimensions. This work establishes a promising avenue for advancing image synthesis techniques and empowering precise object modifications while preserving the visual coherence of the entire composition.
Related papers
- Consolidating Attention Features for Multi-view Image Editing [126.19731971010475]
We focus on spatial control-based geometric manipulations and introduce a method to consolidate the editing process across various views.
We introduce QNeRF, a neural radiance field trained on the internal query features of the edited images.
We refine the process through a progressive, iterative method that better consolidates queries across the diffusion timesteps.
arXiv Detail & Related papers (2024-02-22T18:50:18Z) - Unified Diffusion-Based Rigid and Non-Rigid Editing with Text and Image
Guidance [15.130419159003816]
We present a versatile image editing framework capable of executing both rigid and non-rigid edits.
We leverage a dual-path injection scheme to handle diverse editing scenarios.
We introduce an integrated self-attention mechanism for fusion of appearance and structural information.
arXiv Detail & Related papers (2024-01-04T08:21:30Z) - Unifying Correspondence, Pose and NeRF for Pose-Free Novel View Synthesis from Stereo Pairs [57.492124844326206]
This work delves into the task of pose-free novel view synthesis from stereo pairs, a challenging and pioneering task in 3D vision.
Our innovative framework, unlike any before, seamlessly integrates 2D correspondence matching, camera pose estimation, and NeRF rendering, fostering a synergistic enhancement of these tasks.
arXiv Detail & Related papers (2023-12-12T13:22:44Z) - Layered Rendering Diffusion Model for Zero-Shot Guided Image Synthesis [60.260724486834164]
This paper introduces innovative solutions to enhance spatial controllability in diffusion models reliant on text queries.
We present two key innovations: Vision Guidance and the Layered Rendering Diffusion framework.
We apply our method to three practical applications: bounding box-to-image, semantic mask-to-image and image editing.
arXiv Detail & Related papers (2023-11-30T10:36:19Z) - Zero-Shot Image Harmonization with Generative Model Prior [22.984119094424056]
We propose a zero-shot approach to image harmonization, aiming to overcome the reliance on large amounts of synthetic composite images.
We introduce a fully modularized framework inspired by human behavior.
We present compelling visual results across diverse scenes and objects, along with a user study validating our approach.
arXiv Detail & Related papers (2023-07-17T00:56:21Z) - Cones 2: Customizable Image Synthesis with Multiple Subjects [50.54010141032032]
We study how to efficiently represent a particular subject as well as how to appropriately compose different subjects.
By rectifying the activations in the cross-attention map, the layout appoints and separates the location of different subjects in the image.
arXiv Detail & Related papers (2023-05-30T18:00:06Z) - Taming Encoder for Zero Fine-tuning Image Customization with
Text-to-Image Diffusion Models [55.04969603431266]
This paper proposes a method for generating images of customized objects specified by users.
The method is based on a general framework that bypasses the lengthy optimization required by previous approaches.
We demonstrate through experiments that our proposed method is able to synthesize images with compelling output quality, appearance diversity, and object fidelity.
arXiv Detail & Related papers (2023-04-05T17:59:32Z) - Guided Co-Modulated GAN for 360{\deg} Field of View Extrapolation [15.850166450573756]
We propose a method to extrapolate a 360deg field of view from a single image.
Our method obtains state-of-the-art results and outperforms previous methods on standard image quality metrics.
arXiv Detail & Related papers (2022-04-15T01:48:35Z) - Modeling Artistic Workflows for Image Generation and Editing [83.43047077223947]
We propose a generative model that follows a given artistic workflow.
It enables both multi-stage image generation as well as multi-stage image editing of an existing piece of art.
arXiv Detail & Related papers (2020-07-14T17:54:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.