Ray Conditioning: Trading Photo-consistency for Photo-realism in
Multi-view Image Generation
- URL: http://arxiv.org/abs/2304.13681v2
- Date: Mon, 4 Sep 2023 23:02:18 GMT
- Authors: Eric Ming Chen, Sidhanth Holalkere, Ruyu Yan, Kai Zhang, Abe Davis
- Abstract summary: We propose ray conditioning, a geometry-free alternative that relaxes the photo-consistency constraint.
Our method generates multi-view images by conditioning a 2D GAN on a light field prior.
With explicit viewpoint control, state-of-the-art photo-realism, and identity consistency, our method is particularly well suited to the viewpoint editing task.
- Score: 10.300893339754827
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-view image generation has attracted particular attention in recent years due to its promising 3D-related applications, e.g., image viewpoint editing. Most
existing methods follow a paradigm where a 3D representation is first
synthesized, and then rendered into 2D images to ensure photo-consistency
across viewpoints. However, such explicit bias for photo-consistency sacrifices
photo-realism, causing geometry artifacts and loss of fine-scale details when
these methods are applied to edit real images. To address this issue, we
propose ray conditioning, a geometry-free alternative that relaxes the
photo-consistency constraint. Our method generates multi-view images by
conditioning a 2D GAN on a light field prior. With explicit viewpoint control, state-of-the-art photo-realism, and identity consistency, our method is particularly well suited to the viewpoint editing task.
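The conditioning signal can be made concrete: under a light field parameterization, every pixel corresponds to a camera ray, so a 2D generator can be conditioned on a per-pixel ray map computed from the camera pose. Below is a minimal sketch of how such a map might be built; the 6D origin-plus-direction encoding and the helper name pixel_ray_map are illustrative assumptions, not the paper's exact conditioning scheme.

```python
import numpy as np

def pixel_ray_map(K: np.ndarray, cam2world: np.ndarray,
                  h: int, w: int) -> np.ndarray:
    """Per-pixel ray map for a pinhole camera (hypothetical helper).

    K         : (3, 3) camera intrinsics
    cam2world : (4, 4) camera-to-world extrinsics
    Returns an (h, w, 6) map: channels 0-2 hold the (shared) camera
    origin, channels 3-5 the unit ray direction through each pixel.
    """
    # Homogeneous pixel coordinates, sampled at pixel centers.
    u, v = np.meshgrid(np.arange(w) + 0.5, np.arange(h) + 0.5)
    pix = np.stack([u, v, np.ones_like(u)], axis=-1)       # (h, w, 3)

    # Back-project through the intrinsics, rotate into world space.
    dirs = pix @ np.linalg.inv(K).T @ cam2world[:3, :3].T  # (h, w, 3)
    dirs /= np.linalg.norm(dirs, axis=-1, keepdims=True)

    origin = np.broadcast_to(cam2world[:3, 3], dirs.shape)
    return np.concatenate([origin, dirs], axis=-1).astype(np.float32)
```

In a ray-conditioned generator, re-running synthesis with the same latent code but a different ray map changes the viewpoint while the latent carries identity; because no explicit 3D representation is ever rendered, strict photo-consistency across the generated views is relaxed in favor of per-view photo-realism.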
Related papers
- View-consistent Object Removal in Radiance Fields [14.195400035176815]
Radiance Fields (RFs) have emerged as a crucial technology for 3D scene representation.
Current methods rely on per-frame 2D image inpainting, which often fails to maintain consistency across views.
We introduce a novel RF editing pipeline that significantly enhances consistency by requiring the inpainting of only a single reference image.
arXiv Detail & Related papers (2024-08-04T17:57:23Z)
- Lite2Relight: 3D-aware Single Image Portrait Relighting [87.62069509622226]
Lite2Relight is a novel technique that can predict 3D consistent head poses of portraits.
By utilizing a pre-trained geometry-aware encoder and a feature alignment module, we map input images into a relightable 3D space.
This includes producing 3D-consistent results of the full head, including hair, eyes, and expressions.
arXiv Detail & Related papers (2024-07-15T07:16:11Z)
- Diffusion Models are Geometry Critics: Single Image 3D Editing Using Pre-Trained Diffusion Priors [24.478875248825563]
We propose a novel image editing technique that enables 3D manipulations on single images.
Our method directly leverages powerful image diffusion models trained on a broad spectrum of text-image pairs.
Our method can generate high-quality 3D-aware image edits with large viewpoint transformations and high appearance and shape consistency with the input image.
arXiv Detail & Related papers (2024-03-18T06:18:59Z)
- GaussCtrl: Multi-View Consistent Text-Driven 3D Gaussian Splatting Editing [38.948892064761914]
GaussCtrl is a text-driven method for editing a 3D scene reconstructed with 3D Gaussian Splatting (3DGS).
Our key contribution is multi-view consistent editing, which enables editing all images together instead of iteratively editing one image.
arXiv Detail & Related papers (2024-03-13T17:35:28Z)
- Consolidating Attention Features for Multi-view Image Editing [126.19731971010475]
We focus on spatial control-based geometric manipulations and introduce a method to consolidate the editing process across various views.
We introduce QNeRF, a neural radiance field trained on the internal query features of the edited images.
We refine the process through a progressive, iterative method that better consolidates queries across the diffusion timesteps.
arXiv Detail & Related papers (2024-02-22T18:50:18Z)
- Explicitly Controllable 3D-Aware Portrait Generation [42.30481422714532]
We propose a 3D portrait generation network that produces consistent portraits according to semantic parameters regarding pose, identity, expression and lighting.
Our method outperforms prior art in extensive experiments, producing realistic portraits with vivid expressions under natural lighting when viewed from free viewpoints.
arXiv Detail & Related papers (2022-09-12T17:40:08Z)
- GAN2X: Non-Lambertian Inverse Rendering of Image GANs [85.76426471872855]
We present GAN2X, a new method for unsupervised inverse rendering that only uses unpaired images for training.
Unlike previous Shape-from-GAN approaches that mainly focus on 3D shapes, we make the first attempt to also recover non-Lambertian material properties by exploiting the pseudo paired data generated by a GAN.
Experiments demonstrate that GAN2X can accurately decompose 2D images into 3D shape, albedo, and specular properties for different object categories, and achieves state-of-the-art performance for unsupervised single-view 3D face reconstruction.
arXiv Detail & Related papers (2022-06-18T16:58:49Z)
- Self-Supervised Image Representation Learning with Geometric Set Consistency [50.12720780102395]
We propose a method for self-supervised image representation learning under the guidance of 3D geometric consistency.
Specifically, we introduce 3D geometric consistency into a contrastive learning framework to enforce feature consistency across image views (a minimal sketch of such a cross-view loss follows this list).
arXiv Detail & Related papers (2022-03-29T08:57:33Z)
- 3D GAN Inversion for Controllable Portrait Image Animation [45.55581298551192]
We leverage newly developed 3D GANs, which allow explicit control over the pose of the image subject with multi-view consistency.
The proposed technique for portrait image animation outperforms previous methods in terms of image quality, identity preservation, and pose transfer.
arXiv Detail & Related papers (2022-03-25T04:06:06Z)
- Towards Realistic 3D Embedding via View Alignment [53.89445873577063]
This paper presents an innovative View Alignment GAN (VA-GAN) that composes new images by embedding 3D models into 2D background images realistically and automatically.
VA-GAN consists of a texture generator and a differential discriminator that are inter-connected and end-to-end trainable.
arXiv Detail & Related papers (2020-07-14T14:45:00Z)
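As a concrete illustration of the cross-view consistency idea recurring in the entries above (sketched here against the Geometric Set Consistency entry), the following is a minimal InfoNCE-style loss in which pixel features from two views that are known from geometry to observe the same 3D points form positive pairs. The row-wise pairing, the temperature value, and the function name are assumptions for illustration, not that paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def cross_view_infonce(feat_a: torch.Tensor, feat_b: torch.Tensor,
                       temperature: float = 0.07) -> torch.Tensor:
    """Contrastive loss over geometrically corresponding pixel features.

    feat_a, feat_b : (n, c) features sampled at n pixel locations in two
    views, where row i of feat_a and row i of feat_b are assumed (e.g.,
    via depth re-projection) to observe the same 3D point.
    """
    a = F.normalize(feat_a, dim=-1)
    b = F.normalize(feat_b, dim=-1)
    logits = a @ b.t() / temperature                 # (n, n) similarities
    targets = torch.arange(a.shape[0], device=a.device)
    # Symmetric InfoNCE: matching rows are positives, all others negatives.
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))
```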
This list was generated automatically from the titles and abstracts of the papers on this site.