Controllable Shadow Generation with Single-Step Diffusion Models from Synthetic Data
- URL: http://arxiv.org/abs/2412.11972v1
- Date: Mon, 16 Dec 2024 16:55:22 GMT
- Title: Controllable Shadow Generation with Single-Step Diffusion Models from Synthetic Data
- Authors: Onur Tasar, Clément Chadebec, Benjamin Aubin,
- Abstract summary: We introduce a novel method for fast, controllable, and background-free shadow generation for 2D object images.
We create a large synthetic dataset using a 3D rendering engine to train a diffusion model for controllable shadow generation.
We find that the rectified flow objective achieves high-quality results with just a single sampling step, enabling real-time applications.
- Score: 7.380444448047908
- Abstract: Realistic shadow generation is a critical component for high-quality image compositing and visual effects, yet existing methods suffer from certain limitations: physics-based approaches require 3D scene geometry, which is often unavailable, while learning-based techniques struggle with control and visual artifacts. We introduce a novel method for fast, controllable, and background-free shadow generation for 2D object images. We create a large synthetic dataset using a 3D rendering engine to train a diffusion model for controllable shadow generation, generating shadow maps for diverse light source parameters. Through extensive ablation studies, we find that the rectified flow objective achieves high-quality results with just a single sampling step, enabling real-time applications. Furthermore, our experiments demonstrate that the model generalizes well to real-world images. To facilitate further research in evaluating quality and controllability in shadow generation, we release a new public benchmark containing a diverse set of object images and shadow maps in various settings. The project page is available at https://gojasper.github.io/controllable-shadow-generation-project/
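The single-step claim in the abstract can be illustrated with a minimal sketch of rectified-flow sampling. This is not the authors' model: the function names and the toy velocity field below are assumptions made for illustration, and the conditioning on the object image and light source parameters is omitted. Rectified flow trains a velocity field along the straight path x_t = (1 - t) * x0 + t * x1 from noise x0 to data x1, with regression target x1 - x0; sampling integrates that field from t = 0 to t = 1, and when the learned paths are straight a single Euler step can suffice.

```python
import random


def interpolate(x0, x1, t):
    """Point on the straight path from noise x0 (t=0) to data x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def velocity_target(x0, x1):
    """Regression target for the velocity field under rectified flow."""
    return [b - a for a, b in zip(x0, x1)]


def euler_sample(velocity_fn, x0, num_steps=1):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with explicit Euler steps."""
    x = list(x0)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def make_toy_velocity(target):
    """Hypothetical stand-in for a trained network: the exact velocity field
    when every data sample equals `target`, so all paths are straight lines."""
    def velocity_fn(x, t):
        return [(ti - xi) / (1.0 - t) for ti, xi in zip(target, x)]
    return velocity_fn


random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(4)]
target = [0.0, 0.25, 0.5, 1.0]  # toy "shadow map" values, purely illustrative

one_step = euler_sample(make_toy_velocity(target), noise, num_steps=1)
four_steps = euler_sample(make_toy_velocity(target), noise, num_steps=4)
```

In this toy setting the one-step and four-step results coincide because the paths are exactly straight. A trained network only approximates that property, which is presumably why the abstract reports the single-step quality as an empirical finding from ablations.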
Related papers
- 3D Object Manipulation in a Single Image using Generative Models [30.241857090353864]
We introduce OMG3D, a novel framework that integrates precise geometric control with the generative power of diffusion models.
Our framework first converts 2D objects into 3D, enabling user-directed modifications and lifelike motions at the geometric level.
Remarkably, all these steps can be done using one NVIDIA 3090.
arXiv Detail & Related papers (2025-01-22T15:06:30Z)
- GenLit: Reformulating Single-Image Relighting as Video Generation [44.409962561291216]
We introduce GenLit, a framework that distills the ability of a graphics engine to perform light manipulation into a video generation model.
We find that a model fine-tuned on only a small synthetic dataset is able to generalize to real images.
arXiv Detail & Related papers (2024-12-15T15:40:40Z)
- GeoGen: Geometry-Aware Generative Modeling via Signed Distance Functions [22.077366472693395]
We introduce a new generative approach for synthesizing 3D geometry and images from single-view collections.
Existing approaches that employ volumetric rendering with neural radiance fields inherit a key limitation: the generated geometry is noisy and unconstrained.
We propose GeoGen, a new SDF-based 3D generative model trained in an end-to-end manner.
arXiv Detail & Related papers (2024-06-06T17:00:10Z)
- ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models [65.22994156658918]
We present a method that learns to generate multi-view images in a single denoising process from real-world data.
We design an autoregressive generation scheme that renders more 3D-consistent images at any viewpoint.
arXiv Detail & Related papers (2024-03-04T07:57:05Z)
- Controllable Shadow Generation Using Pixel Height Maps [58.59256060452418]
Physics-based shadow rendering methods require 3D geometries, which are not always available.
Deep learning-based shadow synthesis methods learn a mapping from the light information to an object's shadow without explicitly modeling the shadow geometry.
We introduce pixel height, a novel geometry representation that encodes the correlations between objects, ground, and camera pose.
arXiv Detail & Related papers (2022-07-12T08:29:51Z)
- Progressively-connected Light Field Network for Efficient View Synthesis [69.29043048775802]
We present a Progressively-connected Light Field network (ProLiF) for the novel view synthesis of complex forward-facing scenes.
ProLiF encodes a 4D light field, which allows rendering a large batch of rays in one training step for image- or patch-level losses.
arXiv Detail & Related papers (2022-07-10T13:47:20Z)
- Extracting Triangular 3D Models, Materials, and Lighting From Images [59.33666140713829]
We present an efficient method for joint optimization of materials and lighting from multi-view image observations.
We leverage meshes with spatially-varying materials and environment lighting that can be deployed in any traditional graphics engine.
arXiv Detail & Related papers (2021-11-24T13:58:20Z)
- A Shading-Guided Generative Implicit Model for Shape-Accurate 3D-Aware Image Synthesis [163.96778522283967]
We propose a shading-guided generative implicit model that is able to learn a starkly improved shape representation.
An accurate 3D shape should also yield a realistic rendering under different lighting conditions.
Our experiments on multiple datasets show that the proposed approach achieves photorealistic 3D-aware image synthesis.
arXiv Detail & Related papers (2021-10-29T10:53:12Z)
- Towards Realistic 3D Embedding via View Alignment [53.89445873577063]
This paper presents an innovative View Alignment GAN (VA-GAN) that composes new images by embedding 3D models into 2D background images realistically and automatically.
VA-GAN consists of a texture generator and a differential discriminator that are inter-connected and end-to-end trainable.
arXiv Detail & Related papers (2020-07-14T14:45:00Z)
- GRAF: Generative Radiance Fields for 3D-Aware Image Synthesis [43.4859484191223]
We propose a generative model for radiance fields, which have recently proven successful for novel view synthesis of a single scene.
By introducing a multi-scale patch-based discriminator, we demonstrate synthesis of high-resolution images while training our model from unposed 2D images alone.
arXiv Detail & Related papers (2020-07-05T20:37:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.