DiffCamera: Arbitrary Refocusing on Images
- URL: http://arxiv.org/abs/2509.26599v1
- Date: Tue, 30 Sep 2025 17:48:23 GMT
- Title: DiffCamera: Arbitrary Refocusing on Images
- Authors: Yiyang Wang, Xi Chen, Xiaogang Xu, Yu Liu, Hengshuang Zhao
- Abstract summary: We propose DiffCamera, a model that enables flexible refocusing of a created image conditioned on an arbitrary new focus point and a blur level. Experiments demonstrate that DiffCamera supports stable refocusing across a wide range of scenes, providing unprecedented control over DoF adjustments for photography and generative AI applications.
- Score: 55.948229011478304
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The depth-of-field (DoF) effect, which introduces aesthetically pleasing blur, enhances photographic quality but is fixed and difficult to modify once the image has been created. This becomes problematic when the applied blur is undesirable (e.g., the subject is out of focus). To address this, we propose DiffCamera, a model that enables flexible refocusing of a created image conditioned on an arbitrary new focus point and a blur level. Specifically, we design a diffusion transformer framework for refocusing learning. However, training requires pairs of images of the same scene with different focus planes and bokeh levels, which are hard to acquire. To overcome this limitation, we develop a simulation-based pipeline to generate large-scale image pairs with varying focus planes and bokeh levels. With the simulated data, we find that training with only a vanilla diffusion objective often leads to incorrect DoF behaviors due to the complexity of the task, so a stronger training constraint is needed. Inspired by the photographic principle that photos focused at different planes can be linearly blended into a multi-focus image, we propose a stacking constraint during training to enforce precise DoF manipulation. This constraint imposes physically grounded refocusing behavior: the refocused results must be faithfully aligned with the scene structure and the camera conditions so that they can be combined into the correct multi-focus image. We also construct a benchmark to evaluate the effectiveness of our refocusing model. Extensive experiments demonstrate that DiffCamera supports stable refocusing across a wide range of scenes, providing unprecedented control over DoF adjustments for photography and generative AI applications.
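The abstract's two key ingredients, simulation-based defocus rendering and the stacking constraint, can be illustrated with a short sketch. The snippet below is a minimal illustration, not the authors' code: the thin-lens circle-of-confusion formula is a standard way to simulate depth-of-field from a depth map, and `refocus_model`, `multi_focus_target`, the uniform blend weights, and the loss weight are hypothetical placeholders rather than the paper's actual formulation.

```python
# Minimal sketch (not the authors' code) of two ideas from the abstract:
# (1) thin-lens circle-of-confusion for simulating depth-of-field from depth,
# (2) a stacking constraint that blends refocused outputs into a multi-focus image.
# `refocus_model` and `multi_focus_target` are hypothetical placeholders.
import torch
import torch.nn.functional as F

def circle_of_confusion(depth, focus_depth, aperture, focal_length):
    """Per-pixel blur diameter under a thin-lens model (a common way to
    simulate bokeh from a depth map; not necessarily the paper's simulator)."""
    return (aperture * (depth - focus_depth).abs() / depth
            * focal_length / (focus_depth - focal_length))

def stacking_loss(refocus_model, image, focus_points, blur_level, multi_focus_target):
    """Refocused predictions at several focus points should blend linearly
    into the correct multi-focus image; penalise any deviation."""
    outputs = [refocus_model(image, fp, blur_level) for fp in focus_points]
    stacked = torch.stack(outputs, dim=0).mean(dim=0)  # uniform linear blend (assumption)
    return F.mse_loss(stacked, multi_focus_target)

def total_loss(diffusion_loss, refocus_model, image, focus_points, blur_level,
               multi_focus_target, weight=0.1):
    # Combine the vanilla diffusion objective with the stacking constraint;
    # the relative weight here is an arbitrary placeholder.
    return diffusion_loss + weight * stacking_loss(
        refocus_model, image, focus_points, blur_level, multi_focus_target)
```

In the paper's framing the blended result must match the multi-focus image implied by the scene structure and camera conditions; the uniform mean above is only the simplest instance of such a linear blend.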
Related papers
- Generative Refocusing: Flexible Defocus Control from a Single Image [12.798805351731668]
We introduce Generative Refocusing, a two-step process that uses DeNet to recover all-in-focus images from various inputs and BokehNet to synthesize controllable bokeh. Our experiments show we achieve top performance in defocus deblurring, bokeh synthesis, and refocusing benchmarks.
arXiv Detail & Related papers (2025-12-18T18:59:59Z) - Fine-grained Defocus Blur Control for Generative Image Models [66.30016220484394]
Current text-to-image diffusion models excel at generating diverse, high-quality images. We introduce a novel text-to-image diffusion framework that leverages camera metadata. Our model enables superior fine-grained control without altering the depicted scene.
arXiv Detail & Related papers (2025-10-07T17:59:15Z) - BokehDiff: Neural Lens Blur with One-Step Diffusion [53.11429878683807]
We introduce BokehDiff, a lens blur rendering method that achieves physically accurate and visually appealing outcomes. Our method employs a physics-inspired self-attention module that aligns with the image formation process. We adapt the diffusion model to the one-step inference scheme without introducing additional noise, and achieve results of high quality and fidelity.
arXiv Detail & Related papers (2025-07-24T03:23:19Z) - Bokeh Diffusion: Defocus Blur Control in Text-to-Image Diffusion Models [26.79219274697864]
Bokeh Diffusion is a scene-consistent bokeh control framework. We introduce a hybrid training pipeline that aligns in-the-wild images with synthetic blur augmentations. Our approach enables flexible, lens-like blur control and supports downstream applications such as real image editing via inversion.
arXiv Detail & Related papers (2025-03-11T13:49:12Z) - Reblurring-Guided Single Image Defocus Deblurring: A Learning Framework with Misaligned Training Pairs [65.25002116216771]
We introduce a reblurring-guided learning framework for single image defocus deblurring. Our reblurring module ensures spatial consistency between the deblurred image, the reblurred image and the input blurry image. Spatially variant blur can be derived from the reblurring module and serves as pseudo supervision for the defocus blur map during training.
arXiv Detail & Related papers (2024-09-26T12:37:50Z) - DOF-GS: Adjustable Depth-of-Field 3D Gaussian Splatting for Post-Capture Refocusing, Defocus Rendering and Blur Removal [42.427021878005405]
We introduce DOF-GS, a new 3DGS-based framework with a finite-aperture camera model and explicit, differentiable defocus rendering. Results demonstrate that DOF-GS supports post-capture refocusing, adjustable defocus and high-quality all-in-focus rendering.
arXiv Detail & Related papers (2024-05-27T16:54:49Z) - Learning Single Image Defocus Deblurring with Misaligned Training Pairs [80.13320797431487]
We propose a joint deblurring and reblurring learning framework for single image defocus deblurring.
Our framework can be applied to boost defocus deblurring networks in terms of both quantitative metrics and visual quality.
arXiv Detail & Related papers (2022-11-26T07:36:33Z) - Deep Depth from Focal Stack with Defocus Model for Camera-Setting Invariance [19.460887007137607]
We propose a learning-based depth from focus/defocus (DFF) method that takes a focal stack as input to estimate scene depth.
We show that our method is robust against a synthetic-to-real domain gap, and exhibits state-of-the-art performance.
arXiv Detail & Related papers (2022-02-26T04:21:08Z) - Single image deep defocus estimation and its applications [82.93345261434943]
We train a deep neural network to classify image patches into one of 20 levels of blurriness.
The trained model is used to determine patch blurriness, which is then refined by applying an iterative weighted guided filter.
The result is a defocus map that carries the degree of blurriness for each pixel (a minimal sketch of this pipeline follows the list below).
arXiv Detail & Related papers (2021-07-30T06:18:16Z)
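As referenced in the last entry above, the patch-classification defocus pipeline it describes can be sketched in a few lines. This is a minimal sketch under stated assumptions: `classify_patch` stands in for the trained 20-level blurriness classifier, and a plain single-pass guided filter replaces the paper's iterative weighted variant.

```python
# Minimal sketch of a patch-classification defocus map (not the paper's code).
# `classify_patch` is a hypothetical trained classifier returning an integer
# blurriness level in [0, levels); the guided filter below is the plain
# single-pass version, standing in for the iterative weighted variant.
import numpy as np
from scipy.ndimage import uniform_filter

def coarse_defocus_map(gray, classify_patch, patch=32, levels=20):
    """Classify each patch's blurriness and write the level back per pixel."""
    h, w = gray.shape
    out = np.zeros((h, w), dtype=np.float32)
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            level = classify_patch(gray[y:y + patch, x:x + patch])
            out[y:y + patch, x:x + patch] = level / (levels - 1)
    return out

def guided_filter(guide, src, radius=8, eps=1e-3):
    """Edge-aware smoothing of the coarse map, guided by the image itself."""
    size = 2 * radius + 1
    mean_i = uniform_filter(guide, size)
    mean_p = uniform_filter(src, size)
    cov_ip = uniform_filter(guide * src, size) - mean_i * mean_p
    var_i = uniform_filter(guide * guide, size) - mean_i * mean_i
    a = cov_ip / (var_i + eps)
    b = mean_p - a * mean_i
    return uniform_filter(a, size) * guide + uniform_filter(b, size)

def defocus_map(gray, classify_patch):
    # Coarse patch-level estimate refined into a smooth per-pixel map;
    # the guide image is normalised to [0, 1] for numerical stability.
    guide = gray.astype(np.float32) / 255.0
    return guided_filter(guide, coarse_defocus_map(gray, classify_patch))
```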