Learning to Refocus with Video Diffusion Models
- URL: http://arxiv.org/abs/2512.19823v3
- Date: Mon, 29 Dec 2025 17:04:36 GMT
- Title: Learning to Refocus with Video Diffusion Models
- Authors: SaiKiran Tedla, Zhoutong Zhang, Xuaner Zhang, Shumian Xin,
- Abstract summary: We introduce a novel method for realistic post-capture refocusing using video diffusion models. From a single defocused image, our approach generates a perceptually accurate focal stack, represented as a video sequence. Our method consistently outperforms existing approaches in both perceptual quality and robustness across challenging scenarios.
- Score: 10.749713029715226
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Focus is a cornerstone of photography, yet autofocus systems often fail to capture the intended subject, and users frequently wish to adjust focus after capture. We introduce a novel method for realistic post-capture refocusing using video diffusion models. From a single defocused image, our approach generates a perceptually accurate focal stack, represented as a video sequence, enabling interactive refocusing and unlocking a range of downstream applications. We release a large-scale focal stack dataset acquired under diverse real-world smartphone conditions to support this work and future research. Our method consistently outperforms existing approaches in both perceptual quality and robustness across challenging scenarios, paving the way for more advanced focus-editing capabilities in everyday photography. Code and data are available at www.learn2refocus.github.io
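The abstract describes representing a focal stack as a video sequence to enable interactive refocusing. One simple way such a stack can drive a refocusing interaction (a generic sketch, not the paper's method; the `refocus_index` helper and the gradient-energy focus measure are illustrative assumptions) is to select, for a tapped pixel, the frame that is locally sharpest there:

```python
import numpy as np

def refocus_index(stack: np.ndarray, y: int, x: int, patch: int = 8) -> int:
    """Pick the focal-stack frame that is sharpest around pixel (y, x).

    stack: array of shape (num_frames, H, W), a grayscale focal stack.
    Sharpness is scored as local gradient energy in a small patch,
    a common contrast-based focus measure.
    """
    n, h, w = stack.shape
    y0, y1 = max(0, y - patch), min(h, y + patch + 1)
    x0, x1 = max(0, x - patch), min(w, x + patch + 1)
    scores = []
    for frame in stack:
        win = frame[y0:y1, x0:x1].astype(np.float64)
        gy, gx = np.gradient(win)                # local intensity gradients
        scores.append(float((gy**2 + gx**2).sum()))  # gradient energy
    return int(np.argmax(scores))

# Synthetic demo: 5 frames, where frame 2 is "in focus" (high-frequency
# checkerboard) and the rest are flat stand-ins for defocused frames.
h = w = 32
sharp = (np.indices((h, w)).sum(axis=0) % 2).astype(np.float64)
flat = np.full((h, w), 0.5)
stack = np.stack([flat, flat, sharp, flat, flat])
print(refocus_index(stack, 16, 16))  # → 2 (the frame with highest local contrast)
```

In an interactive viewer, this index would simply select which frame of the generated video sequence to display for the tapped point.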
Related papers
- Generative Refocusing: Flexible Defocus Control from a Single Image [12.798805351731668]
We introduce Generative Refocusing, a two-step process that uses DeNet to recover all-in-focus images from various inputs and BokehNet for creating controllable bokeh. Our experiments show we achieve top performance in defocus deblurring, bokeh synthesis, and refocusing benchmarks.
arXiv Detail & Related papers (2025-12-18T18:59:59Z)
- Fine-grained Defocus Blur Control for Generative Image Models [66.30016220484394]
Current text-to-image diffusion models excel at generating diverse, high-quality images. We introduce a novel text-to-image diffusion framework that leverages camera metadata. Our model enables superior fine-grained control without altering the depicted scene.
arXiv Detail & Related papers (2025-10-07T17:59:15Z)
- DiffCamera: Arbitrary Refocusing on Images [55.948229011478304]
We propose DiffCamera, a model that enables flexible refocusing of a created image conditioned on an arbitrary new focus point and a blur level. Experiments demonstrate that DiffCamera supports stable refocusing across a wide range of scenes, providing unprecedented control over DoF adjustments for photography and generative AI applications.
arXiv Detail & Related papers (2025-09-30T17:48:23Z)
- SparseFocus: Learning-based One-shot Autofocus for Microscopy with Sparse Content [21.268550523841117]
Autofocus is necessary for high-throughput and real-time scanning in microscopic imaging. Recent learning-based approaches have demonstrated remarkable efficacy in a one-shot setting. We propose a content-based solution, named SparseFocus, featuring a novel two-stage pipeline.
arXiv Detail & Related papers (2025-02-10T13:31:32Z)
- Towards Real-World Focus Stacking with Deep Learning [97.34754533628322]
We introduce a new dataset consisting of 94 high-resolution bursts of raw images with focus bracketing.
This dataset is used to train the first deep learning algorithm for focus stacking capable of handling bursts of sufficient length for real-world applications.
arXiv Detail & Related papers (2023-11-29T17:49:33Z)
- Defocus Map Estimation and Deblurring from a Single Dual-Pixel Image [54.10957300181677]
We present a method that takes as input a single dual-pixel image, and simultaneously estimates the image's defocus map.
Our approach improves upon prior works for both defocus map estimation and blur removal, despite being entirely unsupervised.
arXiv Detail & Related papers (2021-10-12T00:09:07Z)
- An End-to-End Autofocus Camera for Iris on the Move [48.14011526385088]
In this paper, we introduce a novel rapid autofocus camera for active refocusing of the iris area of moving objects using a focus-tunable lens.
Our end-to-end computational algorithm can predict the best focus position from one single blurred image and generate a lens diopter control signal automatically.
The results demonstrate the advantages of our proposed camera for biometric perception in static and dynamic scenes.
arXiv Detail & Related papers (2021-06-29T03:00:39Z)
- Defocus Deblurring Using Dual-Pixel Data [41.201653787083735]
Defocus blur arises in images that are captured with a shallow depth of field due to the use of a wide aperture.
We propose an effective defocus deblurring method that exploits data available on dual-pixel (DP) sensors found on most modern cameras.
arXiv Detail & Related papers (2020-05-01T10:38:00Z)
- Rapid Whole Slide Imaging via Learning-based Two-shot Virtual Autofocusing [57.90239401665367]
Whole slide imaging (WSI) is an emerging technology for digital pathology.
We propose the concept of "virtual autofocusing", which does not rely on mechanical adjustment to conduct refocusing.
arXiv Detail & Related papers (2020-03-14T13:40:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.