PG-ControlNet: A Physics-Guided ControlNet for Generative Spatially Varying Image Deblurring
- URL: http://arxiv.org/abs/2511.21043v1
- Date: Wed, 26 Nov 2025 04:19:51 GMT
- Title: PG-ControlNet: A Physics-Guided ControlNet for Generative Spatially Varying Image Deblurring
- Authors: Hakki Motorcu, Mujdat Cetin
- Abstract summary: We propose a novel framework that tames a powerful generative prior for spatially varying image deblurring. Rather than oversimplifying the degradation field, we model it as a dense continuum of high-dimensional compressed kernels. Our method effectively bridges the gap between physical accuracy and perceptual realism.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Spatially varying image deblurring remains a fundamentally ill-posed problem, especially when degradations arise from complex mixtures of motion and other forms of blur under significant noise. State-of-the-art learning-based approaches generally fall into two paradigms: model-based deep unrolling methods, which enforce physical constraints by modeling the degradations but often produce over-smoothed, artifact-laden textures; and generative models, which achieve superior perceptual quality yet hallucinate details due to weak physical constraints. In this paper, we propose a novel framework that uniquely reconciles these paradigms by taming a powerful generative prior with explicit, dense physical constraints. Rather than oversimplifying the degradation field, we model it as a dense continuum of high-dimensional compressed kernels, ensuring that minute variations in motion and other degradation patterns are captured. We leverage this rich descriptor field to condition a ControlNet architecture, strongly guiding the diffusion sampling process. Extensive experiments demonstrate that our method effectively bridges the gap between physical accuracy and perceptual realism, outperforming state-of-the-art model-based methods as well as generative baselines in challenging, severely blurred scenarios.
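The core mechanism described above, compressing the per-pixel blur kernel into a low-dimensional descriptor and using the resulting dense field to condition a ControlNet-style branch, can be sketched in a few lines of PyTorch. This is a minimal illustration under assumed names and dimensions (KernelCompressor, ControlBranch, a 31x31 kernel, a 16-dim code), not the paper's released implementation.

```python
# Hedged sketch: dense per-pixel kernel compression + ControlNet-style
# conditioning. Module names, dimensions, and the linear compressor
# are illustrative assumptions, not PG-ControlNet's actual code.
import torch
import torch.nn as nn

class KernelCompressor(nn.Module):
    """Compress the (k*k)-dim local blur kernel at every pixel into a
    low-dimensional code, yielding a dense descriptor field."""
    def __init__(self, kernel_size=31, code_dim=16):
        super().__init__()
        # A 1x1 conv acts as a learned linear compressor per pixel.
        self.proj = nn.Conv2d(kernel_size * kernel_size, code_dim, kernel_size=1)

    def forward(self, kernel_field):
        # kernel_field: (B, k*k, H, W), one flattened kernel per pixel.
        return self.proj(kernel_field)  # (B, code_dim, H, W)

class ControlBranch(nn.Module):
    """ControlNet pattern: a trainable branch whose zero-initialized
    output is added as a residual to a frozen denoising UNet."""
    def __init__(self, code_dim=16, feat_dim=64):
        super().__init__()
        self.encode = nn.Sequential(
            nn.Conv2d(code_dim + 3, feat_dim, 3, padding=1), nn.SiLU(),
            nn.Conv2d(feat_dim, feat_dim, 3, padding=1), nn.SiLU(),
        )
        # Zero init makes the branch a no-op before training starts.
        self.zero_conv = nn.Conv2d(feat_dim, feat_dim, kernel_size=1)
        nn.init.zeros_(self.zero_conv.weight)
        nn.init.zeros_(self.zero_conv.bias)

    def forward(self, noisy_image, descriptor_field):
        h = torch.cat([noisy_image, descriptor_field], dim=1)
        return self.zero_conv(self.encode(h))

# Toy usage: 64x64 image, a 31x31 kernel estimated at each pixel.
kernels = torch.rand(1, 31 * 31, 64, 64)
noisy = torch.randn(1, 3, 64, 64)
residual = ControlBranch()(noisy, KernelCompressor()(kernels))
print(residual.shape)  # torch.Size([1, 64, 64, 64])
```

The zero-initialized output convolution is the standard ControlNet trick: the branch starts as a no-op, so the frozen generative prior is only gradually steered by the physical descriptor field during training.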
Related papers
- PhysRVG: Physics-Aware Unified Reinforcement Learning for Video Generative Models [100.65199317765608]
Physical principles are fundamental to realistic visual simulation, but remain a significant oversight in transformer-based video generation. We introduce a physics-aware reinforcement learning paradigm for video generation models that enforces physical collision rules directly in high-dimensional spaces. We extend this paradigm to a unified framework, termed Mimicry-Discovery Cycle (MDcycle), which allows substantial fine-tuning.
arXiv Detail & Related papers (2026-01-16T08:40:10Z)
- Beyond the Black Box: Identifiable Interpretation and Control in Generative Models via Causal Minimality [52.57416398859353]
We show that causal minimality can endow latent representations of diffusion vision and autoregressive language models with clear causal interpretation and robust, component-wise identifiable control. We introduce a novel theoretical framework for hierarchical selection models, where higher-level concepts emerge from the constrained composition of lower-level variables. These causally grounded concepts serve as levers for fine-grained model steering, paving the way for transparent, reliable systems.
arXiv Detail & Related papers (2025-12-11T14:59:14Z)
- Penalizing Boundary Activation for Object Completeness in Diffusion Models [35.58050562158284]
Diffusion models have emerged as a powerful technique for text-to-image (T2I) generation. In this study, we conduct an in-depth analysis of the incompleteness issue and reveal that the primary factor behind incomplete object generation is the usage of RandomCrop during model training. We propose a training-free solution that penalizes activation values at image boundaries during the early denoising steps.
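A hedged sketch of what such a boundary penalty could look like: during the early denoising steps, activations in a thin frame around the latent are attenuated, discouraging objects from being placed against the image border. The band width, weight, and the choice to act on the latent directly are illustrative assumptions, not the paper's exact mechanism.

```python
# Hedged sketch of a training-free boundary penalty in early denoising.
import torch

def penalize_boundary(latent, step, total_steps,
                      band=4, weight=0.1, early_fraction=0.3):
    """Attenuate a `band`-pixel frame around the latent during the
    first `early_fraction` of the denoising trajectory."""
    if step > early_fraction * total_steps:
        return latent  # only intervene early, while layout is decided
    mask = torch.ones_like(latent)
    mask[..., :band, :] = 1.0 - weight   # top rows
    mask[..., -band:, :] = 1.0 - weight  # bottom rows
    mask[..., :, :band] = 1.0 - weight   # left columns
    mask[..., :, -band:] = 1.0 - weight  # right columns
    return latent * mask

# Toy usage inside a denoising loop (latent shape: B x C x H x W).
latent = torch.randn(1, 4, 64, 64)
latent = penalize_boundary(latent, step=2, total_steps=50)
```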
arXiv Detail & Related papers (2025-09-21T07:58:48Z)
- VDEGaussian: Video Diffusion Enhanced 4D Gaussian Splatting for Dynamic Urban Scenes Modeling [68.65587507038539]
We present a novel video diffusion-enhanced 4D Gaussian Splatting framework for dynamic urban scene modeling. Our key insight is to distill robust, temporally consistent priors from a test-time adapted video diffusion model. Our method significantly enhances dynamic modeling, especially for fast-moving objects, achieving an approximate PSNR gain of 2 dB.
arXiv Detail & Related papers (2025-08-04T07:24:05Z)
- BokehDiff: Neural Lens Blur with One-Step Diffusion [62.59018200914645]
We introduce BokehDiff, a lens blur rendering method that achieves physically accurate and visually appealing outcomes. Our method employs a physics-inspired self-attention module that aligns with the image formation process. We adapt the diffusion model to the one-step inference scheme without introducing additional noise, and achieve results of high quality and fidelity.
arXiv Detail & Related papers (2025-07-24T03:23:19Z)
- RichControl: Structure- and Appearance-Rich Training-Free Spatial Control for Text-to-Image Generation [10.956556608715035]
Text-to-image (T2I) diffusion models have shown remarkable success in generating high-quality images from text prompts. We propose a flexible training-free framework that decouples the sampling schedule of condition features from the denoising process. We further enhance the sampling process by introducing a restart refinement schedule, and improve the visual quality with an appearance-rich prompting strategy.
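The decoupling idea can be sketched as a sampling loop in which condition injection follows its own schedule rather than firing at every denoising step. The stand-in functions below are hypothetical placeholders, not the paper's pipeline.

```python
# Hedged sketch: condition features follow an independent schedule.
import torch

# Trivial stand-ins so the sketch runs; a real pipeline would replace
# them with UNet feature injection and a proper scheduler step.
def denoise_step(latent, step):
    return 0.98 * latent

def inject_condition(latent, condition):
    return 0.9 * latent + 0.1 * condition

def sample(latent, condition, steps=50,
           condition_schedule=lambda s: s < 15):
    """Inject structure conditions only on the steps the independent
    schedule selects (here: the first 15), decoupled from denoising."""
    for s in range(steps):
        if condition_schedule(s):
            latent = inject_condition(latent, condition)
        latent = denoise_step(latent, s)
    return latent

out = sample(torch.randn(1, 4, 64, 64), torch.randn(1, 4, 64, 64))
```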
arXiv Detail & Related papers (2025-07-03T16:56:15Z)
- Diffusion Dynamics Models with Generative State Estimation for Cloth Manipulation [31.868248649812088]
Cloth manipulation is challenging due to its highly complex dynamics, near-infinite degrees of freedom, and frequent self-occlusions. We propose a diffusion-based generative approach for both perception and dynamics modeling. We show that our framework enables effective cloth folding on real robotic systems.
arXiv Detail & Related papers (2025-03-15T05:34:26Z)
- One-Step Diffusion Model for Image Motion-Deblurring [85.76149042561507]
We propose a one-step diffusion model for deblurring (OSDD), a novel framework that reduces the denoising process to a single step. To tackle fidelity loss in diffusion models, we introduce an enhanced variational autoencoder (eVAE), which improves structural restoration. Our method achieves strong performance on both full-reference and no-reference metrics.
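As a hedged illustration of the one-step idea, the sketch below maps a blurry input to a restored image with a single network evaluation at a fixed noise level, in contrast to iterative sampling; the tiny network and fixed sigma are placeholders, not OSDD's actual components.

```python
# Hedged sketch of one-step diffusion inference for deblurring.
import torch
import torch.nn as nn

class OneStepDeblurrer(nn.Module):
    def __init__(self, feat=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, feat, 3, padding=1), nn.SiLU(),
            nn.Conv2d(feat, feat, 3, padding=1), nn.SiLU(),
            nn.Conv2d(feat, 3, 3, padding=1),
        )

    @torch.no_grad()
    def forward(self, blurry, sigma=0.1):
        # A single evaluation at a fixed noise level: no sampling loop.
        x = blurry + sigma * torch.randn_like(blurry)
        return self.net(x)

restored = OneStepDeblurrer()(torch.rand(1, 3, 64, 64))
```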
arXiv Detail & Related papers (2025-03-09T09:39:57Z)
- HRR: Hierarchical Retrospection Refinement for Generated Image Detection [16.958383381415445]
We propose a diffusion model-based generated image detection framework termed Hierarchical Retrospection Refinement (HRR). The HRR framework consistently delivers significant performance improvements, outperforming state-of-the-art methods in the generated image detection task.
arXiv Detail & Related papers (2025-02-25T05:13:44Z)
- FFHFlow: Diverse and Uncertainty-Aware Dexterous Grasp Generation via Flow Variational Inference [36.02645364048733]
We propose FFHFlow, a flow-based variational framework that generates diverse, robust multi-finger grasps. By exploiting the invertibility and exact likelihoods of flows, FFHFlow introspects shape uncertainty in partial observations. We also integrate a discriminative grasp evaluator with the flow likelihoods, formulating an uncertainty-aware ranking strategy.
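A hedged sketch of such an uncertainty-aware ranking: combine the discriminative evaluator's score with the flow's log-likelihood so that grasps the flow deems unlikely are demoted. Both scoring functions below are dummy stand-ins, not FFHFlow's models.

```python
# Hedged sketch: rank grasps by evaluator score + flow log-likelihood.
import numpy as np

def rank_grasps(grasps, evaluator_score, flow_log_prob, lam=0.5):
    """Return grasps sorted best-first by a combined score."""
    scored = [(evaluator_score(g) + lam * flow_log_prob(g), g) for g in grasps]
    return [g for _, g in sorted(scored, key=lambda p: p[0], reverse=True)]

# Toy usage with random 6-DoF grasp vectors and dummy scorers.
grasps = [np.random.randn(6) for _ in range(4)]
best = rank_grasps(grasps,
                   evaluator_score=lambda g: -np.linalg.norm(g),
                   flow_log_prob=lambda g: -0.5 * float(g @ g))
print(best[0])
```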
arXiv Detail & Related papers (2024-07-21T13:33:08Z)
- SpikeReveal: Unlocking Temporal Sequences from Real Blurry Inputs with Spike Streams [44.02794438687478]
Spike cameras have proven effective at capturing motion features, making them beneficial for solving the ill-posed motion deblurring problem.
Existing methods fall into the supervised learning paradigm, which suffers from notable performance degradation when applied to real-world scenarios.
We propose the first self-supervised framework for the task of spike-guided motion deblurring.
arXiv Detail & Related papers (2024-03-14T15:29:09Z)
- Toward Certified Robustness Against Real-World Distribution Shifts [65.66374339500025]
We train a generative model to learn perturbations from data and define specifications with respect to the output of the learned model.
A unique challenge arising from this setting is that existing verifiers cannot tightly approximate sigmoid activations.
We propose a general meta-algorithm for handling sigmoid activations which leverages classical notions of counter-example-guided abstraction refinement.
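To make the refinement idea concrete, the sketch below splits an input interval until a chord relaxation of the sigmoid is tight, a simplified stand-in for counter-example-guided abstraction refinement in a real verifier; the grid-probed gap is an illustrative check, not a sound bound.

```python
# Hedged sketch: interval splitting to tighten sigmoid relaxations.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def chord_gap(lo, hi):
    """Worst observed gap between sigmoid and the chord through its
    endpoints, probed on a coarse grid (stand-in for a sound bound)."""
    gap = 0.0
    for i in range(11):
        x = lo + (hi - lo) * i / 10
        chord = sigmoid(lo) + (sigmoid(hi) - sigmoid(lo)) * (x - lo) / (hi - lo)
        gap = max(gap, abs(sigmoid(x) - chord))
    return gap

def refine(lo, hi, tol=1e-2):
    """Recursively split [lo, hi] until each piece's relaxation is tight."""
    if chord_gap(lo, hi) <= tol:
        return [(lo, hi)]
    mid = (lo + hi) / 2
    return refine(lo, mid, tol) + refine(mid, hi, tol)

print(refine(-6.0, 6.0))  # pieces where a chord tracks sigmoid closely
```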
arXiv Detail & Related papers (2022-06-08T04:09:13Z)
- Image Reconstruction of Static and Dynamic Scenes through Anisoplanatic Turbulence [1.6114012813668934]
We present a unified method for atmospheric turbulence mitigation in both static and dynamic sequences.
We achieve better results than existing methods by utilizing a novel space-time non-local averaging method.
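A hedged sketch of space-time non-local averaging: each pixel is restored as a similarity-weighted average over patches drawn from a spatial window in the current and adjacent frames. Window sizes and the Gaussian weighting are illustrative choices, not the paper's exact algorithm.

```python
# Hedged sketch of non-local means extended across time.
import numpy as np

def spacetime_nlm(video, t, y, x, patch=3, search=5, h=0.1):
    """Denoise pixel (t, y, x) of `video` (T, H, W) by averaging
    similar patches from a (time, space) search window around it."""
    T, H, W = video.shape
    r = patch // 2
    ref = video[t, y - r:y + r + 1, x - r:x + r + 1]
    num, den = 0.0, 0.0
    for tt in range(max(0, t - 1), min(T, t + 2)):       # adjacent frames
        for yy in range(max(r, y - search), min(H - r, y + search + 1)):
            for xx in range(max(r, x - search), min(W - r, x + search + 1)):
                cand = video[tt, yy - r:yy + r + 1, xx - r:xx + r + 1]
                w = np.exp(-np.mean((ref - cand) ** 2) / (h * h))
                num += w * video[tt, yy, xx]
                den += w
    return num / den

video = np.random.rand(5, 32, 32)
print(spacetime_nlm(video, t=2, y=16, x=16))
```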
arXiv Detail & Related papers (2020-08-31T19:20:46Z)