Toward Generalizable Deblurring: Leveraging Massive Blur Priors with Linear Attention for Real-World Scenarios
- URL: http://arxiv.org/abs/2601.06525v1
- Date: Sat, 10 Jan 2026 11:01:31 GMT
- Title: Toward Generalizable Deblurring: Leveraging Massive Blur Priors with Linear Attention for Real-World Scenarios
- Authors: Yuanting Gao, Shuo Cao, Xiaohui Li, Yuandong Pu, Yihao Liu, Kai Zhang,
- Abstract summary: GLOWDeblur is a Generalizable reaL-wOrld lightWeight Deblur model that combines convolution-based pre-reconstruction & domain alignment module with a lightweight diffusion backbone.<n>We propose Blur Pattern Pretraining (BPP), which acquires blur priors from simulation datasets and transfers them through joint fine-tuning on real data.<n>We further introduce Motion and Semantic Guidance (MoSeG) to strengthen blur priors under severe degradation, and integrate it into GLOWDeblur, a Generalizable reaL-wOrld lightWeight Deblur model that combines convolution-based pre-reconstruction &
- Score: 9.82847623835017
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image deblurring has advanced rapidly with deep learning, yet most methods exhibit poor generalization beyond their training datasets, with performance dropping significantly in real-world scenarios. Our analysis shows this limitation stems from two factors: datasets face an inherent trade-off between realism and coverage of diverse blur patterns, and algorithmic designs remain restrictive, as pixel-wise losses drive models toward local detail recovery while overlooking structural and semantic consistency, whereas diffusion-based approaches, though perceptually strong, still fail to generalize when trained on narrow datasets with simplistic strategies. Through systematic investigation, we identify blur pattern diversity as the decisive factor for robust generalization and propose Blur Pattern Pretraining (BPP), which acquires blur priors from simulation datasets and transfers them through joint fine-tuning on real data. We further introduce Motion and Semantic Guidance (MoSeG) to strengthen blur priors under severe degradation, and integrate it into GLOWDeblur, a Generalizable reaL-wOrld lightWeight Deblur model that combines convolution-based pre-reconstruction & domain alignment module with a lightweight diffusion backbone. Extensive experiments on six widely-used benchmarks and two real-world datasets validate our approach, confirming the importance of blur priors for robust generalization and demonstrating that the lightweight design of GLOWDeblur ensures practicality in real-world applications. The project page is available at https://vegdog007.github.io/GLOWDeblur_Website/.
Related papers
- Scaling Dense Event-Stream Pretraining from Visual Foundation Models [112.44243079477137]
We launch a novel self-supervised pretraining method that distills visual foundation models (VFMs) to push the boundaries of event representation at scale.<n>We curate an extensive synchronized image-event collection to amplify cross-modal alignment.<n>We extend the alignment objective to semantic structures provided off-the-shelf by VFMs, indicating a broader receptive field and stronger supervision.
arXiv Detail & Related papers (2026-03-04T12:06:09Z) - StepVAR: Structure-Texture Guided Pruning for Visual Autoregressive Models [98.72926158261937]
We propose a training-free token pruning framework for Visual AutoRegressive models.<n>We employ a lightweight high-pass filter to capture local texture details, while leveraging Principal Component Analysis (PCA) to preserve global structural information.<n>To maintain valid next-scale prediction under sparse tokens, we introduce a nearest neighbor feature propagation strategy.
arXiv Detail & Related papers (2026-03-02T11:35:05Z) - Keyframe-Based Feed-Forward Visual Odometry [13.646685343885556]
Current foundation model based methods typically process raw image sequences indiscriminately.<n>We propose a novel feed-forward VO method that employs reinforcement learning to derive an adaptive visual policy in a data-driven manner.<n> Experimental results demonstrate that the proposed method achieves consistent and substantial improvements over state-of-the-art feed-forward VO methods.
arXiv Detail & Related papers (2026-01-22T14:45:42Z) - Leveraging Large-Scale Pretrained Spatial-Spectral Priors for General Zero-Shot Pansharpening [7.37033839561317]
Existing deep learning methods for remote sensing image fusion often suffer from poor generalization when applied to unseen datasets.<n>We propose a novel pretraining strategy that leverages large-scale simulated datasets to learn robust spatial-spectral priors.<n>The pretrained models achieve superior results in zero-shot scenarios and show remarkable adaptation capability with minimal real data in one-shot settings.
arXiv Detail & Related papers (2025-12-02T10:56:26Z) - Zero-Reference Joint Low-Light Enhancement and Deblurring via Visual Autoregressive Modeling with VLM-Derived Modulation [18.67176370944511]
Real-world dark images commonly exhibit not only low visibility and contrast but also complex noise and blur, posing significant restoration challenges.<n>We propose a generative framework based on visual autoregressive ( VAR) modeling, guided by perceptual priors from the vision-language model (VLM)<n>Our framework is fully unsupervised and achieves state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2025-11-23T19:08:45Z) - Intern-GS: Vision Model Guided Sparse-View 3D Gaussian Splatting [95.61137026932062]
Intern-GS is a novel approach to enhance the process of sparse-view Gaussian splatting.<n>We show that Intern-GS achieves state-of-the-art rendering quality across diverse datasets.
arXiv Detail & Related papers (2025-05-27T05:17:49Z) - Exploring Generalized Gait Recognition: Reducing Redundancy and Noise within Indoor and Outdoor Datasets [24.242460774158463]
Generalized gait recognition aims to achieve robust performance across diverse domains.<n>Mixed-dataset training is widely used to enhance generalization.<n>We propose a unified framework that systematically improves cross-domain gait recognition.
arXiv Detail & Related papers (2025-05-21T06:46:09Z) - Boosting Zero-shot Stereo Matching using Large-scale Mixed Images Sources in the Real World [8.56549004133167]
Stereo matching methods rely on dense pixel-wise ground truth labels.<n>The scarcity of labeled data and domain gaps between synthetic and real-world images pose notable challenges.<n>We propose a novel framework, textbfBooSTer, that leverages both vision foundation models and large-scale mixed image sources.
arXiv Detail & Related papers (2025-05-13T14:24:38Z) - Low-Light Image Enhancement via Generative Perceptual Priors [75.01646333310073]
We introduce a novel textbfLLIE framework with the guidance of vision-language models (VLMs)<n>We first propose a pipeline that guides VLMs to assess multiple visual attributes of the LL image and quantify the assessment to output the global and local perceptual priors.<n>To incorporate these generative perceptual priors to benefit LLIE, we introduce a transformer-based backbone in the diffusion process, and develop a new layer normalization (textittextbfLPP-Attn) guided by global and local perceptual priors.
arXiv Detail & Related papers (2024-12-30T12:51:52Z) - Generalizing Event-Based Motion Deblurring in Real-World Scenarios [62.995994797897424]
Event-based motion deblurring has shown promising results by exploiting low-latency events.
We propose a scale-aware network that allows flexible input spatial scales and enables learning from different temporal scales of motion blur.
A two-stage self-supervised learning scheme is then developed to fit real-world data distribution.
arXiv Detail & Related papers (2023-08-11T04:27:29Z) - Consistency Regularization for Generalizable Source-free Domain
Adaptation [62.654883736925456]
Source-free domain adaptation (SFDA) aims to adapt a well-trained source model to an unlabelled target domain without accessing the source dataset.
Existing SFDA methods ONLY assess their adapted models on the target training set, neglecting the data from unseen but identically distributed testing sets.
We propose a consistency regularization framework to develop a more generalizable SFDA method.
arXiv Detail & Related papers (2023-08-03T07:45:53Z) - Neural BRDF Representation and Importance Sampling [79.84316447473873]
We present a compact neural network-based representation of reflectance BRDF data.
We encode BRDFs as lightweight networks, and propose a training scheme with adaptive angular sampling.
We evaluate encoding results on isotropic and anisotropic BRDFs from multiple real-world datasets.
arXiv Detail & Related papers (2021-02-11T12:00:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.