Degradation-Modeled Multipath Diffusion for Tunable Metalens Photography
- URL: http://arxiv.org/abs/2506.22753v1
- Date: Sat, 28 Jun 2025 04:48:37 GMT
- Title: Degradation-Modeled Multipath Diffusion for Tunable Metalens Photography
- Authors: Jianing Zhang, Jiayi Zhu, Feiyu Ji, Xiaokang Yang, Xiaoyun Yuan
- Abstract summary: We introduce Degradation-Modeled Multipath Diffusion for tunable metalens photography. We balance high-frequency detail generation, structural fidelity, and suppression of metalens-specific degradation. We build a millimeter-scale MetaCamera for real-world validation.
- Score: 44.164180405913456
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Metalenses offer significant potential for ultra-compact computational imaging but face challenges from complex optical degradation and computational restoration difficulties. Existing methods typically rely on precise optical calibration or massive paired datasets, which are non-trivial for real-world imaging systems. Furthermore, a lack of control over the inference process often results in undesirable hallucinated artifacts. We introduce Degradation-Modeled Multipath Diffusion for tunable metalens photography, leveraging powerful natural image priors from pretrained models instead of large datasets. Our framework uses positive, neutral, and negative-prompt paths to balance high-frequency detail generation, structural fidelity, and suppression of metalens-specific degradation, alongside *pseudo* data augmentation. A tunable decoder enables controlled trade-offs between fidelity and perceptual quality. Additionally, a spatially varying degradation-aware attention (SVDA) module adaptively models complex optical and sensor-induced degradation. Finally, we design and build a millimeter-scale MetaCamera for real-world validation. Extensive results show that our approach outperforms state-of-the-art methods, achieving high-fidelity and sharp image reconstruction. More materials: https://dmdiff.github.io/.
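The abstract does not give the exact combination rule, but a positive/neutral/negative prompt-path scheme resembles classifier-free guidance extended with a negative direction. A minimal numpy sketch under that assumption, where `eps_pos`, `eps_neu`, and `eps_neg` stand in for hypothetical denoiser outputs along each path and the weights are illustrative:

```python
import numpy as np

def multipath_guidance(eps_pos, eps_neu, eps_neg, w_pos=2.0, w_neg=1.0):
    """Combine three denoiser predictions (positive / neutral / negative
    prompt paths) into one guided noise estimate, in the spirit of
    classifier-free guidance. The weights and the combination rule are
    assumptions for illustration, not the paper's exact formulation."""
    # Push toward the positive prompt and away from the negative one,
    # both measured relative to the neutral (unconditional) path.
    return (eps_neu
            + w_pos * (eps_pos - eps_neu)
            - w_neg * (eps_neg - eps_neu))

# Toy usage with random "predictions" of the same shape.
rng = np.random.default_rng(0)
shape = (1, 3, 8, 8)
eps = multipath_guidance(rng.normal(size=shape),
                         rng.normal(size=shape),
                         rng.normal(size=shape))
print(eps.shape)  # (1, 3, 8, 8)
```

With `w_neg=0` this reduces to standard classifier-free guidance; raising `w_neg` pushes the sample away from the negative-prompt direction, which is how such schemes typically suppress degradation-like content.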
Related papers
- LensNet: An End-to-End Learning Framework for Empirical Point Spread Function Modeling and Lensless Imaging Reconstruction [32.85180149439811]
Lensless imaging stands out as a promising alternative to conventional lens-based systems. Traditional lensless techniques often require explicit calibrations and extensive pre-processing. We propose LensNet, an end-to-end deep learning framework that integrates spatial-domain and frequency-domain representations.
arXiv Detail & Related papers (2025-05-03T09:11:52Z) - Towards Realistic Low-Light Image Enhancement via ISP Driven Data Modeling [61.95831392879045]
Deep neural networks (DNNs) have recently become the leading method for low-light image enhancement (LLIE). Despite significant progress, their outputs may still exhibit issues such as amplified noise, incorrect white balance, or unnatural enhancements when deployed in real-world applications. A key challenge is the lack of diverse, large-scale training data that captures the complexities of low-light conditions and imaging pipelines. We propose a novel image signal processing (ISP) driven data synthesis pipeline that addresses these challenges by generating unlimited paired training data.
arXiv Detail & Related papers (2025-04-16T15:53:53Z) - Nonlocal Retinex-Based Variational Model and its Deep Unfolding Twin for Low-Light Image Enhancement [3.174882428337821]
We propose a variational method for low-light image enhancement based on the Retinex decomposition. A color correction pre-processing step is applied to the low-light image, which is then used as the observed input in the decomposition. We extend the model by introducing its deep unfolding counterpart, in which the operators are replaced with learnable networks.
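The Retinex decomposition factors an observed image I into reflectance R and illumination L with I = R ∘ L (element-wise). A crude, non-learned sketch of that idea, taking the illumination as a smoothed channel-wise maximum (an assumption for illustration; the variational model above estimates R and L jointly with priors):

```python
import numpy as np

def retinex_decompose(img, eps=1e-6):
    """Crude Retinex decomposition I = R * L (element-wise).
    Illumination L: channel-wise max, lightly smoothed with a 3x3 box
    filter; reflectance R = I / L. Illustrative only."""
    L = img.max(axis=-1)                       # HxW illumination estimate
    # 3x3 box blur via edge padding + neighborhood mean (no SciPy needed).
    p = np.pad(L, 1, mode="edge")
    L = sum(p[i:i + L.shape[0], j:j + L.shape[1]]
            for i in range(3) for j in range(3)) / 9.0
    R = img / (L[..., None] + eps)             # reflectance
    return R, L

img = np.random.default_rng(1).uniform(0.05, 1.0, size=(16, 16, 3))
R, L = retinex_decompose(img)
print(R.shape, L.shape)  # (16, 16, 3) (16, 16)
```

Enhancement methods in this family then brighten the illumination map (e.g. via a gamma curve) and recompose, so that reflectance, and hence scene content, is preserved.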
arXiv Detail & Related papers (2025-04-10T14:48:26Z) - FoundIR: Unleashing Million-scale Training Data to Advance Foundation Models for Image Restoration [66.61201445650323]
Existing methods suffer from a generalization bottleneck in real-world scenarios.<n>We contribute a million-scale dataset with two notable advantages over existing training data.<n>We propose a robust model, FoundIR, to better address a broader range of restoration tasks in real-world scenarios.
arXiv Detail & Related papers (2024-12-02T12:08:40Z) - AoSRNet: All-in-One Scene Recovery Networks via Multi-knowledge Integration [17.070755601209136]
We propose an all-in-one scene recovery network via multi-knowledge integration (termed AoSRNet).
It combines gamma correction (GC) and optimized linear stretching (OLS) to create the detail enhancement module (DEM) and color restoration module (CRM).
Comprehensive experimental results demonstrate the effectiveness and stability of AoSRNet compared to other state-of-the-art methods.
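Gamma correction and linear stretching are classic point operations; a plain sketch of both, with the gamma value and percentile limits chosen arbitrarily here (AoSRNet's optimized variants are learned, not these fixed forms):

```python
import numpy as np

def gamma_correct(img, gamma=2.2):
    """Brighten dark regions: out = in ** (1/gamma), for img in [0, 1]."""
    return np.clip(img, 0.0, 1.0) ** (1.0 / gamma)

def linear_stretch(img, lo_pct=1.0, hi_pct=99.0):
    """Map the [lo, hi] percentile range to [0, 1] per channel."""
    lo = np.percentile(img, lo_pct, axis=(0, 1), keepdims=True)
    hi = np.percentile(img, hi_pct, axis=(0, 1), keepdims=True)
    return np.clip((img - lo) / np.maximum(hi - lo, 1e-6), 0.0, 1.0)

# Toy usage: a dim random image, brightened then contrast-stretched.
img = np.random.default_rng(2).uniform(0.0, 0.5, size=(8, 8, 3))
out = linear_stretch(gamma_correct(img))
print(out.min() >= 0.0 and out.max() <= 1.0)  # True
```

Gamma correction lifts shadows (detail enhancement), while the percentile stretch expands the usable dynamic range per channel (color/contrast restoration), which is the intuition behind pairing the two.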
arXiv Detail & Related papers (2024-02-06T06:12:03Z) - Improving Lens Flare Removal with General Purpose Pipeline and Multiple Light Sources Recovery [69.71080926778413]
Flare artifacts can affect image visual quality and downstream computer vision tasks.
Current methods do not consider automatic exposure and tone mapping in the image signal processing pipeline.
We propose a solution that improves lens flare removal by revisiting the ISP and designing a more reliable light-source recovery strategy.
arXiv Detail & Related papers (2023-08-31T04:58:17Z) - Neural Invertible Variable-degree Optical Aberrations Correction [6.6855248718044225]
We propose a novel aberration correction method with an invertible architecture by leveraging its information-lossless property.
Within the architecture, we develop conditional invertible blocks to allow the processing of aberrations with variable degrees.
Our method is evaluated on both a synthetic dataset from physics-based imaging simulation and a real captured dataset.
arXiv Detail & Related papers (2023-04-12T01:56:42Z) - DR2: Diffusion-based Robust Degradation Remover for Blind Face Restoration [66.01846902242355]
Blind face restoration usually synthesizes degraded low-quality data with a pre-defined degradation model for training.
It is expensive and infeasible to include every type of degradation to cover real-world cases in the training data.
We propose Robust Degradation Remover (DR2) to first transform the degraded image to a coarse but degradation-invariant prediction, then employ an enhancement module to restore the coarse prediction to a high-quality image.
arXiv Detail & Related papers (2023-03-13T06:05:18Z) - A Trainable Spectral-Spatial Sparse Coding Model for Hyperspectral Image Restoration [36.525810477650026]
Hyperspectral imaging offers new perspectives for diverse applications.
The lack of accurate ground-truth "clean" hyperspectral signals on the spot makes restoration tasks challenging.
In this paper, we advocate for a hybrid approach based on sparse coding principles.
arXiv Detail & Related papers (2021-11-18T14:16:04Z) - DA-DRN: Degradation-Aware Deep Retinex Network for Low-Light Image Enhancement [14.75902042351609]
We propose a Degradation-Aware Deep Retinex Network (denoted as DA-DRN) for low-light image enhancement that tackles such degradation.
Based on Retinex Theory, the decomposition net in our model can decompose low-light images into reflectance and illumination maps.
We conduct extensive experiments demonstrating that our approach achieves promising results with good robustness and generalization.
arXiv Detail & Related papers (2021-10-05T03:53:52Z) - Universal and Flexible Optical Aberration Correction Using Deep-Prior Based Deconvolution [51.274657266928315]
We propose a PSF-aware plug-and-play deep network that takes the aberrant image and PSF map as input and produces the latent high-quality version by incorporating lens-specific deep priors.
Specifically, we pre-train a base model from a set of diverse lenses and then adapt it to a given lens by quickly refining the parameters.
arXiv Detail & Related papers (2021-04-07T12:00:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.