Related papers: DiffPR: Diffusion-Based Phase Reconstruction via Frequency-Decoupled Learning

URL: http://arxiv.org/abs/2506.11183v1
Date: Thu, 12 Jun 2025 17:08:45 GMT
Title: DiffPR: Diffusion-Based Phase Reconstruction via Frequency-Decoupled Learning
Authors: Yi Zhang,
Abstract summary: Oversmoothing remains a persistent problem when applying deep learning to off-axis quantitative phase imaging (QPI). We trace this issue to spectral bias and show that the bias is reinforced by high-level skip connections. We introduce DiffPR, a two-stage frequency-decoupled framework.
Score: 4.560284382063488
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Oversmoothing remains a persistent problem when applying deep learning to off-axis quantitative phase imaging (QPI). End-to-end U-Nets favour low-frequency content and under-represent fine, diagnostic detail. We trace this issue to spectral bias and show that the bias is reinforced by high-level skip connections that feed high-frequency features directly into the decoder. Removing those deepest skips, and thus supervising the network only at a low resolution, significantly improves generalisation and fidelity. Building on this insight, we introduce DiffPR, a two-stage frequency-decoupled framework. Stage 1: an asymmetric U-Net with cancelled high-frequency skips predicts a quarter-scale phase map from the interferogram, capturing reliable low-frequency structure while avoiding spectral bias. Stage 2: the upsampled prediction, lightly perturbed with Gaussian noise, is refined by an unconditional diffusion model that iteratively recovers the missing high-frequency residuals through reverse denoising. Experiments on four QPI datasets (B-Cell, WBC, HeLa, 3T3) show that DiffPR outperforms strong U-Net baselines, boosting PSNR by up to 1.1 dB and reducing MAE by 11 percent, while delivering markedly sharper membrane ridges and speckle patterns. The results demonstrate that cancelling high-level skips and delegating detail synthesis to a diffusion prior is an effective remedy for the spectral bias that limits conventional phase-retrieval networks.
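The hand-off between the two stages and the reported metrics can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-neighbour upsampling, the noise level `sigma`, and all function names are assumptions made for illustration, and both networks (the Stage 1 U-Net and the Stage 2 diffusion model) are left out entirely. It only shows how a quarter-scale map might be brought to full resolution and lightly perturbed before refinement, and how MAE and PSNR are conventionally computed.

```python
import math
import random


def upsample_nearest(img, factor=4):
    """Nearest-neighbour upsampling of a 2-D list by an integer factor.

    Assumption: the abstract does not say which interpolation is used.
    """
    out = []
    for row in img:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def perturb(img, sigma, rng):
    """Add i.i.d. zero-mean Gaussian noise (sigma is a hypothetical setting)."""
    return [[v + rng.gauss(0.0, sigma) for v in row] for row in img]


def mae(a, b):
    """Mean absolute error between two equally sized 2-D lists."""
    diffs = [abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)


def psnr(a, b, data_range):
    """Peak signal-to-noise ratio in dB for a given data range."""
    sq = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    mse = sum(sq) / len(sq)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(data_range ** 2 / mse)


def mse_ratio_for_psnr_gain(gain_db):
    """MSE ratio (new / old) implied by a PSNR gain at a fixed data range."""
    return 10.0 ** (-gain_db / 10.0)


# Toy stand-in for a quarter-scale Stage 1 prediction (values are made up).
coarse = [[0.0, 1.0], [2.0, 3.0]]
full_res = upsample_nearest(coarse, factor=4)            # 8 x 8 map
stage2_input = perturb(full_res, sigma=0.1, rng=random.Random(0))
```

Under these definitions, a PSNR gain of 1.1 dB at a fixed data range corresponds to an MSE ratio of `mse_ratio_for_psnr_gain(1.1)`, roughly 0.776, i.e. about 22 percent lower mean squared error.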

Related papers

Adaptive Spectral Feature Forecasting for Diffusion Sampling Acceleration [58.19554276924402]
We propose spectral diffusion feature forecaster (Spectrum) to enable global, long-range feature reuse with tightly controlled error. We achieve up to 4.79× speedup on FLUX.1 and 4.67× speedup on Wan2.1-14B, while maintaining much higher sample quality compared with the baselines.
arXiv Detail & Related papers (2026-03-02T08:59:11Z)
Stabilizing Diffusion Posterior Sampling by Noise–Frequency Continuation [52.736416985173776]
At high noise, data-consistency gradients computed from inaccurate estimates can be geometrically incongruent with the posterior geometry. We propose a noise–frequency continuation framework that constructs a continuous family of intermediate posteriors whose likelihood enforces measurement consistency only within a noise-dependent frequency band. Our method achieves state-of-the-art performance and improves motion deblurring PSNR by up to 5 dB over strong baselines.
arXiv Detail & Related papers (2026-01-30T03:14:01Z)
NFCDS: A Plug-and-Play Noise Frequency-Controlled Diffusion Sampling Strategy for Image Restoration [20.351955950047348]
Diffusion-based Plug-and-Play methods produce high-quality images but often suffer from reduced data fidelity. We propose Noise Frequency-Controlled Diffusion Sampling (NFCDS), a modulation mechanism for reverse diffusion noise.
arXiv Detail & Related papers (2026-01-29T04:10:45Z)
FRAMER: Frequency-Aligned Self-Distillation with Adaptive Modulation Leveraging Diffusion Priors for Real-World Image Super-Resolution [6.767948729335409]
Real-image super-resolution (Real-ISR) seeks to recover HR images from LR inputs with mixed, unknown degradations. We introduce FRAMER, a plug-and-play training scheme that exploits diffusion priors without changing the backbone or inference.
arXiv Detail & Related papers (2025-12-01T08:09:05Z)
Toward Diffusible High-Dimensional Latent Spaces: A Frequency Perspective [73.86108756585857]
We analyze encoder/decoder behaviors and find that decoders depend strongly on high-frequency latent components to recover details. We introduce FreqWarm, a plug-and-play frequency warm-up curriculum that increases early-stage exposure to high-frequency latent signals.
arXiv Detail & Related papers (2025-11-27T09:20:36Z)
SONAR: Spectral-Contrastive Audio Residuals for Generalizable Deepfake Detection [6.042897432654865]
Spectral-cONtrastive Audio Residuals (SONAR) is a frequency-guided framework for deepfake audio detectors. SONAR disentangles an audio signal into complementary representations. It is evaluated on the ASVspoof 2021 and in-the-wild benchmarks.
arXiv Detail & Related papers (2025-11-26T12:16:38Z)
HDW-SR: High-Frequency Guided Diffusion Model based on Wavelet Decomposition for Image Super-Resolution [4.388490927225987]
We propose a High-Frequency Guided Diffusion Network based on Wavelet Decomposition (HDW-SR). We perform diffusion only on the residual map, allowing the network to focus more effectively on high-frequency information restoration. Experiments on both synthetic and real-world datasets demonstrate that HDW-SR achieves competitive super-resolution performance.
arXiv Detail & Related papers (2025-11-17T09:25:26Z)
A Fourier Space Perspective on Diffusion Models [6.834230686279937]
Diffusion models are state-of-the-art generative models on data modalities such as images, audio, proteins and materials. We study the inductive bias of the forward process of diffusion models in Fourier space.
arXiv Detail & Related papers (2025-05-16T14:13:02Z)
FUSE: Label-Free Image-Event Joint Monocular Depth Estimation via Frequency-Decoupled Alignment and Degradation-Robust Fusion [63.87313550399871]
Image-event joint depth estimation methods leverage complementary modalities for robust perception, yet face challenges in generalizability. We propose Self-supervised Transfer (PST) and a Frequency-Decoupled Fusion module (FreDF). PST establishes cross-modal knowledge transfer through latent space alignment with image foundation models. FreDF explicitly decouples high-frequency edge features from low-frequency structural components, resolving modality-specific frequency mismatches.
arXiv Detail & Related papers (2025-03-25T15:04:53Z)
SING: Semantic Image Communications using Null-Space and INN-Guided Diffusion Models [52.40011613324083]
Joint source-channel coding systems (DeepJSCC) have recently demonstrated remarkable performance in wireless image transmission. Existing methods focus on minimizing distortion between the transmitted image and the reconstructed version at the receiver, often overlooking perceptual quality. We propose SING, a novel framework that formulates the recovery of high-quality images from corrupted reconstructions as an inverse problem.
arXiv Detail & Related papers (2025-03-16T12:32:11Z)
High Frequency Matters: Uncertainty Guided Image Compression with Wavelet Diffusion [4.76749587454871]
We propose an efficient Uncertainty-Guided image compression approach with wavelet Diffusion (UGDiff). Our approach focuses on high frequency compression via the wavelet transform, since high frequency components are crucial for reconstructing image details. Comprehensive experiments on two benchmark datasets validate the effectiveness of UGDiff.
arXiv Detail & Related papers (2024-07-17T13:21:31Z)
UDHF2-Net: Uncertainty-diffusion-model-based High-Frequency TransFormer Network for Remotely Sensed Imagery Interpretation [17.289252835606533]
Uncertainty-diffusion-model-based high-Frequency TransFormer network (UDHF2-Net) is the first to be proposed. UDHF2-Net is a spatially-stationary-and-non-stationary high-frequency connection paradigm (SHCP). Mask-and-geo-knowledge-based uncertainty diffusion module (MUDM) is a self-supervised learning strategy. A frequency-wise semi-pseudo-Siamese UDHF2-Net is the first to be proposed to balance accuracy and complexity for change detection.
arXiv Detail & Related papers (2024-06-23T15:03:35Z)
Frequency-Domain Refinement with Multiscale Diffusion for Super Resolution [7.29314801047906]
We propose a novel Frequency Domain-guided multiscale Diffusion model (FDDiff). FDDiff decomposes the high-frequency information complementing process into finer-grained steps. We show that FDDiff outperforms prior generative methods with higher-fidelity super-resolution results.
arXiv Detail & Related papers (2024-05-16T11:58:52Z)
Denoising Diffusion Models for Plug-and-Play Image Restoration [135.6359475784627]
This paper proposes DiffPIR, which integrates the traditional plug-and-play method into the diffusion sampling framework. Compared to plug-and-play IR methods that rely on discriminative Gaussian denoisers, DiffPIR is expected to inherit the generative ability of diffusion models.
arXiv Detail & Related papers (2023-05-15T20:24:38Z)
DiffusionAD: Norm-guided One-step Denoising Diffusion for Anomaly Detection [80.20339155618612]
DiffusionAD is a novel anomaly detection pipeline comprising a reconstruction sub-network and a segmentation sub-network. A rapid one-step denoising paradigm achieves hundreds of times acceleration while preserving comparable reconstruction quality. Considering the diversity in the manifestation of anomalies, we propose a norm-guided paradigm to integrate the benefits of multiple noise scales.
arXiv Detail & Related papers (2023-03-15T16:14:06Z)
NAF: Neural Attenuation Fields for Sparse-View CBCT Reconstruction [79.13750275141139]
This paper proposes a novel and fast self-supervised solution for sparse-view CBCT reconstruction. The desired attenuation coefficients are represented as a continuous function of 3D spatial coordinates, parameterized by a fully-connected deep neural network. A learning-based encoder entailing hash coding is adopted to help the network capture high-frequency details.
arXiv Detail & Related papers (2022-09-29T04:06:00Z)
Exploring Inter-frequency Guidance of Image for Lightweight Gaussian Denoising [1.52292571922932]
We propose a novel network architecture, denoted IGNet, that refines the frequency bands from low to high in a progressive manner. With this design, more inter-frequency priors and information are utilized, so the model size can be reduced while still preserving competitive results.
arXiv Detail & Related papers (2021-12-22T10:35:53Z)
Hyperspectral Image Super-resolution via Deep Progressive Zero-centric Residual Learning [62.52242684874278]
Cross-modality distribution of spatial and spectral information makes the problem challenging. We propose a novel lightweight deep neural network-based framework, namely PZRes-Net. Our framework learns a high resolution and zero-centric residual image, which contains high-frequency spatial details of the scene.
arXiv Detail & Related papers (2020-06-18T06:32:11Z)

This list is automatically generated from the titles and abstracts of the papers in this site.