FRAMER: Frequency-Aligned Self-Distillation with Adaptive Modulation Leveraging Diffusion Priors for Real-World Image Super-Resolution
- URL: http://arxiv.org/abs/2512.01390v1
- Date: Mon, 01 Dec 2025 08:09:05 GMT
- Title: FRAMER: Frequency-Aligned Self-Distillation with Adaptive Modulation Leveraging Diffusion Priors for Real-World Image Super-Resolution
- Authors: Seungho Choi, Jeahun Sung, Jihyong Oh,
- Abstract summary: Real-image super-resolution (Real-ISR) seeks to recover HR images from LR inputs with mixed, unknown degradations.<n>We introduce FRAMER, a plug-and-play training scheme that exploits diffusion priors without changing the backbone or inference.
- Score: 6.767948729335409
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Real-image super-resolution (Real-ISR) seeks to recover HR images from LR inputs with mixed, unknown degradations. While diffusion models surpass GANs in perceptual quality, they under-reconstruct high-frequency (HF) details due to a low-frequency (LF) bias and a depth-wise "low-first, high-later" hierarchy. We introduce FRAMER, a plug-and-play training scheme that exploits diffusion priors without changing the backbone or inference. At each denoising step, the final-layer feature map teaches all intermediate layers. Teacher and student feature maps are decomposed into LF/HF bands via FFT masks to align supervision with the model's internal frequency hierarchy. For LF, an Intra Contrastive Loss (IntraCL) stabilizes globally shared structure. For HF, an Inter Contrastive Loss (InterCL) sharpens instance-specific details using random-layer and in-batch negatives. Two adaptive modulators, Frequency-based Adaptive Weight (FAW) and Frequency-based Alignment Modulation (FAM), reweight per-layer LF/HF signals and gate distillation by current similarity. Across U-Net and DiT backbones (e.g., Stable Diffusion 2, 3), FRAMER consistently improves PSNR/SSIM and perceptual metrics (LPIPS, NIQE, MANIQA, MUSIQ). Ablations validate the final-layer teacher and random-layer negatives.
Related papers
- Zero-shot Low-Field MRI Enhancement via Diffusion-Based Adaptive Contrast Transport [31.03814753569979]
Low-field (LF) magnetic resonance imaging (MRI) democratizes access to diagnostic imaging.<n>DACT is a novel zero-shot framework that restores HF-quality images without paired supervision.<n>It achieves state-of-the-art performance, yielding reconstructions with superior structural detail and correct tissue contrast.
arXiv Detail & Related papers (2026-03-02T14:27:55Z) - SONAR: Spectral-Contrastive Audio Residuals for Generalizable Deepfake Detection [6.042897432654865]
Spectral-cONtrastive Audio Residuals (AR) is a frequency-guided framework for deepfake audio detectors.<n>AR disentangles an audio signal into complementary representations.<n> evaluated on the ASVspoof 2021 and in-the-wild benchmarks.
arXiv Detail & Related papers (2025-11-26T12:16:38Z) - Rethinking Diffusion Model-Based Video Super-Resolution: Leveraging Dense Guidance from Aligned Features [51.5076190312734]
Video Super-Resolution approaches suffer from error accumulation, spatial artifacts, and a trade-off between perceptual quality and fidelity.<n>We propose a novelly Guided diffusion model with Aligned Features for Video Super-Resolution (DGAF-VSR)<n>Experiments on synthetic and real-world datasets demonstrate that DGAF-VSR surpasses state-of-the-art methods in key aspects of VSR.
arXiv Detail & Related papers (2025-11-21T03:40:45Z) - Frequency Domain-Based Diffusion Model for Unpaired Image Dehazing [92.61216319417208]
We propose a novel frequency domain-based diffusion model, named ours, for fully exploiting the beneficial knowledge in unpaired clear data.<n>Inspired by the strong generative ability shown by Diffusion Models (DMs), we tackle the dehazing task from the perspective of frequency domain reconstruction.
arXiv Detail & Related papers (2025-07-02T01:22:46Z) - FADPNet: Frequency-Aware Dual-Path Network for Face Super-Resolution [70.61549422952193]
Face super-resolution (FSR) under limited computational costs remains an open problem.<n>Existing approaches typically treat all facial pixels equally, resulting in suboptimal allocation of computational resources.<n>We propose FADPNet, a Frequency-Aware Dual-Path Network that decomposes facial features into low- and high-frequency components.
arXiv Detail & Related papers (2025-06-17T02:33:42Z) - DiffPR: Diffusion-Based Phase Reconstruction via Frequency-Decoupled Learning [4.560284382063488]
Oversmoothing remains a persistent problem when applying deep learning to off-axis quantitative phase imaging (QPI)<n>We trace this issue to spectral bias and show that the bias is reinforced by high-level skip connections.<n>We introduce DiffPR, a two-stage frequency-decoupled framework.
arXiv Detail & Related papers (2025-06-12T17:08:45Z) - Freqformer: Image-Demoiréing Transformer via Efficient Frequency Decomposition [83.40450475728792]
We present Freqformer, a Transformer-based framework specifically designed for image demoir'eing through targeted frequency separation.<n>Our method performs an effective frequency decomposition that explicitly splits moir'e patterns into high-frequency spatially-localized textures and low-frequency scale-robust color distortions.<n>Experiments on various demoir'eing benchmarks demonstrate that Freqformer achieves state-of-the-art performance with a compact model size.
arXiv Detail & Related papers (2025-05-25T12:23:10Z) - STAF: Sinusoidal Trainable Activation Functions for Implicit Neural Representation [7.2888019138115245]
Implicit Neural Representations (INRs) have emerged as a powerful framework for modeling continuous signals.<n>The spectral bias of ReLU-based networks is a well-established limitation, restricting their ability to capture fine-grained details in target signals.<n>We introduce Sinusoidal Trainable Functions Activation (STAF)<n>STAF inherently modulates its frequency components, allowing for self-adaptive spectral learning.
arXiv Detail & Related papers (2025-02-02T18:29:33Z) - Few-shot NeRF by Adaptive Rendering Loss Regularization [78.50710219013301]
Novel view synthesis with sparse inputs poses great challenges to Neural Radiance Field (NeRF)
Recent works demonstrate that the frequency regularization of Positional rendering can achieve promising results for few-shot NeRF.
We propose Adaptive Rendering loss regularization for few-shot NeRF, dubbed AR-NeRF.
arXiv Detail & Related papers (2024-10-23T13:05:26Z) - Frequency-aware Feature Fusion for Dense Image Prediction [99.85757278772262]
We propose Frequency-Aware Feature Fusion (FreqFusion) for dense image prediction tasks.
FreqFusion integrates an Adaptive Low-Pass Filter (ALPF) generator, an offset generator, and an Adaptive High-Pass Filter (AHPF) generator.
Comprehensive visualization and quantitative analysis demonstrate that FreqFusion effectively improves feature consistency and sharpens object boundaries.
arXiv Detail & Related papers (2024-08-23T07:30:34Z) - Frequency Compensated Diffusion Model for Real-scene Dehazing [6.105813272271171]
We consider a dehazing framework based on conditional diffusion models for improved generalization to real haze.
The proposed dehazing diffusion model significantly outperforms state-of-the-art methods on real-world images.
arXiv Detail & Related papers (2023-08-21T06:50:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.