Breaking Scale Anchoring: Frequency Representation Learning for Accurate High-Resolution Inference from Low-Resolution Training
- URL: http://arxiv.org/abs/2512.05132v1
- Date: Fri, 28 Nov 2025 09:52:37 GMT
- Title: Breaking Scale Anchoring: Frequency Representation Learning for Accurate High-Resolution Inference from Low-Resolution Training
- Authors: Wenshuo Wang, Fan Zhang,
- Abstract summary: Zero-Shot Super-Resolution Spatiotemporal Forecasting requires a deep learning model to be trained on low-resolution data and deployed for inference on high-resolution.<n>Existing studies consider maintaining similar error across different resolutions as indicative of successful generalization.<n>Deep learning models serving as alternatives to numerical solvers should reduce error as resolution increases.
- Score: 5.24655241578805
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Zero-Shot Super-Resolution Spatiotemporal Forecasting requires a deep learning model to be trained on low-resolution data and deployed for inference on high-resolution. Existing studies consider maintaining similar error across different resolutions as indicative of successful multi-resolution generalization. However, deep learning models serving as alternatives to numerical solvers should reduce error as resolution increases. The fundamental limitation is, the upper bound of physical law frequencies that low-resolution data can represent is constrained by its Nyquist frequency, making it difficult for models to process signals containing unseen frequency components during high-resolution inference. This results in errors being anchored at low resolution, incorrectly interpreted as successful generalization. We define this fundamental phenomenon as a new problem distinct from existing issues: Scale Anchoring. Therefore, we propose architecture-agnostic Frequency Representation Learning. It alleviates Scale Anchoring through resolution-aligned frequency representations and spectral consistency training: on grids with higher Nyquist frequencies, the frequency response in high-frequency bands of FRL-enhanced variants is more stable. This allows errors to decrease with resolution and significantly outperform baselines within our task and resolution range, while incurring only modest computational overhead.
Related papers
- Iterative Inference-time Scaling with Adaptive Frequency Steering for Image Super-Resolution [75.3690742776891]
We propose Iterative Diffusion Inference-Time Scaling with Adaptive Frequency Steering (IAFS)<n>IAFS addresses the challenge of balancing perceptual quality and structural fidelity by progressively refining the generated image through iterative correction of structural deviations.<n>Experiments show that IAFS effectively resolves the perception-fidelity conflict, yielding consistently improved perceptual detail and structural accuracy, and outperforming existing inference-time scaling methods.
arXiv Detail & Related papers (2025-12-29T15:09:20Z) - QuantVSR: Low-Bit Post-Training Quantization for Real-World Video Super-Resolution [53.13952833016505]
We propose a low-bit quantization model for real-world video super-resolution (VSR)<n>We use a calibration dataset to measure both spatial and temporal complexity for each layer.<n>We refine the FP and low-bit branches to achieve simultaneous optimization.
arXiv Detail & Related papers (2025-08-06T14:35:59Z) - FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion [50.43304425256732]
FreeScale is a tuning-free inference paradigm to enable higher-resolution visual generation via scale fusion.<n>We extend the capabilities of higher-resolution visual generation for both image and video models.
arXiv Detail & Related papers (2024-12-12T18:59:59Z) - FAM Diffusion: Frequency and Attention Modulation for High-Resolution Image Generation with Stable Diffusion [63.609399000712905]
Inference at a scaled resolution leads to repetitive patterns and structural distortions.<n>We propose two simple modules that combine to solve these issues.<n>Our method, coined Fam diffusion, can seamlessly integrate into any latent diffusion model and requires no additional training.
arXiv Detail & Related papers (2024-11-27T17:51:44Z) - FreqINR: Frequency Consistency for Implicit Neural Representation with Adaptive DCT Frequency Loss [5.349799154834945]
This paper introduces Frequency Consistency for Implicit Neural Representation (FreqINR), an innovative Arbitrary-scale Super-resolution method.
During training, we employ Adaptive Discrete Cosine Transform Frequency Loss (ADFL) to minimize the frequency gap between HR and ground-truth images.
During inference, we extend the receptive field to preserve spectral coherence between low-resolution (LR) and ground-truth images.
arXiv Detail & Related papers (2024-08-25T03:53:17Z) - Frequency-Domain Refinement with Multiscale Diffusion for Super Resolution [19.327571569959062]
We propose a novel Frequency Domain-guided multiscale Diffusion model (FDDiff)<n>FDDiff decomposes the high-frequency information complementing process into finer-grained steps.<n>It outperforms prior generative methods with higher-fidelity super-resolution results.
arXiv Detail & Related papers (2024-05-16T11:58:52Z) - ACDMSR: Accelerated Conditional Diffusion Models for Single Image
Super-Resolution [84.73658185158222]
We propose a diffusion model-based super-resolution method called ACDMSR.
Our method adapts the standard diffusion model to perform super-resolution through a deterministic iterative denoising process.
Our approach generates more visually realistic counterparts for low-resolution images, emphasizing its effectiveness in practical scenarios.
arXiv Detail & Related papers (2023-07-03T06:49:04Z) - Incremental Spatial and Spectral Learning of Neural Operators for
Solving Large-Scale PDEs [86.35471039808023]
We introduce the Incremental Fourier Neural Operator (iFNO), which progressively increases the number of frequency modes used by the model.
We show that iFNO reduces total training time while maintaining or improving generalization performance across various datasets.
Our method demonstrates a 10% lower testing error, using 20% fewer frequency modes compared to the existing Fourier Neural Operator, while also achieving a 30% faster training.
arXiv Detail & Related papers (2022-11-28T09:57:15Z) - FreqNet: A Frequency-domain Image Super-Resolution Network with Dicrete
Cosine Transform [16.439669339293747]
Single image super-resolution(SISR) is an ill-posed problem that aims to obtain high-resolution (HR) output from low-resolution (LR) input.
Despite the high peak signal-to-noise ratios(PSNR) results, it is difficult to determine whether the model correctly adds desired high-frequency details.
We propose FreqNet, an intuitive pipeline from the frequency domain perspective, to solve this problem.
arXiv Detail & Related papers (2021-11-21T11:49:12Z) - Stochastic Frequency Masking to Improve Super-Resolution and Denoising
Networks [20.23203428843382]
We present an analysis, in the frequency domain, of degradation- Kernel overfitting in super-resolution.
We propose a frequency masking of images used in training to regularize the networks and address the overfitting problem.
Our technique improves state-of-the-art methods on blind super-resolution with different synthetic kernels, real super-resolution, blind Gaussian denoising, and real-image denoising.
arXiv Detail & Related papers (2020-03-16T11:21:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.