Coarse-to-Fine Embedded PatchMatch and Multi-Scale Dynamic Aggregation
for Reference-based Super-Resolution
- URL: http://arxiv.org/abs/2201.04358v1
- Date: Wed, 12 Jan 2022 08:40:23 GMT
- Title: Coarse-to-Fine Embedded PatchMatch and Multi-Scale Dynamic Aggregation
for Reference-based Super-Resolution
- Authors: Bin Xia, Yapeng Tian, Yucheng Hang, Wenming Yang, Qingmin Liao, Jie
Zhou
- Abstract summary: We propose an Accelerated Multi-Scale Aggregation network (AMSA) for Reference-based Super-Resolution.
The proposed AMSA achieves superior performance over state-of-the-art approaches on both quantitative and qualitative evaluations.
- Score: 48.093500219958834
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reference-based super-resolution (RefSR) has made significant progress in
producing realistic textures using an external reference (Ref) image. However,
existing RefSR methods obtain high-quality correspondence matches at a
computational cost that is quadratic in the input size, which limits their
application. Moreover, these approaches usually suffer from scale misalignment
between the low-resolution (LR) image and the Ref image. In this paper, we propose
an Accelerated Multi-Scale Aggregation network (AMSA) for Reference-based
Super-Resolution, including Coarse-to-Fine Embedded PatchMatch (CFE-PatchMatch)
and a Multi-Scale Dynamic Aggregation (MSDA) module. To improve matching
efficiency, we design a novel Embedded PatchMatch scheme with random sample
propagation, which supports end-to-end training and has asymptotically linear
computational cost with respect to the input size. To further reduce computational
cost and speed up convergence, we apply a coarse-to-fine strategy to Embedded
PatchMatch, constituting CFE-PatchMatch. To fully leverage reference information
across multiple scales and enhance robustness to scale misalignment, we develop
the MSDA module consisting of Dynamic Aggregation and Multi-Scale Aggregation.
The Dynamic Aggregation corrects minor scale misalignment by dynamically
aggregating features, and the Multi-Scale Aggregation brings robustness to
large scale misalignment by fusing multi-scale information. Experimental
results show that the proposed AMSA achieves superior performance over
state-of-the-art approaches on both quantitative and qualitative evaluations.
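As a rough illustration of why PatchMatch-style matching avoids the quadratic cost of exhaustive search, the sketch below runs a classic, non-learned PatchMatch over two feature maps: random initialisation, propagation from already-visited neighbours, and a shrinking random search. It is not the authors' Embedded PatchMatch or CFE-PatchMatch; the function name `patchmatch`, its parameters, and the squared-L2 feature distance are assumptions made for the example. Each pass visits every query position once and tests only a handful of candidates, so the work per pass grows roughly linearly with the number of query positions (up to a logarithmic factor from the random search).

```python
import numpy as np

def patchmatch(query, ref, iters=4, seed=0):
    """Approximate nearest-neighbour field from query to ref feature maps.

    query: (H, W, C) array, ref: (Hr, Wr, C) array.
    Returns (nnf, cost): nnf[y, x] is the (row, col) in ref matched to
    query[y, x]; cost[y, x] is the squared L2 distance of that match.
    Illustrative classic PatchMatch, not the paper's embedded variant.
    """
    rng = np.random.default_rng(seed)
    h, w, _ = query.shape
    hr, wr, _ = ref.shape

    def dist(y, x, ry, rx):
        d = query[y, x] - ref[ry, rx]
        return float(d @ d)

    # Random initialisation of the correspondence field.
    nnf = np.stack(
        [rng.integers(0, hr, (h, w)), rng.integers(0, wr, (h, w))], axis=-1
    )
    cost = np.array(
        [[dist(y, x, *nnf[y, x]) for x in range(w)] for y in range(h)]
    )

    def try_candidate(y, x, ry, rx):
        # Accept a candidate only if it is in bounds and strictly better.
        if 0 <= ry < hr and 0 <= rx < wr:
            c = dist(y, x, ry, rx)
            if c < cost[y, x]:
                cost[y, x] = c
                nnf[y, x] = (ry, rx)

    for it in range(iters):
        step = 1 if it % 2 == 0 else -1  # alternate the scan direction
        ys = range(h) if step == 1 else range(h - 1, -1, -1)
        xs = range(w) if step == 1 else range(w - 1, -1, -1)
        for y in ys:
            for x in xs:
                # Propagation: reuse a visited neighbour's match, shifted by one.
                if 0 <= y - step < h:
                    ry, rx = nnf[y - step, x]
                    try_candidate(y, x, ry + step, rx)
                if 0 <= x - step < w:
                    ry, rx = nnf[y, x - step]
                    try_candidate(y, x, ry, rx + step)
                # Random search around the current match, halving the radius.
                radius = max(hr, wr)
                while radius >= 1:
                    ry, rx = nnf[y, x]
                    oy, ox = rng.integers(-radius, radius + 1, 2)
                    try_candidate(y, x, ry + oy, rx + ox)
                    radius //= 2
    return nnf, cost
```

Because a candidate is accepted only when it lowers the cost, the matching error never increases between passes. A coarse-to-fine variant would, under the same assumptions, run this on downsampled feature maps first and use the upscaled result as the initial field at the next resolution instead of a random one.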
Related papers
- Efficient Image Super-Resolution with Multi-Scale Spatial Adaptive Attention Networks [3.4782736103257323]
This paper introduces a lightweight image super-resolution (SR) network, termed the Multi-scale Spatial Adaptive Attention Network (MSAAN).
The core of our approach is a novel Multi-scale Spatial Adaptive Attention Module (MSAA), designed to jointly model fine-grained local details and long-range contextual dependencies.
arXiv Detail & Related papers (2026-02-22T07:47:39Z) - Parallel Diffusion Solver via Residual Dirichlet Policy Optimization [88.7827307535107]
Diffusion models (DMs) have achieved state-of-the-art generative performance but suffer from high sampling latency due to their sequential denoising nature.
Existing solver-based acceleration methods often face significant image quality degradation under a low-dimensional budget.
We propose the Ensemble Parallel Direction solver (dubbed as EPD-EPr), a novel ODE solver that mitigates these errors by incorporating multiple gradient parallel evaluations in each step.
arXiv Detail & Related papers (2025-12-28T05:48:55Z) - Beyond Real Weights: Hypercomplex Representations for Stable Quantization [6.708338010963415]
Multimodal language models (MLLMs) require large parameter capacity to align high-dimensional visual features with linguistic representations.
We introduce a progressive reparameterization strategy that compresses these models by gradually replacing dense feed-forward network blocks.
A residual schedule, together with lightweight reconstruction and knowledge distillation losses, ensures that the PHM modules inherit the functional behavior of their dense counterparts during training.
arXiv Detail & Related papers (2025-12-09T12:10:57Z) - Mixture of Ranks with Degradation-Aware Routing for One-Step Real-World Image Super-Resolution [76.66229730098759]
In real-world image super-resolution (Real-ISR), existing approaches mainly rely on fine-tuning pre-trained diffusion models.
We propose a Mixture-of-Ranks (MoR) architecture for single-step image super-resolution.
We introduce a fine-grained expert partitioning strategy that treats each rank in LoRA as an independent expert.
arXiv Detail & Related papers (2025-11-20T04:11:44Z) - UniMRSeg: Unified Modality-Relax Segmentation via Hierarchical Self-Supervised Compensation [104.59740403500132]
Multi-modal image segmentation faces real-world deployment challenges from incomplete/corrupted modalities degrading performance.
We propose a unified modality-relax segmentation network (UniMRSeg) through hierarchical self-supervised compensation (HSSC).
Our approach hierarchically bridges representation gaps between complete and incomplete modalities across input, feature and output levels.
arXiv Detail & Related papers (2025-09-19T17:29:25Z) - Your Super Resolution Model is not Enough for Tackling Real-World Scenarios [2.101267270902429]
We propose a plug-in Scale-Aware Attention Module (SAAM) designed to retrofit modern fixed-scale SR models with the ability to perform arbitrary-scale SR.
SAAM employs lightweight, scale-adaptive feature extraction and upsampling, incorporating the Simple parameter-free Attention Module (SimAM) for efficient guidance and gradient variance loss.
Our method integrates seamlessly into multiple state-of-the-art SR backbones, delivering competitive or superior performance across a wide range of integer and non-integer scale factors.
arXiv Detail & Related papers (2025-09-08T07:13:58Z) - QuantVSR: Low-Bit Post-Training Quantization for Real-World Video Super-Resolution [53.13952833016505]
We propose a low-bit quantization model for real-world video super-resolution (VSR).
We use a calibration dataset to measure both spatial and temporal complexity for each layer.
We refine the FP and low-bit branches to achieve simultaneous optimization.
arXiv Detail & Related papers (2025-08-06T14:35:59Z) - IM-LUT: Interpolation Mixing Look-Up Tables for Image Super-Resolution [21.982964666527646]
Look-up table (LUT)-based approaches have attracted interest due to their efficiency and performance.
Existing ASISR techniques often employ implicit neural representations, which come with considerable computational cost and memory demands.
We propose Interpolation Mixing LUT (IM-LUT), a novel framework that operates ASISR by learning to blend multiple functions to maximize their capacity.
arXiv Detail & Related papers (2025-07-14T05:02:57Z) - Grid-Reg: Detector-Free Gridized Feature Learning and Matching for Large-Scale SAR-Optical Image Registration [22.80821597640134]
It is highly challenging to register large-scale, heterogeneous SAR and optical images, particularly across platforms.
To overcome these challenges, we propose Grid-Reg, a grid-based multimodal registration framework.
Our proposed approach achieves superior performance over state-of-the-art methods.
arXiv Detail & Related papers (2025-07-06T03:43:18Z) - MISCGrasp: Leveraging Multiple Integrated Scales and Contrastive Learning for Enhanced Volumetric Grasping [15.127239823566194]
MISCGrasp is a volumetric grasping method that integrates multi-scale feature extraction with contrastive feature enhancement for self-adaptive grasping.
We propose a query-based interaction between high-level and low-level features through the Insight Transformer, while the Empower Transformer selectively attends to the highest-level features.
Extensive experiments in both simulated and real-world environments demonstrate that MISCGrasp outperforms baseline and variant methods in tabletop decluttering tasks.
arXiv Detail & Related papers (2025-07-03T14:36:45Z) - Structural Similarity-Inspired Unfolding for Lightweight Image Super-Resolution [88.20464308588889]
We propose a Structural Similarity-Inspired Unfolding (SSIU) method for efficient image SR.
This method is designed through unfolding an SR optimization function constrained by structural similarity.
Our model outperforms current state-of-the-art models, boasting lower parameter counts and reduced memory consumption.
arXiv Detail & Related papers (2025-06-13T14:29:40Z) - RMoA: Optimizing Mixture-of-Agents through Diversity Maximization and Residual Compensation [6.364685086217188]
We propose Residual Mixture-of-Agents (RMoA) to integrate residual connections to optimize efficiency and reliability.
RMoA achieves state-of-the-art performance on benchmarks across alignment, mathematical reasoning, code generation, and multitasking understanding.
arXiv Detail & Related papers (2025-05-30T10:23:11Z) - PassionSR: Post-Training Quantization with Adaptive Scale in One-Step Diffusion based Image Super-Resolution [87.89013794655207]
Diffusion-based image super-resolution (SR) models have shown superior performance at the cost of multiple denoising steps.
We propose a novel post-training quantization approach with adaptive scale in one-step diffusion (OSD) image SR, PassionSR.
Our PassionSR achieves significant advantages over recent leading low-bit quantization methods for image SR.
arXiv Detail & Related papers (2024-11-26T04:49:42Z) - Adaptive Multi-Scale Decomposition Framework for Time Series Forecasting [26.141054975797868]
We propose a novel Adaptive Multi-Scale Decomposition (AMD) framework for time series forecasting (TSF).
Our framework decomposes time series into distinct temporal patterns at multiple scales, leveraging the Multi-Scale Decomposable Mixing (MDM) block.
Our approach effectively models both temporal and channel dependencies and utilizes autocorrelation to refine multi-scale data integration.
arXiv Detail & Related papers (2024-06-06T05:27:33Z) - Cross-Domain Knowledge Distillation for Low-Resolution Human Pose Estimation [31.970739018426645]
In practical applications of human pose estimation, low-resolution inputs frequently occur, and existing state-of-the-art models perform poorly with low-resolution images.
This work focuses on boosting the performance of low-resolution models by distilling knowledge from a high-resolution model.
arXiv Detail & Related papers (2024-05-19T04:57:17Z) - Transforming Image Super-Resolution: A ConvFormer-based Efficient Approach [58.57026686186709]
We introduce the Convolutional Transformer layer (ConvFormer) and propose a ConvFormer-based Super-Resolution network (CFSR).
CFSR inherits the advantages of both convolution-based and transformer-based approaches.
Experiments demonstrate that CFSR strikes an optimal balance between computational cost and performance.
arXiv Detail & Related papers (2024-01-11T03:08:00Z) - Can SAM Boost Video Super-Resolution? [78.29033914169025]
We propose a simple yet effective module, the SAM-guidEd refinEment Module (SEEM).
This lightweight plug-in module is specifically designed to leverage the attention mechanism for the generation of semantic-aware features.
We apply our SEEM to two representative methods, EDVR and BasicVSR, resulting in consistently improved performance with minimal implementation effort.
arXiv Detail & Related papers (2023-05-11T02:02:53Z) - A Unifying Multi-sampling-ratio CS-MRI Framework With Two-grid-cycle Correction and Geometric Prior Distillation [7.643154460109723]
We propose a unifying deep unfolding multi-sampling-ratio CS-MRI framework that merges the advantages of model-based and deep learning-based methods.
Inspired by the multigrid algorithm, we first embed the CS-MRI-based optimization algorithm into a correction-distillation scheme.
We employ a condition module to adaptively learn the step length and noise level from the compressive sampling ratio in every stage.
arXiv Detail & Related papers (2022-05-14T13:36:27Z) - Modal-Adaptive Gated Recoding Network for RGB-D Salient Object Detection [2.9153096940947796]
We propose a novel gated recoding network (GRNet) to evaluate the information validity of the two modes.
A perception encoder is adopted to extract multi-level single-modal features.
A modal-adaptive gate unit is proposed to suppress the invalid information and transfer the effective modal features to the recoding mixer and the hybrid branch decoder.
arXiv Detail & Related papers (2021-08-13T15:08:21Z) - Reinforcement Learning for Adaptive Mesh Refinement [63.7867809197671]
We propose a novel formulation of AMR as a Markov decision process and apply deep reinforcement learning to train refinement policies directly from simulation.
The model sizes of these policy architectures are independent of the mesh size and hence scale to arbitrarily large and complex simulations.
arXiv Detail & Related papers (2021-03-01T22:55:48Z) - Crowd Counting via Hierarchical Scale Recalibration Network [61.09833400167511]
We propose a novel Hierarchical Scale Recalibration Network (HSRNet) to tackle the task of crowd counting.
HSRNet models rich contextual dependencies and recalibrates multiple scale-associated information.
Our approach can ignore various noises selectively and focus on appropriate crowd scales automatically.
arXiv Detail & Related papers (2020-03-07T10:06:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.