Exploring Multi-Scale Feature Propagation and Communication for Image
Super Resolution
- URL: http://arxiv.org/abs/2008.00239v2
- Date: Fri, 14 Aug 2020 08:03:17 GMT
- Title: Exploring Multi-Scale Feature Propagation and Communication for Image
Super Resolution
- Authors: Ruicheng Feng, Weipeng Guan, Yu Qiao, Chao Dong
- Abstract summary: We present a unified formulation over widely-used multi-scale structures.
We propose a generic and efficient multi-scale convolution unit -- Multi-Scale cross-Scale Share-weights convolution (MS$^3$-Conv)
- Score: 37.91175933401261
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-scale techniques have achieved great success in a wide range of
computer vision tasks. However, although these techniques are incorporated in
existing works, a comprehensive investigation of multi-scale convolution
variants in image super resolution is still lacking. In this work, we present a
unified formulation over widely-used multi-scale structures. With this
framework, we systematically explore the two factors of multi-scale convolution
-- feature propagation and cross-scale communication. Based on this
investigation, we propose a generic and efficient multi-scale convolution unit
-- Multi-Scale cross-Scale Share-weights convolution (MS$^3$-Conv). Extensive
experiments demonstrate that the proposed MS$^3$-Conv achieves better SR
performance than the standard convolution with fewer parameters and lower
computational cost. Beyond quantitative analysis, we comprehensively study the
visual quality, which shows that MS$^3$-Conv recovers high-frequency details
better.
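The abstract gives no implementation details, so the following is only a
minimal PyTorch sketch of the general idea: a convolution unit that propagates
features at several scales with one shared set of weights and lets the scales
communicate before fusion. All names (MultiScaleSharedConv, n_scales, fuse) and
design choices (bilinear resampling, 1x1 fusion, residual connection) are
illustrative assumptions, not the authors' released MS$^3$-Conv.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiScaleSharedConv(nn.Module):
    """Hypothetical multi-scale unit: one shared 3x3 conv applied at several
    resolutions (feature propagation), followed by upsampling and 1x1 fusion
    (cross-scale communication)."""

    def __init__(self, channels: int, n_scales: int = 3):
        super().__init__()
        self.n_scales = n_scales
        # Single convolution whose weights are shared by every scale branch.
        self.shared_conv = nn.Conv2d(channels, channels, 3, padding=1)
        # 1x1 convolution that fuses the concatenated multi-scale features.
        self.fuse = nn.Conv2d(channels * n_scales, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, w = x.shape[-2:]
        branches = []
        for s in range(self.n_scales):
            # Downsample to 1/2^s resolution, then apply the shared conv.
            feat = x if s == 0 else F.interpolate(
                x, scale_factor=1 / 2 ** s, mode="bilinear", align_corners=False)
            feat = F.relu(self.shared_conv(feat))
            # Bring every branch back to full resolution so they can be fused.
            if s > 0:
                feat = F.interpolate(feat, size=(h, w), mode="bilinear",
                                     align_corners=False)
            branches.append(feat)
        # Concatenate, fuse, and keep a residual path to ease propagation.
        return x + self.fuse(torch.cat(branches, dim=1))


if __name__ == "__main__":
    block = MultiScaleSharedConv(channels=64)
    out = block(torch.randn(1, 64, 48, 48))
    print(out.shape)  # torch.Size([1, 64, 48, 48])
```

In an SR backbone such a unit would stand in for a standard 3x3 convolution
block; sharing the weights across scales keeps the parameter count close to
that of a single-scale convolution.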
Related papers
- MSVM-UNet: Multi-Scale Vision Mamba UNet for Medical Image Segmentation [3.64388407705261]
We propose a Multi-Scale Vision Mamba UNet model for medical image segmentation, termed MSVM-UNet.
Specifically, by introducing multi-scale convolutions in the VSS blocks, we can more effectively capture and aggregate multi-scale feature representations from the hierarchical features of the VMamba encoder.
arXiv Detail & Related papers (2024-08-25T06:20:28Z)
- Implicit Grid Convolution for Multi-Scale Image Super-Resolution [6.8410780175245165]
We propose a multi-scale framework that employs a single encoder in conjunction with Implicit Grid Convolution (IGConv)
Our framework achieves comparable performance to existing fixed-scale methods while reducing the training budget and stored parameters three-fold.
arXiv Detail & Related papers (2024-08-19T03:30:15Z)
- Multi-Scale Implicit Transformer with Re-parameterize for
Arbitrary-Scale Super-Resolution [2.4865475189445405]
We propose the Multi-Scale Implicit Transformer (MSIT), which consists of a
Multi-scale Neural Operator (MSNO) and Multi-Scale Self-Attention (MSSA)
arXiv Detail & Related papers (2024-03-11T09:23:20Z)
- Joint Depth Prediction and Semantic Segmentation with Multi-View SAM [59.99496827912684]
We propose a Multi-View Stereo (MVS) technique for depth prediction that benefits from rich semantic features of the Segment Anything Model (SAM)
This enhanced depth prediction, in turn, serves as a prompt to our Transformer-based semantic segmentation decoder.
arXiv Detail & Related papers (2023-10-31T20:15:40Z)
- Exploiting Modality-Specific Features For Multi-Modal Manipulation
Detection And Grounding [54.49214267905562]
We construct a transformer-based framework for multi-modal manipulation detection and grounding tasks.
Our framework simultaneously explores modality-specific features while preserving the capability for multi-modal alignment.
We propose an implicit manipulation query (IMQ) that adaptively aggregates global contextual cues within each modality.
arXiv Detail & Related papers (2023-09-22T06:55:41Z)
- General-Purpose Multimodal Transformer meets Remote Sensing Semantic
Segmentation [35.100738362291416]
Multimodal AI seeks to exploit complementary data sources, particularly for complex tasks like semantic segmentation.
Recent trends in general-purpose multimodal networks have shown great potential to achieve state-of-the-art performance.
We propose a UNet-inspired module that employs 3D convolution to encode vital local information and learn cross-modal features simultaneously.
arXiv Detail & Related papers (2023-07-07T04:58:34Z)
- Deep Diversity-Enhanced Feature Representation of Hyperspectral Images [87.47202258194719]
We rectify 3D convolution by modifying its topology to enhance the rank upper-bound.
We also propose a novel diversity-aware regularization (DA-Reg) term that acts on the feature maps to maximize independence among elements (a hedged sketch of such a penalty appears after this list).
To demonstrate the superiority of the proposed Re$^3$-ConvSet and DA-Reg, we apply them to various HS image processing and analysis tasks.
arXiv Detail & Related papers (2023-01-15T16:19:18Z)
- Rich CNN-Transformer Feature Aggregation Networks for Super-Resolution [50.10987776141901]
Recent vision transformers along with self-attention have achieved promising results on various computer vision tasks.
We introduce an effective hybrid architecture for super-resolution (SR) tasks, which leverages local features from CNNs and long-range dependencies captured by transformers.
Our proposed method achieves state-of-the-art SR results on numerous benchmark datasets.
arXiv Detail & Related papers (2022-03-15T06:52:25Z)
- Cross-MPI: Cross-scale Stereo for Image Super-Resolution using
Multiplane Images [44.85260985973405]
Cross-MPI is an end-to-end RefSR network composed of a novel plane-aware MPI mechanism, a multiscale guided upsampling module and a super-resolution synthesis and fusion module.
Experimental results on both digitally synthesized and optical zoom cross-scale data show that the Cross-MPI framework can achieve superior performance against the existing RefSR methods.
arXiv Detail & Related papers (2020-11-30T09:14:07Z)
- Sequential Hierarchical Learning with Distribution Transformation for
Image Super-Resolution [83.70890515772456]
We build a sequential hierarchical learning super-resolution network (SHSR) for effective image SR.
We consider the inter-scale correlations of features, and devise a sequential multi-scale block (SMB) to progressively explore the hierarchical information.
Experimental results show that SHSR achieves superior quantitative performance and visual quality compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-07-19T01:35:53Z)
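The DA-Reg entry above only states that the regularizer acts on feature maps
to maximize independence among elements. Below is a hedged sketch of one way
such a diversity-promoting penalty could look: an off-diagonal
channel-correlation penalty. The function name diversity_regularizer and the
exact formulation are assumptions made for illustration, not the paper's
definition of DA-Reg.

```python
import torch
import torch.nn.functional as F


def diversity_regularizer(feats: torch.Tensor) -> torch.Tensor:
    """Penalize correlation between channels of a (B, C, H, W) feature map.

    Hypothetical stand-in for a diversity-promoting term: smaller values mean
    the channels are closer to being mutually decorrelated.
    """
    b, c, h, w = feats.shape
    # Flatten spatial dimensions and zero-center each channel.
    f = feats.reshape(b, c, h * w)
    f = f - f.mean(dim=-1, keepdim=True)
    # Normalized channel-by-channel correlation (Gram) matrix.
    f = F.normalize(f, dim=-1)
    gram = torch.bmm(f, f.transpose(1, 2))  # (B, C, C), entries in [-1, 1]
    # Keep only off-diagonal entries and penalize their magnitude.
    off_diag = gram - torch.diag_embed(torch.diagonal(gram, dim1=1, dim2=2))
    return off_diag.abs().mean()
```

Such a term would typically be added to the task loss with a small weight,
e.g. loss = reconstruction_loss + lambda * diversity_regularizer(feats).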
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.