DSConv: Dynamic Splitting Convolution for Pansharpening
- URL: http://arxiv.org/abs/2508.06147v2
- Date: Fri, 15 Aug 2025 07:51:11 GMT
- Title: DSConv: Dynamic Splitting Convolution for Pansharpening
- Authors: Xuanyu Liu, Bonan An,
- Abstract summary: We propose a novel strategy for dynamically splitting convolution kernels in conjunction with attention, selecting positions of interest, and splitting the original convolution kernel into multiple smaller kernels, named DSConv. The proposed DSConv more effectively extracts features of different positions within the receptive field, enhancing the network's generalization, optimization, and feature representation capabilities.
- Score: 1.2882440720152197
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pansharpening aims to obtain a high-resolution image by fusing a multi-spectral (MS) image with a panchromatic (PAN) image, and it remains a significant and challenging low-level vision task in contemporary research. Most existing approaches rely predominantly on standard convolutions; few explore adaptive convolutions, which are effective owing to the inter-pixel correlations of remote sensing images. In this paper, we propose a novel strategy, named DSConv, that dynamically splits convolution kernels in conjunction with attention: it selects positions of interest and splits the original convolution kernel into multiple smaller kernels. The proposed DSConv more effectively extracts features from different positions within the receptive field, enhancing the network's generalization, optimization, and feature representation capabilities. Furthermore, we enrich the concept of dynamic splitting convolution and, building upon this methodology, provide a novel network architecture that performs pansharpening more efficiently. Adequate and fair experiments illustrate the effectiveness and state-of-the-art performance of DSConv. Comprehensive and rigorous discussions demonstrate the superiority and optimal usage conditions of DSConv.
Related papers
- PeRL: Permutation-Enhanced Reinforcement Learning for Interleaved Vision-Language Reasoning [50.21619363035618]
We propose a general reinforcement learning approach, PeRL, tailored for interleaved multimodal tasks. We introduce permutation of image sequences to simulate varied positional relationships and explore more spatial and positional diversity. Our experiments confirm that the PeRL-trained model consistently surpasses R1-related and interleaved VLM baselines by a large margin.
arXiv Detail & Related papers (2025-06-17T18:25:56Z)
- Multi-Head Attention Driven Dynamic Visual-Semantic Embedding for Enhanced Image-Text Matching [0.8611782340880084]
This study proposes an innovative visual semantic embedding model, Multi-Headed Consensus-Aware Visual-Semantic Embedding (MH-CVSE). This model introduces a multi-head self-attention mechanism based on the consensus-aware visual semantic embedding model (CVSE) to capture information in multiple subspaces in parallel. In terms of loss function design, the MH-CVSE model adopts a dynamic weight adjustment strategy that adjusts the weight according to the loss value itself.
arXiv Detail & Related papers (2024-12-26T11:46:22Z)
- MMDRFuse: Distilled Mini-Model with Dynamic Refresh for Multi-Modality Image Fusion [32.38584862347954]
A lightweight Distilled Mini-Model with a Dynamic Refresh strategy (MMDRFuse) is proposed to achieve this objective.
To pursue model parsimony, an extremely small convolutional network with a total of 113 trainable parameters (0.44 KB) is obtained.
Experiments on several public datasets demonstrate that our method exhibits promising advantages in terms of model efficiency and complexity.
arXiv Detail & Related papers (2024-08-28T08:52:33Z)
- Learning Image Deraining Transformer Network with Dynamic Dual Self-Attention [46.11162082219387]
This paper proposes an effective image deraining Transformer with dynamic dual self-attention (DDSA).
Specifically, we only select the most useful similarity values based on top-k approximate calculation to achieve sparse attention.
In addition, we also develop a novel spatial-enhanced feed-forward network (SEFN) to further obtain a more accurate representation for achieving high-quality derained results.
arXiv Detail & Related papers (2023-08-15T13:59:47Z)
- Rank-Enhanced Low-Dimensional Convolution Set for Hyperspectral Image Denoising [50.039949798156826]
This paper tackles the challenging problem of hyperspectral (HS) image denoising.
We propose a rank-enhanced low-dimensional convolution set (Re-ConvSet).
We then incorporate Re-ConvSet into the widely-used U-Net architecture to construct an HS image denoising method.
arXiv Detail & Related papers (2022-07-09T13:35:12Z)
- High-Quality Pluralistic Image Completion via Code Shared VQGAN [51.7805154545948]
We present a novel framework for pluralistic image completion that can achieve both high quality and diversity at much faster inference speed.
Our framework is able to learn semantically-rich discrete codes efficiently and robustly, resulting in much better image reconstruction quality.
arXiv Detail & Related papers (2022-04-05T01:47:35Z)
- Image-specific Convolutional Kernel Modulation for Single Image Super-resolution [85.09413241502209]
In this issue, we propose a novel image-specific convolutional modulation kernel (IKM).
We exploit the global contextual information of the image or its features to generate an attention weight for adaptively modulating the convolutional kernels.
Experiments on single image super-resolution show that the proposed methods achieve superior performance over state-of-the-art methods.
arXiv Detail & Related papers (2021-11-16T11:05:10Z)
- DWDN: Deep Wiener Deconvolution Network for Non-Blind Image Deblurring [66.91879314310842]
We propose an explicit deconvolution process in a feature space by integrating a classical Wiener deconvolution framework with learned deep features.
A multi-scale cascaded feature refinement module then predicts the deblurred image from the deconvolved deep features.
We show that the proposed deep Wiener deconvolution network facilitates deblurred results with visibly fewer artifacts and quantitatively outperforms state-of-the-art non-blind image deblurring methods by a wide margin.
arXiv Detail & Related papers (2021-03-18T00:38:11Z)
- Learning Enriched Features for Real Image Restoration and Enhancement [166.17296369600774]
Convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for image restoration tasks.
We present a novel architecture with the collective goals of maintaining spatially-precise high-resolution representations through the entire network.
Our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
arXiv Detail & Related papers (2020-03-15T11:04:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.