Enhancing Feature Fusion of U-like Networks with Dynamic Skip Connections
- URL: http://arxiv.org/abs/2509.14610v4
- Date: Mon, 27 Oct 2025 02:31:50 GMT
- Title: Enhancing Feature Fusion of U-like Networks with Dynamic Skip Connections
- Authors: Yue Cao, Quansong He, Kaishen Wang, Jianlong Xiong, Zhang Yi, Tao He
- Abstract summary: U-like networks have become fundamental frameworks in medical image segmentation through skip connections. Traditional skip connections exhibit two key limitations: inter-feature constraints and intra-feature constraints. We propose a novel Dynamic Skip Connection (DSC) block that fundamentally enhances cross-layer connectivity through adaptive mechanisms.
- Score: 21.996051497981103
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: U-like networks have become fundamental frameworks in medical image segmentation through skip connections that bridge high-level semantics and low-level spatial details. Despite their success, conventional skip connections exhibit two key limitations: inter-feature constraints and intra-feature constraints. The inter-feature constraint refers to the static nature of feature fusion in traditional skip connections, where information is transmitted along fixed pathways regardless of feature content. The intra-feature constraint arises from the insufficient modeling of multi-scale feature interactions, thereby hindering the effective aggregation of global contextual information. To overcome these limitations, we propose a novel Dynamic Skip Connection (DSC) block that fundamentally enhances cross-layer connectivity through adaptive mechanisms. The DSC block integrates two complementary components. (1) Test-Time Training (TTT) module. This module addresses the inter-feature constraint by enabling dynamic adaptation of hidden representations during inference, facilitating content-aware feature refinement. (2) Dynamic Multi-Scale Kernel (DMSK) module. To mitigate the intra-feature constraint, this module adaptively selects kernel sizes based on global contextual cues, enhancing the network capacity for multi-scale feature integration. The DSC block is architecture-agnostic and can be seamlessly incorporated into existing U-like network structures. Extensive experiments demonstrate the plug-and-play effectiveness of the proposed DSC block across CNN-based, Transformer-based, hybrid CNN-Transformer, and Mamba-based U-like networks.
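As a rough illustration of the DMSK idea described in the abstract (weighting several kernel sizes according to a global context cue), the following is a minimal pure-Python sketch on a 1-D signal. It is not the authors' implementation: the box filters, the scalar gate parameters `gate_w` and `gate_b`, and all function names are assumptions made for illustration, and a softmax weighting stands in for whatever selection rule the paper actually uses.

```python
import math


def box_filter(signal, k):
    """Same-length moving average with odd window k; edges are clamped."""
    r = k // 2
    n = len(signal)
    out = []
    for i in range(n):
        window = [signal[min(max(j, 0), n - 1)] for j in range(i - r, i + r + 1)]
        out.append(sum(window) / k)
    return out


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_multiscale_fuse(signal, kernel_sizes=(3, 5, 7),
                            gate_w=(1.0, 0.0, -1.0), gate_b=(0.0, 0.0, 0.0)):
    """Fuse multi-scale branches with weights derived from global context.

    gate_w and gate_b are hypothetical gate parameters (one pair per branch);
    in a trained network they would be learned rather than fixed.
    """
    # Global context cue: average pooling over the whole signal.
    context = sum(signal) / len(signal)
    # One score per kernel size, turned into weights that sum to 1.
    weights = softmax([w * context + b for w, b in zip(gate_w, gate_b)])
    # One branch per kernel size, each the same length as the input.
    branches = [box_filter(signal, k) for k in kernel_sizes]
    # Content-dependent weighted sum of the branches.
    fused = [sum(w * br[i] for w, br in zip(weights, branches))
             for i in range(len(signal))]
    return fused, weights


if __name__ == "__main__":
    fused, weights = dynamic_multiscale_fuse([0.0, 1.0, 4.0, 1.0, 0.0, 2.0])
    print("branch weights:", weights)
    print("fused signal:", fused)
```

With the placeholder gate values above, a signal with a larger mean shifts weight toward the smallest kernel and a signal with a negative mean shifts it toward the largest, which is the kind of content-dependent behaviour the abstract attributes to DMSK.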
Related papers
- Multi-label Classification with Panoptic Context Aggregation Networks [61.82285737410154]
This paper introduces the Deep Panoptic Context Aggregation Network (PanCAN), a novel approach that hierarchically integrates multi-order geometric contexts. PanCAN learns multi-order neighborhood relationships at each scale by combining random walks with an attention mechanism. Experiments on NUS-WIDE, PASCAL VOC 2007, and MS-COCO benchmarks demonstrate that PanCAN consistently achieves competitive results.
arXiv Detail & Related papers (2025-12-29T14:16:21Z)
- U-MAN: U-Net with Multi-scale Adaptive KAN Network for Medical Image Segmentation [0.6429972675128933]
Multi-scale Adaptive KAN (U-MAN) is a novel architecture that enhances the emerging Kolmogorov-Arnold Network (KAN). Our PAGF module replaces the simple skip connection, using attention to fuse features from the encoder and decoder. The MAN module enables the network to adaptively process features at multiple scales, improving its ability to segment objects of various sizes.
arXiv Detail & Related papers (2025-09-26T15:02:13Z)
- SCRNet: Spatial-Channel Regulation Network for Medical Ultrasound Image Segmentation [1.4930126157970809]
CNN-based methods tend to disregard long-range dependencies, while Transformer-based methods may overlook local contextual information. We propose a novel Feature Aggregation Module (FAM) designed to process two input features from the preceding layer. This strategy enables our module to focus concurrently on both long-range dependencies and local contextual information.
arXiv Detail & Related papers (2025-08-19T15:02:27Z)
- FuseUNet: A Multi-Scale Feature Fusion Method for U-like Networks [6.076351456098043]
We propose a novel multi-scale feature fusion method that reimagines the UNet decoding process as solving an initial value problem. Experiments on ACDC, KiTS2023, MSD brain tumor, and ISIC skin lesion segmentation datasets demonstrate improved feature utilization, reduced network parameters, and maintained high performance.
arXiv Detail & Related papers (2025-06-06T07:34:06Z)
- Selective Complementary Feature Fusion and Modal Feature Compression Interaction for Brain Tumor Segmentation [14.457627015612827]
We propose a complementary feature compression interaction network (CFCI-Net), which realizes the complementary fusion and compression interaction of multi-modal feature information. CFCI-Net achieves superior results compared to state-of-the-art models.
arXiv Detail & Related papers (2025-03-20T13:52:51Z)
- Dynamic Cross-Modal Feature Interaction Network for Hyperspectral and LiDAR Data Classification [66.59320112015556]
Hyperspectral image (HSI) and LiDAR data joint classification is a challenging task. We propose a novel Dynamic Cross-Modal Feature Interaction Network (DCMNet). Our approach introduces three feature interaction blocks: Bilinear Spatial Attention Block (BSAB), Bilinear Channel Attention Block (BCAB), and Integration Convolutional Block (ICB).
arXiv Detail & Related papers (2025-03-10T05:50:13Z)
- MIETT: Multi-Instance Encrypted Traffic Transformer for Encrypted Traffic Classification [59.96233305733875]
Classifying traffic is essential for detecting security threats and optimizing network management. We propose a Multi-Instance Encrypted Traffic Transformer (MIETT) to capture both token-level and packet-level relationships. MIETT achieves results across five datasets, demonstrating its effectiveness in classifying encrypted traffic and understanding complex network behaviors.
arXiv Detail & Related papers (2024-12-19T12:52:53Z)
- Dual Aggregation Transformer for Image Super-Resolution [92.41781921611646]
We propose a novel Transformer model, Dual Aggregation Transformer, for image SR.
Our DAT aggregates features across spatial and channel dimensions, in the inter-block and intra-block dual manner.
Our experiments show that our DAT surpasses current methods.
arXiv Detail & Related papers (2023-08-07T07:39:39Z)
- RGB-D Salient Object Detection with Cross-Modality Modulation and Selection [126.4462739820643]
We present an effective method to progressively integrate and refine the cross-modality complementarities for RGB-D salient object detection (SOD).
The proposed network mainly solves two challenging issues: 1) how to effectively integrate the complementary information from RGB image and its corresponding depth map, and 2) how to adaptively select more saliency-related features.
arXiv Detail & Related papers (2020-07-14T14:22:50Z)
- AlignSeg: Feature-Aligned Segmentation Networks [109.94809725745499]
We propose Feature-Aligned Networks (AlignSeg) to address misalignment issues during the feature aggregation process.
Our network achieves new state-of-the-art mIoU scores of 82.6% and 45.95%, respectively.
arXiv Detail & Related papers (2020-02-24T10:00:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.