A Cascaded Information Interaction Network for Precise Image Segmentation
- URL: http://arxiv.org/abs/2601.00562v1
- Date: Fri, 02 Jan 2026 04:33:03 GMT
- Title: A Cascaded Information Interaction Network for Precise Image Segmentation
- Authors: Hewen Xiao, Jie Mei, Guangfu Ma, Weiren Wu,
- Abstract summary: This paper proposes a cascaded convolutional neural network integrated with a novel Global Information Guidance Module.<n>This module is designed to effectively fuse low-level texture details with high-level semantic features across multiple layers.<n>This architectural innovation significantly enhances segmentation accuracy, particularly in visually cluttered or blurred environments.
- Score: 3.911594181651384
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual perception plays a pivotal role in enabling autonomous behavior, offering a cost-effective and efficient alternative to complex multi-sensor systems. However, robust segmentation remains a challenge in complex scenarios. To address this, this paper proposes a cascaded convolutional neural network integrated with a novel Global Information Guidance Module. This module is designed to effectively fuse low-level texture details with high-level semantic features across multiple layers, thereby overcoming the inherent limitations of single-scale feature extraction. This architectural innovation significantly enhances segmentation accuracy, particularly in visually cluttered or blurred environments where traditional methods often fail. Experimental evaluations on benchmark image segmentation datasets demonstrate that the proposed framework achieves superior precision, outperforming existing state-of-the-art methods. The results highlight the effectiveness of the approach and its promising potential for deployment in practical robotic applications.
Related papers
- RS-ISRefiner: Towards Better Adapting Vision Foundation Models for Interactive Segmentation of Remote Sensing Images [17.648922817109224]
RS-ISRefiner is a novel click-based IIS framework tailored for remote sensing images.<n>It consistently outperforms state-of-the-art IIS methods in terms of segmentation accuracy, efficiency and interaction cost.
arXiv Detail & Related papers (2025-11-30T04:12:43Z) - GCRPNet: Graph-Enhanced Contextual and Regional Perception Network for Salient Object Detection in Optical Remote Sensing Images [68.33481681452675]
We propose a graph-enhanced contextual and regional perception network (GCRPNet)<n>It builds upon the Mamba architecture to simultaneously capture long-range dependencies and enhance regional feature representation.<n>It performs adaptive patch scanning on feature maps processed via multi-scale convolutions, thereby capturing rich local region information.
arXiv Detail & Related papers (2025-08-14T11:31:43Z) - HRSeg: High-Resolution Visual Perception and Enhancement for Reasoning Segmentation [74.1872891313184]
HRSeg is an efficient model with high-resolution fine-grained perception.<n>It features two key innovations: High-Resolution Perception (HRP) and High-Resolution Enhancement (HRE)
arXiv Detail & Related papers (2025-07-17T08:09:31Z) - Brain Inspired Adaptive Memory Dual-Net for Few-Shot Image Classification [10.824399627455326]
Existing methods aim to tackle this problem by locating and aligning relevant local features.<n>High intra-class variability in real-world images poses significant challenges in locating semantically relevant local regions under few-shot settings.<n>We propose the generalization-optimized Systems Consolidation Adaptive Memory Dual-Network, SCAM-Net.
arXiv Detail & Related papers (2025-03-10T14:42:51Z) - Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image (RRSIS) is a new challenge that combines computer vision and natural language processing.
Traditional Referring Image (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery.
We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z) - Learning Image Deraining Transformer Network with Dynamic Dual
Self-Attention [46.11162082219387]
This paper proposes an effective image deraining Transformer with dynamic dual self-attention (DDSA)
Specifically, we only select the most useful similarity values based on top-k approximate calculation to achieve sparse attention.
In addition, we also develop a novel spatial-enhanced feed-forward network (SEFN) to further obtain a more accurate representation for achieving high-quality derained results.
arXiv Detail & Related papers (2023-08-15T13:59:47Z) - Learning to Generate Training Datasets for Robust Semantic Segmentation [37.9308918593436]
We propose a novel approach to improve the robustness of semantic segmentation techniques.
We design Robusta, a novel conditional generative adversarial network to generate realistic and plausible perturbed images.
Our results suggest that this approach could be valuable in safety-critical applications.
arXiv Detail & Related papers (2023-08-01T10:02:26Z) - Adversarial Feature Augmentation and Normalization for Visual
Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z) - Progressive Self-Guided Loss for Salient Object Detection [102.35488902433896]
We present a progressive self-guided loss function to facilitate deep learning-based salient object detection in images.
Our framework takes advantage of adaptively aggregated multi-scale features to locate and detect salient objects effectively.
arXiv Detail & Related papers (2021-01-07T07:33:38Z) - Global Context-Aware Progressive Aggregation Network for Salient Object
Detection [117.943116761278]
We propose a novel network named GCPANet to integrate low-level appearance features, high-level semantic features, and global context features.
We show that the proposed approach outperforms the state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2020-03-02T04:26:10Z) - Contextual Encoder-Decoder Network for Visual Saliency Prediction [42.047816176307066]
We propose an approach based on a convolutional neural network pre-trained on a large-scale image classification task.
We combine the resulting representations with global scene information for accurately predicting visual saliency.
Compared to state of the art approaches, the network is based on a lightweight image classification backbone.
arXiv Detail & Related papers (2019-02-18T16:15:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.