Related papers: A Deep Semantic Segmentation Network with Semantic and Contextual Refinements

A Deep Semantic Segmentation Network with Semantic and Contextual Refinements

URL: http://arxiv.org/abs/2412.08671v1
Date: Wed, 11 Dec 2024 03:40:46 GMT
Title: A Deep Semantic Segmentation Network with Semantic and Contextual Refinements
Authors: Zhiyan Wang, Deyin Liu, Lin Yuanbo Wu, Song Wang, Xin Guo, Lin Qi,
Abstract summary: This paper introduces a Semantic Refinement Module (SRM) to address this issue within the segmentation network.<n>A Contextual Refinement Module (CRM) is presented to capture global context information across both spatial and channel dimensions.<n>The efficacy of these proposed modules is validated on three widely used datasets-Cityscapes, Bdd100K, and ADE20K-demonstrating superior performance compared to state-of-the-art methods.
Score: 11.755865577258767
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Semantic segmentation is a fundamental task in multimedia processing, which can be used for analyzing, understanding, editing contents of images and videos, among others. To accelerate the analysis of multimedia data, existing segmentation researches tend to extract semantic information by progressively reducing the spatial resolutions of feature maps. However, this approach introduces a misalignment problem when restoring the resolution of high-level feature maps. In this paper, we design a Semantic Refinement Module (SRM) to address this issue within the segmentation network. Specifically, SRM is designed to learn a transformation offset for each pixel in the upsampled feature maps, guided by high-resolution feature maps and neighboring offsets. By applying these offsets to the upsampled feature maps, SRM enhances the semantic representation of the segmentation network, particularly for pixels around object boundaries. Furthermore, a Contextual Refinement Module (CRM) is presented to capture global context information across both spatial and channel dimensions. To balance dimensions between channel and space, we aggregate the semantic maps from all four stages of the backbone to enrich channel context information. The efficacy of these proposed modules is validated on three widely used datasets-Cityscapes, Bdd100K, and ADE20K-demonstrating superior performance compared to state-of-the-art methods. Additionally, this paper extends these modules to a lightweight segmentation network, achieving an mIoU of 82.5% on the Cityscapes validation set with only 137.9 GFLOPs.

Related papers

FANet: Feature Amplification Network for Semantic Segmentation in Cluttered Background [9.970265640589966]
Existing deep learning approaches leave out the semantic cues that are crucial in semantic segmentation present in complex scenarios. We propose a feature amplification network (FANet) as a backbone network that incorporates semantic information using a novel feature enhancement module at multi-stages. Our experimental results demonstrate the state-of-the-art performance compared to existing methods.
arXiv Detail & Related papers (2024-07-12T15:57:52Z)
Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image (RRSIS) is a new challenge that combines computer vision and natural language processing. Traditional Referring Image (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery. We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z)
Hi-ResNet: Edge Detail Enhancement for High-Resolution Remote Sensing Segmentation [10.919956120261539]
High-resolution remote sensing (HRS) semantic segmentation extracts key objects from high-resolution coverage areas. objects of the same category within HRS images show significant differences in scale and shape across diverse geographical environments. We propose a High-resolution remote sensing network (Hi-ResNet) with efficient network structure designs.
arXiv Detail & Related papers (2023-05-22T03:58:25Z)
Few-shot Segmentation with Optimal Transport Matching and Message Flow [50.9853556696858]
It is essential for few-shot semantic segmentation to fully utilize the support information. We propose a Correspondence Matching Network (CMNet) with an Optimal Transport Matching module. Experiments on PASCAL VOC 2012, MS COCO, and FSS-1000 datasets show that our network achieves new state-of-the-art few-shot segmentation performance.
arXiv Detail & Related papers (2021-08-19T06:26:11Z)
Residual Moment Loss for Medical Image Segmentation [56.72261489147506]
Location information is proven to benefit the deep learning models on capturing the manifold structure of target objects. Most existing methods encode the location information in an implicit way, for the network to learn. We propose a novel loss function, namely residual moment (RM) loss, to explicitly embed the location information of segmentation targets.
arXiv Detail & Related papers (2021-06-27T09:31:49Z)
CSRNet: Cascaded Selective Resolution Network for Real-time Semantic Segmentation [18.63596070055678]
We propose a light Cascaded Selective Resolution Network (CSRNet) to improve the performance of real-time segmentation. The proposed network builds a three-stage segmentation system, which integrates feature information from low resolution to high resolution. Experiments on two well-known datasets demonstrate that the proposed CSRNet effectively improves the performance for real-time segmentation.
arXiv Detail & Related papers (2021-06-08T14:22:09Z)
CTNet: Context-based Tandem Network for Semantic Segmentation [77.4337867789772]
This work proposes a novel Context-based Tandem Network (CTNet) by interactively exploring the spatial contextual information and the channel contextual information. To further improve the performance of the learned representations for semantic segmentation, the results of the two context modules are adaptively integrated.
arXiv Detail & Related papers (2021-04-20T07:33:11Z)
Dual Attention GANs for Semantic Image Synthesis [101.36015877815537]
We propose a novel Dual Attention GAN (DAGAN) to synthesize photo-realistic and semantically-consistent images. We also propose two novel modules, i.e., position-wise Spatial Attention Module (SAM) and scale-wise Channel Attention Module (CAM) DAGAN achieves remarkably better results than state-of-the-art methods, while using fewer model parameters.
arXiv Detail & Related papers (2020-08-29T17:49:01Z)
Improving Semantic Segmentation via Decoupled Body and Edge Supervision [89.57847958016981]
Existing semantic segmentation approaches either aim to improve the object's inner consistency by modeling the global context, or refine objects detail along their boundaries by multi-scale feature fusion. In this paper, a new paradigm for semantic segmentation is proposed. Our insight is that appealing performance of semantic segmentation requires textitexplicitly modeling the object textitbody and textitedge, which correspond to the high and low frequency of the image. We show that the proposed framework with various baselines or backbone networks leads to better object inner consistency and object boundaries.
arXiv Detail & Related papers (2020-07-20T12:11:22Z)
AlignSeg: Feature-Aligned Segmentation Networks [109.94809725745499]
We propose Feature-Aligned Networks (AlignSeg) to address misalignment issues during the feature aggregation process. Our network achieves new state-of-the-art mIoU scores of 82.6% and 45.95%, respectively.
arXiv Detail & Related papers (2020-02-24T10:00:58Z)

This list is automatically generated from the titles and abstracts of the papers in this site.