Deep Attention Unet: A Network Model with Global Feature Perception Ability
- URL: http://arxiv.org/abs/2304.10829v2
- Date: Wed, 17 Jan 2024 06:37:18 GMT
- Title: Deep Attention Unet: A Network Model with Global Feature Perception Ability
- Authors: Jiacheng Li
- Abstract summary: This paper proposes a new type of UNet image segmentation algorithm based on a channel self-attention mechanism and residual connections.
In my experiment, the new network model improved mIOU by 2.48% compared to the traditional UNet on the FoodNet dataset.
- Score: 12.087640144194246
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Remote sensing image segmentation is a specific task of remote sensing image
interpretation. A good remote sensing image segmentation algorithm can provide
guidance for environmental protection, agricultural production, and urban
construction. This paper proposes a new type of UNet image segmentation
algorithm based on a channel self-attention mechanism and residual connections.
In my experiment, the new network model improved mIOU by 2.48%
compared to traditional UNet on the FoodNet dataset. The image segmentation
algorithm proposed in this article enhances the internal connections between
different items in the image, thus achieving better segmentation results for
remote sensing images with occlusion.
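The abstract reports its gain in mIOU (mean intersection over union). As a rough illustration of how that metric is commonly computed, here is a minimal sketch in plain Python. It is not the paper's evaluation code: the function name `mean_iou`, the flat label lists, and the choice to skip classes absent from both prediction and ground truth are assumptions made for this example.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that occur in pred or target.

    pred and target are flat sequences of integer class labels,
    one label per pixel (hypothetical input layout for this sketch).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes   # pixels where both agree on class c
    pred_count = [0] * num_classes     # pixels predicted as class c
    target_count = [0] * num_classes   # pixels labelled as class c

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # assumption: ignore classes absent from both inputs
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else float("nan")


# Toy example: six pixels, three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, 3))
```

In the toy example the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is about 0.5. Implementations differ in how they treat absent classes and whether they average per image or over the whole dataset, so published mIOU numbers are only comparable under the same convention.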
Related papers
- DDU-Net: A Domain Decomposition-based CNN for High-Resolution Image Segmentation on Multiple GPUs [46.873264197900916]
A domain decomposition-based U-Net architecture is introduced, which partitions input images into non-overlapping patches.
A communication network is added to facilitate inter-patch information exchange to enhance the understanding of spatial context.
Results show that the approach achieves a $2-3\,\%$ higher intersection over union (IoU) score compared to the same network without inter-patch communication.
arXiv Detail & Related papers (2024-07-31T01:07:21Z)
- Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing.
Traditional Referring Image Segmentation (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery.
We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z)
- Self-Correlation and Cross-Correlation Learning for Few-Shot Remote Sensing Image Semantic Segmentation [27.59330408178435]
Few-shot remote sensing semantic segmentation aims at learning to segment target objects from a query image.
We propose a Self-Correlation and Cross-Correlation Learning Network for the few-shot remote sensing image semantic segmentation.
Our model enhances the generalization by considering both self-correlation and cross-correlation between support and query images.
arXiv Detail & Related papers (2023-09-11T21:53:34Z)
- EAA-Net: Rethinking the Autoencoder Architecture with Intra-class Features for Medical Image Segmentation [4.777011444412729]
We propose a light-weight end-to-end segmentation framework based on multi-task learning, termed Edge Attention autoencoder Network (EAA-Net).
Our approach not only utilizes the segmentation network to obtain inter-class features, but also applies the reconstruction network to extract intra-class features among the foregrounds.
Experimental results show that our method performs well in medical image segmentation tasks.
arXiv Detail & Related papers (2022-08-19T07:42:55Z)
- Looking Outside the Window: Wider-Context Transformer for the Semantic Segmentation of High-Resolution Remote Sensing Images [18.161847218988964]
We propose a Wider-Context Network (WiCNet) for the semantic segmentation of High-Resolution (HR) Remote Sensing Images (RSIs).
In the WiCNet, apart from a conventional feature extraction network, an extra context branch is designed to explicitly model the context information in a larger image area.
The information between the two branches is communicated through a Context Transformer, which is a novel design derived from the Vision Transformer to model the long-range context correlations.
arXiv Detail & Related papers (2021-06-29T23:41:54Z)
- Guided Interactive Video Object Segmentation Using Reliability-Based Attention Maps [55.94785248905853]
We propose a novel guided interactive segmentation (GIS) algorithm for video objects to improve the segmentation accuracy and reduce the interaction time.
We develop the intersection-aware propagation module to propagate segmentation results to neighboring frames.
Experimental results demonstrate that the proposed algorithm provides more accurate segmentation results at a faster speed than conventional algorithms.
arXiv Detail & Related papers (2021-04-21T07:08:57Z)
- SOSD-Net: Joint Semantic Object Segmentation and Depth Estimation from Monocular images [94.36401543589523]
We introduce the concept of semantic objectness to exploit the geometric relationship of these two tasks.
We then propose a Semantic Object and Depth Estimation Network (SOSD-Net) based on the objectness assumption.
To the best of our knowledge, SOSD-Net is the first network that exploits the geometry constraint for simultaneous monocular depth estimation and semantic segmentation.
arXiv Detail & Related papers (2021-01-19T02:41:03Z)
- Super-Resolution Domain Adaptation Networks for Semantic Segmentation via Pixel and Output Level Aligning [4.500622871756055]
This paper designs a novel end-to-end semantic segmentation network, namely Super-Resolution Domain Adaptation Network (SRDA-Net).
SRDA-Net can simultaneously achieve the super-resolution task and the domain adaptation task, thus satisfying the requirement of semantic segmentation for remote sensing images.
Experimental results on two remote sensing datasets with different resolutions demonstrate that SRDA-Net performs favorably against some state-of-the-art methods.
arXiv Detail & Related papers (2020-05-13T15:48:41Z)
- CRNet: Cross-Reference Networks for Few-Shot Segmentation [59.85183776573642]
Few-shot segmentation aims to learn a segmentation model that can be generalized to novel classes with only a few training images.
With a cross-reference mechanism, our network can better find the co-occurrent objects in the two images.
Experiments on the PASCAL VOC 2012 dataset show that our network achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-03-24T04:55:43Z)
- Unsupervised Bidirectional Cross-Modality Adaptation via Deeply Synergistic Image and Feature Alignment for Medical Image Segmentation [73.84166499988443]
We present a novel unsupervised domain adaptation framework, named Synergistic Image and Feature Alignment (SIFA).
Our proposed SIFA conducts synergistic alignment of domains from both image and feature perspectives.
Experimental results on two different tasks demonstrate that our SIFA method is effective in improving segmentation performance on unlabeled target images.
arXiv Detail & Related papers (2020-02-06T13:49:47Z)