Attention W-Net: Improved Skip Connections for better Representations
- URL: http://arxiv.org/abs/2110.08811v1
- Date: Sun, 17 Oct 2021 12:44:36 GMT
- Title: Attention W-Net: Improved Skip Connections for better Representations
- Authors: Shikhar Mohan, Saumik Bhattacharya, Sayantari Ghosh
- Abstract summary: We propose Attention W-Net, a new U-Net based architecture for retinal vessel segmentation.
We observe an F1-Score and AUC of 0.8407 and 0.9833, a sizeable improvement over its LadderNet backbone.
- Score: 5.027571997864707
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Segmentation of macro and microvascular structures in fundoscopic retinal
images plays a crucial role in detection of multiple retinal and systemic
diseases, yet it is a difficult problem to solve. Most deep learning approaches
for this task involve an autoencoder based architecture, but they face several
issues such as too few parameters, overfitting when there are enough
parameters and incompatibility between internal feature-spaces. Due to such
issues, these techniques are not able to extract the best semantic
information from the limited data present for such tasks. We propose Attention
W-Net, a new U-Net based architecture for retinal vessel segmentation to
address these problems. In this architecture with a LadderNet backbone, we have
two main contributions: Attention Block and regularisation measures. Our
Attention Block uses decoder features to attend over the encoder features from
skip-connections during upsampling, resulting in higher compatibility when the
encoder and decoder features are added. Our regularisation measures include
image augmentation and modifications to the ResNet Block used, which prevent
overfitting. With these additions, we observe an F1-Score and AUC of 0.8407 and
0.9833, a sizeable improvement over its LadderNet backbone as well as
competitive performance among the contemporary state-of-the-art methods.
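The abstract does not spell out the internals of the Attention Block, so the following is only a minimal sketch of the general idea it describes: decoder features produce a per-position weight that rescales the encoder features arriving over a skip connection, before the two are added. It assumes an additive attention gate (ReLU followed by a sigmoid), which may differ from the paper's actual block. The names `attention_gate`, `fuse`, `w_enc`, `w_dec`, and `psi` are hypothetical, and feature maps are flattened to plain lists of per-position channel vectors to keep the example dependency-free.

```python
import math


def matvec(weights, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def attention_gate(enc, dec, w_enc, w_dec, psi):
    """Rescale encoder features using weights computed from decoder features.

    enc, dec: lists of per-position feature vectors with matching length.
    w_enc, w_dec: projection matrices (assumed, stand-ins for 1x1 convolutions).
    psi: vector that maps the hidden activation to a single score.
    """
    gated = []
    for x, g in zip(enc, dec):
        # Project both inputs into a shared space, add them, and apply ReLU.
        hidden = [max(0.0, a + b)
                  for a, b in zip(matvec(w_enc, x), matvec(w_dec, g))]
        # Collapse to one score per position and squash it into (0, 1).
        score = sum(p * h for p, h in zip(psi, hidden))
        alpha = 1.0 / (1.0 + math.exp(-score))
        gated.append([alpha * xi for xi in x])
    return gated


def fuse(enc, dec, w_enc, w_dec, psi):
    """Add the gated encoder features to the decoder features."""
    gated = attention_gate(enc, dec, w_enc, w_dec, psi)
    return [[a + b for a, b in zip(gx, g)] for gx, g in zip(gated, dec)]


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    enc = [[2.0, 4.0], [1.0, 0.0]]   # two positions, two channels each
    dec = [[1.0, 1.0], [0.5, 0.5]]
    print(fuse(enc, dec, identity, identity, psi=[0.3, -0.1]))
```

In a real network the projections would be learned convolutions and the decoder features would first be upsampled to the encoder's resolution; both steps are omitted here.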
Related papers
- Enhancing Retinal Vascular Structure Segmentation in Images With a Novel Design Two-Path Interactive Fusion Module Model [6.392575673488379]
We introduce Swin-Res-Net, a specialized module designed to enhance the precision of retinal vessel segmentation.
Swin-Res-Net utilizes the Swin transformer which uses shifted windows with displacement for partitioning.
Our proposed architecture produces outstanding results, either meeting or surpassing those of other published models.
arXiv Detail & Related papers (2024-03-03T01:36:11Z)
- NAC-TCN: Temporal Convolutional Networks with Causal Dilated Neighborhood Attention for Emotion Understanding [60.74434735079253]
We propose a method known as Neighborhood Attention with Convolutions TCN (NAC-TCN).
We accomplish this by introducing a causal version of Dilated Neighborhood Attention while incorporating it with convolutions.
Our model achieves comparable, better, or state-of-the-art performance over TCNs, TCAN, LSTMs, and GRUs while requiring fewer parameters on standard emotion recognition datasets.
arXiv Detail & Related papers (2023-12-12T18:41:30Z)
- EAA-Net: Rethinking the Autoencoder Architecture with Intra-class Features for Medical Image Segmentation [4.777011444412729]
We propose a light-weight end-to-end segmentation framework based on multi-task learning, termed Edge Attention autoencoder Network (EAA-Net).
Our approach not only utilizes the segmentation network to obtain inter-class features, but also applies the reconstruction network to extract intra-class features among the foregrounds.
Experimental results show that our method performs well in medical image segmentation tasks.
arXiv Detail & Related papers (2022-08-19T07:42:55Z)
- SoftPool++: An Encoder-Decoder Network for Point Cloud Completion [93.54286830844134]
We propose a novel convolutional operator for the task of point cloud completion.
The proposed operator does not require any max-pooling or voxelization operation.
We show that our approach achieves state-of-the-art performance in shape completion at low and high resolutions.
arXiv Detail & Related papers (2022-05-08T15:31:36Z)
- PlutoNet: An Efficient Polyp Segmentation Network with Modified Partial Decoder and Decoder Consistency Training [0.40611352512781856]
We propose PlutoNet for polyp segmentation which requires only 2,626,537 parameters, less than 10% of the parameters required by its counterparts.
We train the modified partial decoder and the auxiliary decoder with a combined loss to enforce consistency, which helps improve the encoder's representations.
We perform ablation studies and extensive experiments which show that PlutoNet performs significantly better than the state-of-the-art models.
arXiv Detail & Related papers (2022-04-06T20:29:00Z)
- Crosslink-Net: Double-branch Encoder Segmentation Network via Fusing Vertical and Horizontal Convolutions [58.71117402626524]
We present a novel double-branch encoder architecture for medical image segmentation.
Our architecture is inspired by two observations: 1) Since the discrimination of features learned via square convolutional kernels needs to be further improved, we propose to utilize non-square vertical and horizontal convolutional kernels.
The experiments validate the effectiveness of our model on four datasets.
arXiv Detail & Related papers (2021-07-24T02:58:32Z)
- Boundary-Aware Segmentation Network for Mobile and Web Applications [60.815545591314915]
Boundary-Aware Network (BASNet) is integrated with a predict-refine architecture and a hybrid loss for highly accurate image segmentation.
BASNet runs at over 70 fps on a single GPU which benefits many potential real applications.
Based on BASNet, we further developed two (close to) commercial applications: AR COPY & PASTE, in which BASNet is integrated with augmented reality for "COPYING" and "PASTING" real-world objects, and OBJECT CUT, which is a web-based tool for automatic object background removal.
arXiv Detail & Related papers (2021-01-12T19:20:26Z)
- Multi-stage Attention ResU-Net for Semantic Segmentation of Fine-Resolution Remote Sensing Images [9.398340832493457]
We propose a Linear Attention Mechanism (LAM) to address this issue.
LAM approximates dot-product attention at a lower computational cost.
We design a Multi-stage Attention ResU-Net for semantic segmentation from fine-resolution remote sensing images.
arXiv Detail & Related papers (2020-11-29T07:24:21Z)
- Unsupervised Monocular Depth Learning with Integrated Intrinsics and Spatio-Temporal Constraints [61.46323213702369]
This work presents an unsupervised learning framework that is able to predict at-scale depth maps and egomotion.
Our results demonstrate strong performance when compared to the current state-of-the-art on multiple sequences of the KITTI driving dataset.
arXiv Detail & Related papers (2020-11-02T22:26:58Z)
- Suppress and Balance: A Simple Gated Network for Salient Object Detection [89.88222217065858]
We propose a simple gated network (GateNet) to solve both issues at once.
With the help of multilevel gate units, the valuable context information from the encoder can be optimally transmitted to the decoder.
In addition, we adopt the atrous spatial pyramid pooling based on the proposed "Fold" operation (Fold-ASPP) to accurately localize salient objects of various scales.
arXiv Detail & Related papers (2020-07-16T02:00:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.