SERNet-Former: Semantic Segmentation by Efficient Residual Network with Attention-Boosting Gates and Attention-Fusion Networks
- URL: http://arxiv.org/abs/2401.15741v7
- Date: Tue, 2 Jul 2024 15:48:30 GMT
- Title: SERNet-Former: Semantic Segmentation by Efficient Residual Network with Attention-Boosting Gates and Attention-Fusion Networks
- Authors: Serdar Erisen,
- Abstract summary: This research proposes an encoder-decoder architecture with a unique efficient residual network, Efficient-ResNet.
Attention-boosting gates (AbGs) and attention-boosting modules (AbMs) are deployed by aiming to fuse the equivariant and feature-based semantic information with the equivalent sizes of the output of global context.
Our network is tested on the challenging CamVid and Cityscapes datasets, and the proposed methods reveal significant improvements on the residual networks.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Improving the efficiency of state-of-the-art methods in semantic segmentation requires overcoming the increasing computational cost as well as issues such as fusing semantic information from global and local contexts. Based on the recent success and problems that convolutional neural networks (CNNs) encounter in semantic segmentation, this research proposes an encoder-decoder architecture with a unique efficient residual network, Efficient-ResNet. Attention-boosting gates (AbGs) and attention-boosting modules (AbMs) are deployed by aiming to fuse the equivariant and feature-based semantic information with the equivalent sizes of the output of global context of the efficient residual network in the encoder. Respectively, the decoder network is developed with the additional attention-fusion networks (AfNs) inspired by AbM. AfNs are designed to improve the efficiency in the one-to-one conversion of the semantic information by deploying additional convolution layers in the decoder part. Our network is tested on the challenging CamVid and Cityscapes datasets, and the proposed methods reveal significant improvements on the residual networks. To the best of our knowledge, the developed network, SERNet-Former, achieves state-of-the-art results (84.62 % mean IoU) on CamVid dataset and challenging results (87.35 % mean IoU) on Cityscapes validation dataset.
Related papers
- ELGC-Net: Efficient Local-Global Context Aggregation for Remote Sensing Change Detection [65.59969454655996]
We propose an efficient change detection framework, ELGC-Net, which leverages rich contextual information to precisely estimate change regions.
Our proposed ELGC-Net sets a new state-of-the-art performance in remote sensing change detection benchmarks.
We also introduce ELGC-Net-LW, a lighter variant with significantly reduced computational complexity, suitable for resource-constrained settings.
arXiv Detail & Related papers (2024-03-26T17:46:25Z) - Densely Decoded Networks with Adaptive Deep Supervision for Medical
Image Segmentation [19.302294715542175]
We propose densely decoded networks (ddn), by selectively introducing 'crutch' network connections.
Such 'crutch' connections in each upsampling stage of the network decoder enhance target localization.
We also present a training strategy based on adaptive deep supervision (ads), which exploits and adapts specific attributes of input dataset.
arXiv Detail & Related papers (2024-02-05T00:44:57Z) - Unsupervised Graph Attention Autoencoder for Attributed Networks using
K-means Loss [0.0]
We introduce a simple, efficient, and clustering-oriented model based on unsupervised textbfGraph Attention textbfAutotextbfEncoder for community detection in attributed networks.
The proposed model adeptly learns representations from both the network's topology and attribute information, simultaneously addressing dual objectives: reconstruction and community discovery.
arXiv Detail & Related papers (2023-11-21T20:45:55Z) - Accurate and Efficient Event-based Semantic Segmentation Using Adaptive Spiking Encoder-Decoder Network [20.05283214295881]
Spiking neural networks (SNNs) are emerging as promising solutions for processing dynamic, asynchronous signals from event-based sensors.
We develop an efficient spiking encoder-decoder network (SpikingEDN) for large-scale event-based semantic segmentation tasks.
We harness the adaptive threshold which improves network accuracy, sparsity and robustness in streaming inference.
arXiv Detail & Related papers (2023-04-24T07:12:50Z) - SAR Despeckling Using Overcomplete Convolutional Networks [53.99620005035804]
despeckling is an important problem in remote sensing as speckle degrades SAR images.
Recent studies show that convolutional neural networks(CNNs) outperform classical despeckling methods.
This study employs an overcomplete CNN architecture to focus on learning low-level features by restricting the receptive field.
We show that the proposed network improves despeckling performance compared to recent despeckling methods on synthetic and real SAR images.
arXiv Detail & Related papers (2022-05-31T15:55:37Z) - Hybrid SNN-ANN: Energy-Efficient Classification and Object Detection for
Event-Based Vision [64.71260357476602]
Event-based vision sensors encode local pixel-wise brightness changes in streams of events rather than image frames.
Recent progress in object recognition from event-based sensors has come from conversions of deep neural networks.
We propose a hybrid architecture for end-to-end training of deep neural networks for event-based pattern recognition and object detection.
arXiv Detail & Related papers (2021-12-06T23:45:58Z) - AttDLNet: Attention-based DL Network for 3D LiDAR Place Recognition [0.6352264764099531]
This paper proposes a novel 3D LiDAR-based deep learning network named AttDLNet.
It exploits an attention mechanism to selectively focus on long-range context and interfeature relationships.
Results show that the encoder network features are already very descriptive, but adding attention to the network further improves performance.
arXiv Detail & Related papers (2021-06-17T16:34:37Z) - DRU-net: An Efficient Deep Convolutional Neural Network for Medical
Image Segmentation [2.3574651879602215]
Residual network (ResNet) and densely connected network (DenseNet) have significantly improved the training efficiency and performance of deep convolutional neural networks (DCNNs)
We propose an efficient network architecture by considering advantages of both networks.
arXiv Detail & Related papers (2020-04-28T12:16:24Z) - Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z) - Resolution Adaptive Networks for Efficient Inference [53.04907454606711]
We propose a novel Resolution Adaptive Network (RANet), which is inspired by the intuition that low-resolution representations are sufficient for classifying "easy" inputs.
In RANet, the input images are first routed to a lightweight sub-network that efficiently extracts low-resolution representations.
High-resolution paths in the network maintain the capability to recognize the "hard" samples.
arXiv Detail & Related papers (2020-03-16T16:54:36Z) - Dense Residual Network: Enhancing Global Dense Feature Flow for
Character Recognition [75.4027660840568]
This paper explores how to enhance the local and global dense feature flow by exploiting hierarchical features fully from all the convolution layers.
Technically, we propose an efficient and effective CNN framework, i.e., Fast Dense Residual Network (FDRN) for text recognition.
arXiv Detail & Related papers (2020-01-23T06:55:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.