Feature Reuse and Fusion for Real-time Semantic segmentation
- URL: http://arxiv.org/abs/2105.12964v1
- Date: Thu, 27 May 2021 06:47:02 GMT
- Title: Feature Reuse and Fusion for Real-time Semantic segmentation
- Authors: Tan Sixiang
- Abstract summary: How to increase the speed while maintaining high resolution is a problem that has been discussed and solved.
We hope to design a light-weight network based on previous design experience and reach the level of state-of-the-art real-time semantic segmentation.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: For real-time semantic segmentation, how to increase the speed while
maintaining high resolution is a problem that has been discussed and solved.
Backbone design and fusion design have always been two essential parts of
real-time semantic segmentation. We hope to design a light-weight network based
on previous design experience and reach the level of state-of-the-art real-time
semantic segmentation without any pre-training. To achieve this goal, we
propose an encoder-decoder architecture that applies a decoder network to a
backbone model designed for real-time segmentation tasks, and we design three
different ways to fuse semantic and detailed information in the aggregation
phase. We have conducted extensive experiments on two semantic segmentation
benchmarks. Experiments on the Cityscapes and CamVid datasets show that the
proposed FRFNet strikes a balance between speed and accuracy. It achieves
76.4% mean Intersection over Union (mIoU) on the Cityscapes test set at
161 FPS on a single RTX 2080Ti card.
The code is available at https://github.com/favoMJ/FRFNet.
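The mIoU figure quoted in the abstract averages, over classes, the ratio of correctly labeled pixels to the union of predicted and ground-truth pixels for that class. The following is a minimal sketch of that metric in plain Python, not the evaluation code used by the authors. The function name `mean_iou`, the flat label lists, and the ignore label of 255 are assumptions made for illustration.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int],
             num_classes: int, ignore_index: int = 255) -> float:
    """Mean IoU over the classes that occur in pred or target.

    pred and target are flat sequences of per-pixel class ids.
    Pixels whose target equals ignore_index are skipped
    (255 is an assumed ignore label, chosen for illustration).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes   # pixels where pred == target == c
    pred_count = [0] * num_classes     # pixels predicted as c
    target_count = [0] * num_classes   # pixels labeled as c

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both pred and target
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    pred = [0, 0, 1, 1, 2, 2]
    target = [0, 1, 1, 1, 2, 255]
    print(f"mIoU = {mean_iou(pred, target, num_classes=3):.4f}")
```

Benchmark scores are normally computed by accumulating the per-class counts over the whole test set before dividing, rather than by averaging per-image values; the sketch supports that if all pixels are passed in one call.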
Related papers
- Cross-CBAM: A Lightweight network for Scene Segmentation [2.064612766965483]
We present the Cross-CBAM network, a novel lightweight network for real-time semantic segmentation.
In experiments on the Cityscapes and CamVid datasets, we achieve 73.4% mIoU at 240.9 FPS and 77.2% mIoU at 88.6 FPS on an NVIDIA GTX 1080Ti.
arXiv Detail & Related papers (2023-06-04T09:03:05Z)
- Distortion-Aware Network Pruning and Feature Reuse for Real-time Video Segmentation [49.17930380106643]
We propose a novel framework to speed up any architecture with skip-connections for real-time vision tasks.
Specifically, at the arrival of each frame, we transform the features from the previous frame to reuse them at specific spatial bins.
We then perform partial computation of the backbone network on the regions of the current frame that capture temporal differences between the current and previous frames.
arXiv Detail & Related papers (2022-06-20T07:20:02Z)
- Stage-Aware Feature Alignment Network for Real-Time Semantic Segmentation of Street Scenes [59.81228011432776]
We present a novel Stage-aware Feature Alignment Network (SFANet) for real-time semantic segmentation of street scenes.
By taking into account the unique role of each stage in the decoder, a novel stage-aware Feature Enhancement Block (FEB) is designed to enhance spatial details and contextual information of feature maps from the encoder.
Experimental results show that the proposed SFANet exhibits a good balance between accuracy and speed for real-time semantic segmentation of street scenes.
arXiv Detail & Related papers (2022-03-08T11:46:41Z)
- FBSNet: A Fast Bilateral Symmetrical Network for Real-Time Semantic Segmentation [23.25851281719734]
We propose a Fast Bilateral Symmetrical Network (FBSNet) for real-time semantic segmentation.
FBSNet employs a symmetrical-decoder structure with two branches: a semantic information branch and a spatial detail branch.
Experimental results on Cityscapes and CamVid show that the proposed FBSNet strikes a good balance between accuracy and efficiency.
arXiv Detail & Related papers (2021-09-02T04:16:39Z)
- Rethinking BiSeNet For Real-time Semantic Segmentation [6.622485130017622]
BiSeNet has been proved to be a popular two-stream network for real-time segmentation.
We propose a novel structure named Short-Term Dense Concatenate network (STDC) by removing structure redundancy.
arXiv Detail & Related papers (2021-04-27T13:49:47Z)
- Real-time Semantic Segmentation with Fast Attention [94.88466483540692]
We propose a novel architecture for semantic segmentation of high-resolution images and videos in real-time.
The proposed architecture relies on our fast spatial attention, which is a simple yet efficient modification of the popular self-attention mechanism.
Results on multiple datasets demonstrate better accuracy and speed than existing approaches.
arXiv Detail & Related papers (2020-07-07T22:37:16Z)
- BiSeNet V2: Bilateral Network with Guided Aggregation for Real-time Semantic Segmentation [118.46210049742993]
We propose an efficient and effective architecture with a good trade-off between speed and accuracy, termed Bilateral Segmentation Network (BiSeNet V2).
For a 2,048x1,024 input, we achieve 72.6% Mean IoU on the Cityscapes test set with a speed of 156 FPS on one NVIDIA GeForce 1080 Ti card, which is significantly faster than existing methods, yet we achieve better segmentation accuracy.
arXiv Detail & Related papers (2020-04-05T10:26:38Z)
- Temporally Distributed Networks for Fast Video Semantic Segmentation [64.5330491940425]
TDNet is a temporally distributed network designed for fast and accurate video semantic segmentation.
We observe that features extracted from a certain high-level layer of a deep CNN can be approximated by composing features extracted from several shallower sub-networks.
Experiments on Cityscapes, CamVid, and NYUD-v2 demonstrate that our method achieves state-of-the-art accuracy with significantly faster speed and lower latency.
arXiv Detail & Related papers (2020-04-03T22:43:32Z)
- Real-Time High-Performance Semantic Image Segmentation of Urban Street Scenes [98.65457534223539]
We propose a real-time high-performance DCNN-based method for robust semantic segmentation of urban street scenes.
The proposed method achieves 73.6% and 68.0% mean Intersection over Union (mIoU) at inference speeds of 51.0 fps and 39.3 fps, respectively.
arXiv Detail & Related papers (2020-03-11T08:45:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.