BiSeNet V2: Bilateral Network with Guided Aggregation for Real-time
Semantic Segmentation
- URL: http://arxiv.org/abs/2004.02147v1
- Date: Sun, 5 Apr 2020 10:26:38 GMT
- Title: BiSeNet V2: Bilateral Network with Guided Aggregation for Real-time
Semantic Segmentation
- Authors: Changqian Yu, Changxin Gao, Jingbo Wang, Gang Yu, Chunhua Shen, Nong
Sang
- Abstract summary: We propose an efficient and effective architecture with a good trade-off between speed and accuracy, termed Bilateral Segmentation Network (BiSeNet V2).
For a 2,048x1,024 input, we achieve 72.6% Mean IoU on the Cityscapes test set with a speed of 156 FPS on one NVIDIA GeForce GTX 1080 Ti card, which is significantly faster than existing methods, yet we achieve better segmentation accuracy.
- Score: 118.46210049742993
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The low-level details and high-level semantics are both essential to the
semantic segmentation task. However, to speed up the model inference, current
approaches almost always sacrifice the low-level details, which leads to a
considerable accuracy decrease. We propose to treat these spatial details and
categorical semantics separately to achieve high accuracy and high efficiency
for real-time semantic segmentation. To this end, we propose an efficient and
effective architecture with a good trade-off between speed and accuracy, termed
Bilateral Segmentation Network (BiSeNet V2). This architecture involves: (i) a
Detail Branch, with wide channels and shallow layers to capture low-level
details and generate high-resolution feature representation; (ii) a Semantic
Branch, with narrow channels and deep layers to obtain high-level semantic
context. The Semantic Branch is lightweight owing to its reduced channel
capacity and a fast-downsampling strategy. Furthermore, we design a Guided
Aggregation Layer to enhance mutual connections and fuse both types of feature
representation. Besides, a booster training strategy is designed to improve the
segmentation performance without any extra inference cost. Extensive
quantitative and qualitative evaluations demonstrate that the proposed
architecture performs favourably against a few state-of-the-art real-time
semantic segmentation approaches. Specifically, for a 2,048x1,024 input, we
achieve 72.6% Mean IoU on the Cityscapes test set with a speed of 156 FPS on
one NVIDIA GeForce GTX 1080 Ti card, which is significantly faster than
existing methods, yet we achieve better segmentation accuracy.
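To make the two-branch design concrete, below is a minimal PyTorch sketch of the bilateral idea. The layer widths and depths, the module names (DetailBranch, SemanticBranch, GuidedAggregation, BiSeNetV2Sketch), and the sigmoid-gated fusion are illustrative assumptions, not the paper's exact configuration: the actual network additionally uses a stem block, gather-and-expansion layers, a context embedding block, and a more elaborate bidirectional aggregation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_bn_relu(c_in, c_out, stride=1):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, stride=stride, padding=1, bias=False),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
    )

class DetailBranch(nn.Module):
    """Wide channels, shallow layers: a high-resolution (1/8) detail map."""
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            conv_bn_relu(3, 64, stride=2),    # 1/2 resolution
            conv_bn_relu(64, 64, stride=2),   # 1/4
            conv_bn_relu(64, 128, stride=2),  # 1/8
        )

    def forward(self, x):
        return self.layers(x)

class SemanticBranch(nn.Module):
    """Narrow channels, deeper layers, fast downsampling to 1/32."""
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            conv_bn_relu(3, 16, stride=2),    # 1/2
            conv_bn_relu(16, 32, stride=2),   # 1/4
            conv_bn_relu(32, 32, stride=2),   # 1/8
            conv_bn_relu(32, 64, stride=2),   # 1/16
            conv_bn_relu(64, 128, stride=2),  # 1/32
        )

    def forward(self, x):
        return self.layers(x)

class GuidedAggregation(nn.Module):
    """Fuse both branches, each modulated by guidance from the other."""
    def __init__(self, channels=128):
        super().__init__()
        self.detail_gate = nn.Conv2d(channels, channels, 3, padding=1)
        self.semantic_gate = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, detail, semantic):
        # bring the 1/32 semantic map up to the 1/8 detail resolution
        semantic = F.interpolate(semantic, size=detail.shape[2:],
                                 mode='bilinear', align_corners=False)
        return (detail * torch.sigmoid(self.semantic_gate(semantic))
                + semantic * torch.sigmoid(self.detail_gate(detail)))

class BiSeNetV2Sketch(nn.Module):
    def __init__(self, num_classes=19):  # 19 = Cityscapes classes
        super().__init__()
        self.detail = DetailBranch()
        self.semantic = SemanticBranch()
        self.aggregate = GuidedAggregation(128)
        self.head = nn.Conv2d(128, num_classes, 1)
        # Booster training would attach auxiliary segmentation heads to
        # intermediate SemanticBranch features; they are dropped at inference.

    def forward(self, x):
        fused = self.aggregate(self.detail(x), self.semantic(x))
        logits = self.head(fused)  # predictions at 1/8 resolution
        return F.interpolate(logits, size=x.shape[2:],
                             mode='bilinear', align_corners=False)

out = BiSeNetV2Sketch()(torch.randn(1, 3, 512, 1024))
print(out.shape)  # torch.Size([1, 19, 512, 1024])
```

The booster strategy mentioned in the abstract incurs no inference cost precisely because its auxiliary heads exist only during training; only the main head survives to deployment.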
Related papers
- Cross-CBAM: A Lightweight network for Scene Segmentation [2.064612766965483]
We present the Cross-CBAM network, a novel lightweight network for real-time semantic segmentation.
In experiments on the Cityscapes dataset and CamVid dataset, we achieve 73.4% mIoU at 240.9 FPS and 77.2% mIoU at 88.6 FPS, respectively, on an NVIDIA GTX 1080Ti.
arXiv Detail & Related papers (2023-06-04T09:03:05Z)
- RTFormer: Efficient Design for Real-Time Semantic Segmentation with Transformer [63.25665813125223]
We propose RTFormer, an efficient dual-resolution transformer for real-time semantic segmentation.
It achieves a better trade-off between performance and efficiency than CNN-based models.
Experiments on mainstream benchmarks demonstrate the effectiveness of our proposed RTFormer.
arXiv Detail & Related papers (2022-10-13T16:03:53Z)
- Rethinking BiSeNet For Real-time Semantic Segmentation [6.622485130017622]
BiSeNet has proven to be a popular two-stream network for real-time segmentation.
We propose a novel structure named Short-Term Dense Concatenate network (STDC) by removing structure redundancy.
arXiv Detail & Related papers (2021-04-27T13:49:47Z)
- A Holistically-Guided Decoder for Deep Representation Learning with Applications to Semantic Segmentation and Object Detection [74.88284082187462]
One common strategy is to adopt dilated convolutions in the backbone networks to extract high-resolution feature maps.
We propose a novel holistically-guided decoder to obtain high-resolution, semantic-rich feature maps.
arXiv Detail & Related papers (2020-12-18T10:51:49Z)
- Real-time Semantic Segmentation with Context Aggregation Network [14.560708848716754]
We propose a dual-branch convolutional neural network with significantly lower computational cost than the state of the art.
We evaluate our method on two semantic segmentation datasets, namely the Cityscapes and UAVid datasets.
arXiv Detail & Related papers (2020-11-02T14:16:23Z)
- Real-time Semantic Segmentation with Fast Attention [94.88466483540692]
We propose a novel architecture for semantic segmentation of high-resolution images and videos in real-time.
The proposed architecture relies on our fast spatial attention, a simple yet efficient modification of the popular self-attention mechanism (see the sketch after this list).
Results on multiple datasets demonstrate superior accuracy and speed compared to existing approaches.
arXiv Detail & Related papers (2020-07-07T22:37:16Z)
- Real-time Semantic Segmentation via Spatial-detail Guided Context Propagation [49.70144583431999]
We propose the spatial-detail guided context propagation network (SGCPNet) for achieving real-time semantic segmentation.
It uses the spatial details of shallow layers to guide the propagation of the low-resolution global contexts, in which the lost spatial information can be effectively reconstructed.
It achieves 69.5% mIoU segmentation accuracy, while its speed reaches 178.5 FPS on 768x1536 images on a GeForce GTX 1080 Ti GPU card.
arXiv Detail & Related papers (2020-05-22T07:07:26Z)
- Real-Time High-Performance Semantic Image Segmentation of Urban Street Scenes [98.65457534223539]
We propose a real-time high-performance DCNN-based method for robust semantic segmentation of urban street scenes.
The proposed method achieves 73.6% and 68.0% mean Intersection over Union (mIoU) at inference speeds of 51.0 fps and 39.3 fps, respectively.
arXiv Detail & Related papers (2020-03-11T08:45:53Z)
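The "Fast Attention" entry above hinges on a reassociation trick worth spelling out: if the softmax in self-attention is replaced by L2-normalizing the queries and keys (a cosine-similarity affinity), the whole expression becomes a chain of plain matrix products, so (Q K^T) V can be computed as Q (K^T V), turning a cost quadratic in the number of pixels into a linear one. The sketch below illustrates this under that assumption; the tensor shapes and the 1/n scaling are illustrative, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def fast_spatial_attention(q, k, v):
    """q, k: (n, c') and v: (n, c) flattened pixel features for one image.

    L2-normalization stands in for the softmax, so matrix products can be
    reassociated: aggregating K^T V first costs O(n c' c) instead of the
    O(n^2 c) of forming the n x n pixel-affinity matrix.
    """
    n = q.shape[0]
    q = F.normalize(q, dim=1)          # cosine-similarity affinities
    k = F.normalize(k, dim=1)
    context = k.transpose(0, 1) @ v    # (c', c): one global context matrix
    return (q @ context) / n           # (n, c): per-pixel readout

def naive_attention(q, k, v):
    """Same normalization, but materializes the n x n affinity matrix."""
    q = F.normalize(q, dim=1)
    k = F.normalize(k, dim=1)
    return (q @ k.transpose(0, 1)) @ v / q.shape[0]

q, k, v = torch.randn(4096, 32), torch.randn(4096, 32), torch.randn(4096, 64)
assert torch.allclose(fast_spatial_attention(q, k, v),
                      naive_attention(q, k, v), atol=1e-5)
```

At real-time segmentation resolutions (a 2,048x1,024 input has roughly two million pixels), the n x n affinity matrix of standard self-attention is infeasible, while the tiny c' x c context matrix is cheap, which is what makes the linear form practical.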