FBSNet: A Fast Bilateral Symmetrical Network for Real-Time Semantic
Segmentation
- URL: http://arxiv.org/abs/2109.00699v1
- Date: Thu, 2 Sep 2021 04:16:39 GMT
- Title: FBSNet: A Fast Bilateral Symmetrical Network for Real-Time Semantic
Segmentation
- Authors: Guangwei Gao, Guoan Xu, Juncheng Li, Yi Yu, Huimin Lu, and Jian Yang
- Abstract summary: We propose a Fast Bilateral Symmetrical Network (FBSNet) for real-time semantic segmentation.
FBSNet employs a symmetrical encoder-decoder structure with two branches: a semantic information branch and a spatial detail branch.
Experimental results on Cityscapes and CamVid show that the proposed FBSNet strikes a good balance between accuracy and efficiency.
- Score: 23.25851281719734
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Real-time semantic segmentation, which can be understood as
pixel-level classification of the input image, has broad application
prospects, especially in the fast-developing fields of autonomous driving and
drone navigation. However, the heavy computational burden and redundant
parameters remain obstacles to its technological development. In this paper,
we propose a Fast Bilateral Symmetrical Network (FBSNet) to alleviate these
challenges. Specifically, FBSNet employs a symmetrical encoder-decoder
structure with two branches: a semantic information branch and a spatial
detail branch. The semantic information branch is the main branch, with a deep
network architecture that acquires the contextual information of the input
image while obtaining a sufficient receptive field. The spatial detail branch
is a shallow, simple network that establishes local dependencies for each
pixel to preserve details, which is essential for restoring the original
resolution during the decoding phase. In addition, a feature aggregation module
(FAM) is designed to effectively combine the output features of the two
branches. Experimental results on Cityscapes and CamVid show that the proposed
FBSNet strikes a good balance between accuracy and efficiency. Specifically,
with only 0.62 million parameters, it obtains 70.9% and 68.9% mIoU at
inference speeds of 90 fps and 120 fps on the two test sets, respectively, on a
single RTX 2080Ti GPU.
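The accuracy figures above are reported as mIoU (mean Intersection over Union). The abstract does not describe the evaluation protocol, so the following is only a minimal sketch of the standard definition, assuming flattened integer class labels and no ignored label; the function names and the toy data are hypothetical and not taken from the paper.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows are ground-truth classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_iou(cm: List[List[int]]) -> float:
    """Average the per-class IoU = TP / (TP + FP + FN).

    Classes that appear in neither the labels nor the predictions are skipped.
    """
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # class-c pixels predicted as another class
        fp = sum(cm[r][c] for r in range(n)) - tp  # other pixels predicted as class c
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six flattened pixels, three classes (hypothetical data).
labels = [0, 0, 1, 1, 2, 2]
preds = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(labels, preds, num_classes=3)
miou = mean_iou(cm)
print(f"mIoU = {miou:.3f}")  # per-class IoU is 1/3, 2/3, 1/2, so this prints mIoU = 0.500
```

In a dataset-level evaluation the confusion matrix is usually accumulated over all test images before the per-class IoU is averaged; benchmark-specific details such as ignored labels are left out of this sketch.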
Related papers
- Unite-Divide-Unite: Joint Boosting Trunk and Structure for High-accuracy
Dichotomous Image Segmentation [48.995367430746086]
High-accuracy Dichotomous Image Segmentation (DIS) aims to pinpoint category-agnostic foreground objects from natural scenes.
We introduce a novel Unite-Divide-Unite Network (UDUN) that restructures and bipartitely arranges complementary features to boost the effectiveness of trunk and structure identification.
Using 1024*1024 input, our model enables real-time inference at 65.3 fps with ResNet-18.
arXiv Detail & Related papers (2023-07-26T09:04:35Z)
- PIDNet: A Real-time Semantic Segmentation Network Inspired by PID
Controllers [6.0653144230649865]
Two-branch network architecture has shown its efficiency and effectiveness in real-time semantic segmentation tasks.
We propose a novel three-branch network architecture: PIDNet, which contains three branches to parse detailed, context and boundary information.
Our family of PIDNets achieve the best trade-off between inference speed and accuracy and their accuracy surpasses all the existing models with similar inference speed on the Cityscapes and CamVid datasets.
arXiv Detail & Related papers (2022-06-04T23:16:52Z)
- Adjacent Context Coordination Network for Salient Object Detection in
Optical Remote Sensing Images [102.75699068451166]
We propose a novel Adjacent Context Coordination Network (ACCoNet) to explore the coordination of adjacent features in an encoder-decoder architecture for optical RSI-SOD.
The proposed ACCoNet outperforms 22 state-of-the-art methods under nine evaluation metrics, and runs up to 81 fps on a single NVIDIA Titan X GPU.
arXiv Detail & Related papers (2022-03-25T14:14:55Z)
- Stage-Aware Feature Alignment Network for Real-Time Semantic
Segmentation of Street Scenes [59.81228011432776]
We present a novel Stage-aware Feature Alignment Network (SFANet) for real-time semantic segmentation of street scenes.
By taking into account the unique role of each stage in the decoder, a novel stage-aware Feature Enhancement Block (FEB) is designed to enhance spatial details and contextual information of feature maps from the encoder.
Experimental results show that the proposed SFANet exhibits a good balance between accuracy and speed for real-time semantic segmentation of street scenes.
arXiv Detail & Related papers (2022-03-08T11:46:41Z)
- Feature Reuse and Fusion for Real-time Semantic segmentation [0.0]
How to increase the speed while maintaining high resolution is a problem that has been discussed and solved.
We hope to design a light-weight network based on previous design experience and reach the level of state-of-the-art real-time semantic segmentation.
arXiv Detail & Related papers (2021-05-27T06:47:02Z)
- BiCnet-TKS: Learning Efficient Spatial-Temporal Representation for Video
Person Re-Identification [86.73532136686438]
We present an efficient spatial-temporal representation for video person re-identification (reID).
We propose a Bilateral Complementary Network (BiCnet) for spatial complementarity modeling.
BiCnet-TKS outperforms state-of-the-art methods with about 50% less computation.
arXiv Detail & Related papers (2021-04-30T06:44:34Z)
- Rethinking BiSeNet For Real-time Semantic Segmentation [6.622485130017622]
BiSeNet has been proved to be a popular two-stream network for real-time segmentation.
We propose a novel structure named Short-Term Dense Concatenate network (STDC) by removing structure redundancy.
arXiv Detail & Related papers (2021-04-27T13:49:47Z)
- MSCFNet: A Lightweight Network With Multi-Scale Context Fusion for
Real-Time Semantic Segmentation [27.232578592161673]
We devise a novel lightweight network using a multi-scale context fusion scheme (MSCFNet).
The proposed MSCFNet contains only 1.15M parameters, achieves 71.9% Mean IoU and can run at over 50 FPS on a single Titan XP GPU configuration.
arXiv Detail & Related papers (2021-03-24T08:28:26Z)
- Boundary-Aware Segmentation Network for Mobile and Web Applications [60.815545591314915]
Boundary-Aware Network (BASNet) is integrated with a predict-refine architecture and a hybrid loss for highly accurate image segmentation.
BASNet runs at over 70 fps on a single GPU which benefits many potential real applications.
Based on BASNet, we further developed two (close to) commercial applications: AR COPY & PASTE, in which BASNet is integrated with augmented reality for "COPYING" and "PASTING" real-world objects, and OBJECT CUT, which is a web-based tool for automatic object background removal.
arXiv Detail & Related papers (2021-01-12T19:20:26Z)
- Real-time Semantic Segmentation with Context Aggregation Network [14.560708848716754]
We propose a dual branch convolutional neural network, with significantly lower computational costs as compared to the state-of-the-art.
We evaluate our method on two semantic segmentation datasets, namely Cityscapes dataset and UAVid dataset.
arXiv Detail & Related papers (2020-11-02T14:16:23Z)
- Real-Time High-Performance Semantic Image Segmentation of Urban Street
Scenes [98.65457534223539]
We propose a real-time high-performance DCNN-based method for robust semantic segmentation of urban street scenes.
The proposed method achieves the accuracy of 73.6% and 68.0% mean Intersection over Union (mIoU) with the inference speed of 51.0 fps and 39.3 fps.
arXiv Detail & Related papers (2020-03-11T08:45:53Z)