Deep feature selection-and-fusion for RGB-D semantic segmentation
- URL: http://arxiv.org/abs/2105.04102v1
- Date: Mon, 10 May 2021 04:02:32 GMT
- Title: Deep feature selection-and-fusion for RGB-D semantic segmentation
- Authors: Yuejiao Su, Yuan Yuan, Zhiyu Jiang
- Abstract summary: This work proposes a unified and efficient feature selection-and-fusion network (FSFNet).
FSFNet contains a symmetric cross-modality residual fusion module used for explicit fusion of multi-modality information.
Experimental evaluations demonstrate that the proposed model achieves performance competitive with state-of-the-art methods on two public datasets.
- Score: 8.831857715361624
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Scene depth information can complement visual information for more accurate semantic segmentation. However, how to effectively integrate multi-modality information into representative features remains an open problem. Most existing work uses DCNNs to fuse multi-modality information implicitly, but as the network deepens, some critical distinguishing features may be lost, which degrades segmentation performance. This work proposes a unified and efficient feature selection-and-fusion network (FSFNet), which contains a symmetric cross-modality residual fusion module for explicit fusion of multi-modality information. In addition, the network includes a detailed feature propagation module that maintains low-level detailed information during the forward pass. Experimental evaluations demonstrate that the proposed model achieves performance competitive with state-of-the-art methods on two public datasets.
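The abstract describes the symmetric cross-modality residual fusion module only at a high level. The sketch below is a minimal, hypothetical reading of such a block, assuming PyTorch, equal-shaped RGB and depth feature maps, and a simple gated residual exchange; the class name CrossModalityResidualFusion, the reduction parameter, and the gating design are illustrative assumptions rather than the paper's implementation.

```python
# Minimal sketch (not the paper's code): a symmetric cross-modality residual
# fusion block in which each stream keeps its own features and adds an
# explicitly gated residual contribution from the other modality.
import torch
import torch.nn as nn

class CrossModalityResidualFusion(nn.Module):
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        def gate():
            # Lightweight bottleneck that predicts per-channel fusion weights.
            return nn.Sequential(
                nn.Conv2d(2 * channels, channels // reduction, kernel_size=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels // reduction, channels, kernel_size=1),
                nn.Sigmoid(),
            )
        self.gate_rgb = gate()
        self.gate_depth = gate()

    def forward(self, feat_rgb: torch.Tensor, feat_depth: torch.Tensor):
        joint = torch.cat([feat_rgb, feat_depth], dim=1)
        # Symmetric residual updates: explicit, selective injection of the
        # other modality instead of implicit fusion by deeper layers.
        out_rgb = feat_rgb + self.gate_rgb(joint) * feat_depth
        out_depth = feat_depth + self.gate_depth(joint) * feat_rgb
        return out_rgb, out_depth

if __name__ == "__main__":
    rgb = torch.randn(2, 64, 60, 80)
    depth = torch.randn(2, 64, 60, 80)
    out_rgb, out_depth = CrossModalityResidualFusion(64)(rgb, depth)
    print(out_rgb.shape, out_depth.shape)  # both torch.Size([2, 64, 60, 80])
```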
Related papers
- FANet: Feature Amplification Network for Semantic Segmentation in Cluttered Background [9.970265640589966]
Existing deep learning approaches leave out semantic cues that are crucial for semantic segmentation in complex scenarios.
We propose a feature amplification network (FANet) as a backbone that incorporates semantic information via a novel feature enhancement module applied at multiple stages.
Our experimental results demonstrate state-of-the-art performance compared to existing methods.
arXiv Detail & Related papers (2024-07-12T15:57:52Z)
- MCFNet: Multi-scale Covariance Feature Fusion Network for Real-time Semantic Segmentation [6.0118706234809975]
We propose a new architecture based on the bilateral network BiSeNet, called the Multi-scale Covariance Feature Fusion Network (MCFNet).
Specifically, this network introduces a new feature refinement module and a new feature fusion module.
We evaluate the proposed model on the Cityscapes and CamVid datasets and compare it with state-of-the-art methods.
arXiv Detail & Related papers (2023-12-12T12:20:27Z)
- Mutual-Guided Dynamic Network for Image Fusion [51.615598671899335]
We propose a novel mutual-guided dynamic network (MGDN) for image fusion, which allows for effective information utilization across different locations and inputs.
Experimental results on five benchmark datasets demonstrate that our proposed method outperforms existing methods on four image fusion tasks.
arXiv Detail & Related papers (2023-08-24T03:50:37Z)
- Object Segmentation by Mining Cross-Modal Semantics [68.88086621181628]
We propose a novel approach by mining the Cross-Modal Semantics to guide the fusion and decoding of multimodal features.
Specifically, we propose a novel network, termed XMSNet, consisting of (1) all-round attentive fusion (AF), (2) coarse-to-fine decoder (CFD), and (3) cross-layer self-supervision.
arXiv Detail & Related papers (2023-05-17T14:30:11Z)
- Feature Aggregation and Propagation Network for Camouflaged Object Detection [42.33180748293329]
Camouflaged object detection (COD) aims to detect/segment camouflaged objects embedded in the environment.
Several COD methods have been developed, but they still suffer from unsatisfactory performance due to intrinsic similarities between foreground objects and background surroundings.
We propose a novel Feature Aggregation and Propagation Network (FAP-Net) for camouflaged object detection.
arXiv Detail & Related papers (2022-12-02T05:54:28Z)
- PSNet: Parallel Symmetric Network for Video Salient Object Detection [85.94443548452729]
We propose a video salient object detection (VSOD) network with up-and-down parallel symmetry, named PSNet.
Two parallel branches with different dominant modalities are set to achieve complete video saliency decoding.
arXiv Detail & Related papers (2022-10-12T04:11:48Z)
- Specificity-preserving RGB-D Saliency Detection [103.3722116992476]
We propose a specificity-preserving network (SP-Net) for RGB-D saliency detection.
Two modality-specific networks and a shared learning network are adopted to generate individual and shared saliency maps.
Experiments on six benchmark datasets demonstrate that our SP-Net outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2021-08-18T14:14:22Z)
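The SP-Net entry above names two modality-specific networks plus a shared learning network that produce individual and shared saliency maps, but gives no further detail. The following is a minimal sketch of that layout under simplified assumptions (single convolutional blocks in place of full backbones); the class SpecificityPreservingSketch and all layer names are hypothetical.

```python
# Minimal sketch (not the SP-Net code): modality-specific branches plus a
# shared branch, each predicting its own saliency map.
import torch
import torch.nn as nn

def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class SpecificityPreservingSketch(nn.Module):
    def __init__(self, channels: int = 32):
        super().__init__()
        self.rgb_branch = conv_block(3, channels)     # modality-specific (RGB)
        self.depth_branch = conv_block(1, channels)   # modality-specific (depth)
        self.shared_branch = conv_block(2 * channels, channels)  # shared learning
        self.head_rgb = nn.Conv2d(channels, 1, kernel_size=1)
        self.head_depth = nn.Conv2d(channels, 1, kernel_size=1)
        self.head_shared = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, rgb: torch.Tensor, depth: torch.Tensor):
        f_rgb = self.rgb_branch(rgb)
        f_depth = self.depth_branch(depth)
        f_shared = self.shared_branch(torch.cat([f_rgb, f_depth], dim=1))
        # Individual maps preserve modality-specific cues; the shared map
        # aggregates both modalities.
        return (torch.sigmoid(self.head_rgb(f_rgb)),
                torch.sigmoid(self.head_depth(f_depth)),
                torch.sigmoid(self.head_shared(f_shared)))
```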
- EPMF: Efficient Perception-aware Multi-sensor Fusion for 3D Semantic Segmentation [62.210091681352914]
We study multi-sensor fusion for 3D semantic segmentation, which underpins many applications such as autonomous driving and robotics.
In this work, we investigate a collaborative fusion scheme called perception-aware multi-sensor fusion (PMF).
We propose a two-stream network to extract features from the two modalities separately. The extracted features are fused by effective residual-based fusion modules.
arXiv Detail & Related papers (2021-06-21T10:47:26Z)
- CSRNet: Cascaded Selective Resolution Network for Real-time Semantic Segmentation [18.63596070055678]
We propose a light Cascaded Selective Resolution Network (CSRNet) to improve the performance of real-time segmentation.
The proposed network builds a three-stage segmentation system, which integrates feature information from low resolution to high resolution.
Experiments on two well-known datasets demonstrate that the proposed CSRNet effectively improves the performance for real-time segmentation.
arXiv Detail & Related papers (2021-06-08T14:22:09Z)
- Deep Multimodal Fusion by Channel Exchanging [87.40768169300898]
This paper proposes a parameter-free multimodal fusion framework that dynamically exchanges channels between sub-networks of different modalities.
The validity of such an exchanging process is also guaranteed by sharing convolutional filters yet keeping separate BN layers across modalities, which, as an added benefit, allows our multimodal architecture to be almost as compact as a unimodal network.
arXiv Detail & Related papers (2020-11-10T09:53:20Z)
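The channel-exchanging entry above describes the mechanism only in words. The sketch below illustrates the general idea under simplified assumptions: two modality streams share the same convolution weights, keep separate BN layers, and any channel whose BN scaling factor has shrunk below a small threshold is replaced by the other stream's corresponding channel. The threshold value, class name, and exchange policy here are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch (not the paper's code): threshold-based channel exchanging
# between two modality streams that share conv filters but keep separate BN.
import torch
import torch.nn as nn

class ChannelExchangeBlock(nn.Module):
    def __init__(self, channels: int, threshold: float = 2e-2):
        super().__init__()
        # One shared convolution keeps the architecture nearly unimodal in size.
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn_a = nn.BatchNorm2d(channels)  # modality A keeps its own BN
        self.bn_b = nn.BatchNorm2d(channels)  # modality B keeps its own BN
        self.threshold = threshold

    def forward(self, x_a: torch.Tensor, x_b: torch.Tensor):
        y_a = self.bn_a(self.conv(x_a))
        y_b = self.bn_b(self.conv(x_b))
        # A BN scale close to zero suggests the channel carries little
        # modality-specific signal, so it is swapped with the other stream.
        mask_a = (self.bn_a.weight.abs() < self.threshold).view(1, -1, 1, 1)
        mask_b = (self.bn_b.weight.abs() < self.threshold).view(1, -1, 1, 1)
        out_a = torch.where(mask_a, y_b, y_a)
        out_b = torch.where(mask_b, y_a, y_b)
        return torch.relu(out_a), torch.relu(out_b)
```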
This list is automatically generated from the titles and abstracts of the papers on this site.