S3Net: Innovating Stereo Matching and Semantic Segmentation with a Single-Branch Semantic Stereo Network in Satellite Epipolar Imagery
- URL: http://arxiv.org/abs/2401.01643v3
- Date: Tue, 01 Oct 2024 05:24:20 GMT
- Title: S3Net: Innovating Stereo Matching and Semantic Segmentation with a Single-Branch Semantic Stereo Network in Satellite Epipolar Imagery
- Authors: Qingyuan Yang, Guanzhou Chen, Xiaoliang Tan, Tong Wang, Jiaqi Wang, Xiaodong Zhang
- Abstract summary: This work introduces a solution, the Single-branch Semantic Stereo Network (S3Net), which innovatively combines semantic segmentation and stereo matching.
Our method identifies and leverages the intrinsic link between these two tasks, leading to a more accurate understanding of semantic information and disparity estimation.
Our model improves the mIoU in semantic segmentation from 61.38 to 67.39, and reduces the D1-Error and average endpoint error (EPE) in disparity estimation from 10.051 to 9.579 and from 1.439 to 1.403, respectively.
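For reference, a minimal sketch of how these two disparity metrics are commonly computed. The thresholds follow the usual KITTI-style D1 convention (error > 3 px and > 5% of ground truth); the paper's exact evaluation protocol may differ.

```python
import numpy as np

def avg_epe(pred, gt, valid):
    """Average endpoint error over valid ground-truth pixels."""
    return float(np.abs(pred - gt)[valid].mean())

def d1_error(pred, gt, valid, abs_thresh=3.0, rel_thresh=0.05):
    """Percentage of valid pixels whose disparity error exceeds both
    abs_thresh pixels and rel_thresh * |ground truth| (KITTI-style D1)."""
    err = np.abs(pred - gt)[valid]
    bad = (err > abs_thresh) & (err > rel_thresh * np.abs(gt)[valid])
    return float(100.0 * bad.mean())

# Toy 2x2 disparity maps: only one pixel (error 4.0 px > 3 px and
# > 0.05 * 20) counts toward D1.
gt = np.array([[10.0, 20.0], [30.0, 40.0]])
pred = np.array([[10.5, 24.0], [30.0, 40.0]])
valid = gt > 0
print(avg_epe(pred, gt, valid))   # mean of [0.5, 4.0, 0.0, 0.0] -> 1.125
print(d1_error(pred, gt, valid))  # 1 of 4 pixels is bad -> 25.0
```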
- Abstract: Stereo matching and semantic segmentation are significant tasks in binocular satellite 3D reconstruction. However, previous studies primarily view these as independent parallel tasks, lacking an integrated multitask learning framework. This work introduces a solution, the Single-branch Semantic Stereo Network (S3Net), which innovatively combines semantic segmentation and stereo matching using Self-Fuse and Mutual-Fuse modules. Unlike preceding methods that utilize semantic or disparity information independently, our method identifies and leverages the intrinsic link between these two tasks, leading to a more accurate understanding of semantic information and disparity estimation. Comparative testing on the US3D dataset proves the effectiveness of our S3Net. Our model improves the mIoU in semantic segmentation from 61.38 to 67.39, and reduces the D1-Error and average endpoint error (EPE) in disparity estimation from 10.051 to 9.579 and from 1.439 to 1.403, respectively, surpassing existing competitive methods. Our codes are available at: https://github.com/CVEO/S3Net.
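To illustrate the general idea of mutual fusion between the two tasks, here is a toy NumPy sketch: semantic confidence re-weights the disparity cost volume, and the entropy of the resulting disparity distribution modulates the semantic logits. Note that `mutual_fuse`, the weighting scheme, and `alpha` are illustrative assumptions, not the paper's actual Self-Fuse or Mutual-Fuse modules.

```python
import numpy as np

def softmax(x, axis=0):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def mutual_fuse(sem_logits, cost_volume, alpha=0.5):
    """Toy cross-task fusion (hypothetical, not the paper's module).

    sem_logits:  (C, H, W) per-class semantic logits
    cost_volume: (D, H, W) matching cost per disparity hypothesis
    Returns refined semantic logits (C, H, W) and a disparity map (H, W).
    """
    # Per-pixel semantic confidence scales the cost volume.
    sem_prob = softmax(sem_logits, axis=0)
    sem_conf = sem_prob.max(axis=0)                              # (H, W)
    fused_cost = cost_volume * (alpha + (1 - alpha) * sem_conf)  # broadcast

    # Soft-argmin over disparities (lower cost -> higher probability).
    disp_prob = softmax(-fused_cost, axis=0)
    disparities = np.arange(cost_volume.shape[0]).reshape(-1, 1, 1)
    disparity = (disp_prob * disparities).sum(axis=0)            # (H, W)

    # Low-entropy (confident) disparity distributions sharpen the
    # semantic logits; high-entropy ones leave them nearly unchanged.
    entropy = -(disp_prob * np.log(disp_prob + 1e-8)).sum(axis=0)
    norm_entropy = entropy / np.log(cost_volume.shape[0])
    fused_sem = sem_logits * (1 + alpha * (1 - norm_entropy))
    return fused_sem, disparity

# Usage on random feature maps: 5 classes, 16 disparity levels, 4x4 pixels.
sem = np.random.randn(5, 4, 4)
cost = np.random.rand(16, 4, 4)
fused_sem, disp = mutual_fuse(sem, cost)
print(fused_sem.shape, disp.shape)  # (5, 4, 4) (4, 4)
```

The real network learns these interactions with convolutional fusion modules inside a single branch; this sketch only shows the kind of cross-task signal exchange the abstract describes.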
Related papers
- Bayesian Self-Training for Semi-Supervised 3D Segmentation
3D segmentation is a core problem in computer vision.
However, densely labeling 3D point clouds for fully-supervised training remains too labor-intensive and expensive.
Semi-supervised training provides a more practical alternative, where only a small set of labeled data is given, accompanied by a larger unlabeled set.
arXiv Detail & Related papers (2024-09-12T14:54:31Z)
- Large Generative Model Assisted 3D Semantic Communication
We propose a Generative AI Model assisted 3D SC (GAM-3DSC) system.
First, we introduce a 3D Semantic Extractor (3DSE) to extract key semantics from a 3D scenario based on user requirements.
We then present an Adaptive Semantic Compression Model (ASCM) for encoding these multi-perspective images.
Finally, we design a conditional Generative Adversarial Network and Diffusion model aided Channel Estimation (GDCE) scheme to estimate and refine the Channel State Information (CSI) of physical channels.
arXiv Detail & Related papers (2024-03-09T03:33:07Z)
- OneFormer3D: One Transformer for Unified Point Cloud Segmentation
This paper presents a unified, simple, and effective model addressing semantic, instance, and panoptic segmentation tasks jointly.
The model, named OneFormer3D, performs instance and semantic segmentation consistently, using a group of learnable kernels.
We also demonstrate the state-of-the-art results in semantic, instance, and panoptic segmentation of ScanNet, ScanNet200, and S3DIS datasets.
arXiv Detail & Related papers (2023-11-24T10:56:27Z)
- SATR: Zero-Shot Semantic Segmentation of 3D Shapes
We explore the task of zero-shot semantic segmentation of 3D shapes by using large-scale off-the-shelf 2D image recognition models.
We develop the Assignment with Topological Reweighting (SATR) algorithm and evaluate it on ShapeNetPart and our proposed FAUST benchmarks.
SATR achieves state-of-the-art performance and outperforms a baseline algorithm by 1.3% and 4% average mIoU.
arXiv Detail & Related papers (2023-04-11T00:43:16Z)
- ALSO: Automotive Lidar Self-supervision by Occupancy estimation
We propose a new self-supervised method for pre-training the backbone of deep perception models operating on point clouds.
The core idea is to train the model on a pretext task which is the reconstruction of the surface on which the 3D points are sampled.
The intuition is that if the network is able to reconstruct the scene surface, given only sparse input points, then it probably also captures some fragments of semantic information.
arXiv Detail & Related papers (2022-12-12T13:10:19Z)
- SVNet: Where SO(3) Equivariance Meets Binarization on Point Cloud Representation
The paper tackles the challenge by designing a general framework to construct 3D learning architectures.
The proposed approach can be applied to general backbones like PointNet and DGCNN.
Experiments on ModelNet40, ShapeNet, and the real-world dataset ScanObjectNN demonstrate that the method achieves a good trade-off between efficiency, rotation robustness, and accuracy.
arXiv Detail & Related papers (2022-09-13T12:12:19Z)
- Unsupervised Representation Learning for 3D Point Cloud Data
We propose a simple yet effective approach for unsupervised point cloud learning.
In particular, we identify a very useful transformation which generates a good contrastive version of an original point cloud.
We conduct experiments on three downstream tasks which are 3D object classification, shape part segmentation and scene segmentation.
arXiv Detail & Related papers (2021-10-13T10:52:45Z)
- S3Net: 3D LiDAR Sparse Semantic Segmentation Network
S3Net is a novel convolutional neural network for LiDAR point cloud semantic segmentation.
It adopts an encoder-decoder backbone that consists of a Sparse Intra-channel Attention Module (SIntraAM) and a Sparse Inter-channel Attention Module (SInterAM).
arXiv Detail & Related papers (2021-03-15T22:15:24Z)
- (AF)2-S3Net: Attentive Feature Fusion with Adaptive Feature Selection for Sparse Semantic Segmentation Network
We propose AF2-S3Net, an end-to-end encoder-decoder CNN network for 3D LiDAR semantic segmentation.
We present a novel multi-branch attentive feature fusion module in the encoder and a unique adaptive feature selection module with feature map re-weighting in the decoder.
Our experimental results show that the proposed method outperforms the state-of-the-art approaches on the large-scale SemanticKITTI benchmark.
arXiv Detail & Related papers (2021-02-08T21:04:21Z)
- HITNet: Hierarchical Iterative Tile Refinement Network for Real-time Stereo Matching
HITNet is a novel neural network architecture for real-time stereo matching.
Our architecture is inherently multi-resolution allowing the propagation of information across different levels.
At the time of writing, HITNet ranks 1st-3rd on all the metrics published on the ETH3D website for two view stereo.
arXiv Detail & Related papers (2020-07-23T17:11:48Z)
- JSENet: Joint Semantic Segmentation and Edge Detection Network for 3D Point Clouds
In this paper, we tackle the 3D semantic edge detection task for the first time.
We present a new two-stream fully-convolutional network that jointly performs the two tasks.
In particular, we design a joint refinement module that explicitly wires region information and edge information to improve the performances of both tasks.
arXiv Detail & Related papers (2020-07-14T08:00:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.