Dilated SpineNet for Semantic Segmentation
- URL: http://arxiv.org/abs/2103.12270v1
- Date: Tue, 23 Mar 2021 02:39:04 GMT
- Title: Dilated SpineNet for Semantic Segmentation
- Authors: Abdullah Rashwan and Xianzhi Du and Xiaoqi Yin and Jing Li
- Abstract summary: Scale-permuted networks have shown promising results on object bounding box detection and instance segmentation.
In this work, we evaluate this meta-architecture design on semantic segmentation.
We propose SpineNet-Seg, a network discovered by neural architecture search (NAS) that is searched from the DeepLabv3 system.
- Score: 5.6590540986523035
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Scale-permuted networks have shown promising results on object bounding box
detection and instance segmentation. Scale permutation and cross-scale fusion
of features enable the network to capture multi-scale semantics while
preserving spatial resolution. In this work, we evaluate this meta-architecture
design on semantic segmentation, another vision task that benefits from high
spatial resolution and multi-scale feature fusion at different network stages.
By further leveraging dilated convolution operations, we propose SpineNet-Seg,
a network discovered by NAS that is searched from the DeepLabv3 system.
SpineNet-Seg is designed with a better scale-permuted network topology with
customized dilation ratios per block on a semantic segmentation task.
SpineNet-Seg models outperform the DeepLabv3/v3+ baselines at all model scales
on multiple popular benchmarks in speed and accuracy. In particular, our
SpineNet-S143+ model achieves a new state of the art on the popular
Cityscapes benchmark at 83.04% mIoU and attains strong performance on the
PASCAL VOC2012 benchmark at 85.56% mIoU. SpineNet-Seg models also show
promising results on a challenging Street View segmentation dataset. Code and
checkpoints will be open-sourced.
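The per-block dilation ratios mentioned in the abstract build on dilated (atrous) convolution, where kernel taps are spaced `dilation` positions apart so that a small kernel covers a wider input span without reducing resolution. The following is only a rough 1D sketch of that general operation, not the SpineNet-Seg implementation; the function names `dilated_conv1d` and `effective_kernel_size` are hypothetical and chosen for illustration.

```python
def effective_kernel_size(kernel_size, dilation):
    """Number of input positions spanned by one output of a dilated kernel."""
    return dilation * (kernel_size - 1) + 1


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1D cross-correlation with taps spaced `dilation` apart.

    Illustrative helper (an assumption, not code from the paper).
    """
    span = effective_kernel_size(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        acc = 0.0
        for k, weight in enumerate(kernel):
            # Tap k reads the input at an offset of k * dilation.
            acc += weight * signal[start + k * dilation]
        out.append(acc)
    return out


signal = [float(i) for i in range(10)]
kernel = [1.0, 1.0, 1.0]

dense = dilated_conv1d(signal, kernel, dilation=1)    # each output spans 3 inputs
dilated = dilated_conv1d(signal, kernel, dilation=2)  # each output spans 5 inputs
print(effective_kernel_size(3, 2), dilated)
```

With the same three weights, a dilation of 2 widens the span of each output from 3 to 5 input positions, which is the property that lets a segmentation backbone enlarge its receptive field while keeping spatial resolution.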
Related papers
- Y-CA-Net: A Convolutional Attention Based Network for Volumetric Medical Image Segmentation [47.12719953712902]
Discriminative local features are key components for the performance of attention-based volumetric segmentation (VS) methods.
We incorporate the convolutional encoder branch with transformer backbone to extract local and global features in a parallel manner.
Y-CT-Net achieves competitive performance on multiple medical segmentation tasks.
arXiv Detail & Related papers (2024-10-01T18:50:45Z)
- PointeNet: A Lightweight Framework for Effective and Efficient Point Cloud Analysis [28.54939134635978]
PointeNet is a network designed specifically for point cloud analysis.
Our method demonstrates flexibility by seamlessly integrating with a classification/segmentation head or embedding into off-the-shelf 3D object detection networks.
Experiments on object-level datasets, including ModelNet40, ScanObjectNN, and ShapeNet, and on the scene-level dataset KITTI, demonstrate the superior performance of PointeNet over state-of-the-art methods in point cloud analysis.
arXiv Detail & Related papers (2023-12-20T03:34:48Z)
- SODAWideNet -- Salient Object Detection with an Attention augmented Wide Encoder Decoder network without ImageNet pre-training [3.66237529322911]
We explore developing a neural network from scratch directly trained on Salient Object Detection without ImageNet pre-training.
We propose SODAWideNet, an encoder-decoder-style network for Salient Object Detection.
Two variants, SODAWideNet-S (3.03M) and SODAWideNet (9.03M), achieve competitive performance against state-of-the-art models on five datasets.
arXiv Detail & Related papers (2023-11-08T16:53:44Z)
- SVNet: Where SO(3) Equivariance Meets Binarization on Point Cloud Representation [65.4396959244269]
The paper tackles the challenge of combining SO(3) equivariance with network binarization by designing a general framework to construct 3D learning architectures.
The proposed approach can be applied to general backbones like PointNet and DGCNN.
Experiments on ModelNet40, ShapeNet, and the real-world dataset ScanObjectNN demonstrate that the method achieves a great trade-off between efficiency, rotation robustness, and accuracy.
arXiv Detail & Related papers (2022-09-13T12:12:19Z)
- Lightweight and Progressively-Scalable Networks for Semantic Segmentation [100.63114424262234]
Multi-scale learning frameworks have been regarded as a capable class of models to boost semantic segmentation.
In this paper, we thoroughly analyze the design of convolutional blocks and the ways of interactions across multiple scales.
We devise Lightweight and Progressively-Scalable Networks (LPS-Net), which expand network complexity in a greedy manner.
arXiv Detail & Related papers (2022-07-27T16:00:28Z)
- Point-Unet: A Context-aware Point-based Neural Network for Volumetric Segmentation [18.81644604997336]
We propose Point-Unet, a novel method that incorporates the efficiency of deep learning with 3D point clouds into volumetric segmentation.
Our key idea is to first predict the regions of interest in the volume by learning an attentional probability map.
A comprehensive benchmark on different metrics has shown that our context-aware Point-Unet robustly outperforms the SOTA voxel-based networks.
arXiv Detail & Related papers (2022-03-16T22:02:08Z)
- Background-Aware 3D Point Cloud Segmentation with Dynamic Point Feature Aggregation [12.093182949686781]
We propose a novel 3D point cloud learning network, referred to as Dynamic Point Feature Aggregation Network (DPFA-Net).
DPFA-Net has two variants for semantic segmentation and classification of 3D point clouds.
It achieves the state-of-the-art overall accuracy score for semantic segmentation on the S3DIS dataset.
arXiv Detail & Related papers (2021-11-14T05:46:05Z)
- Location-Sensitive Visual Recognition with Cross-IOU Loss [177.86369890708457]
This paper proposes a unified solution named location-sensitive network (LSNet) for object detection, instance segmentation, and pose estimation.
Based on a deep neural network as the backbone, LSNet predicts an anchor point and a set of landmarks which together define the shape of the target object.
arXiv Detail & Related papers (2021-04-11T02:17:14Z)
- SALA: Soft Assignment Local Aggregation for Parameter Efficient 3D Semantic Segmentation [65.96170587706148]
We focus on designing a point local aggregation function that yields parameter efficient networks for 3D point cloud semantic segmentation.
We explore the idea of using learnable neighbor-to-grid soft assignment in grid-based aggregation functions.
arXiv Detail & Related papers (2020-12-29T20:16:37Z)
- Real-Time Semantic Segmentation via Auto Depth, Downsampling Joint Decision and Feature Aggregation [54.28963233377946]
We propose a joint search framework, called AutoRTNet, to automate the design of segmentation strategies.
Specifically, we propose hyper-cells to jointly decide the network depth and downsampling strategy, and an aggregation cell to achieve automatic multi-scale feature aggregation.
arXiv Detail & Related papers (2020-03-31T14:02:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.