Dynamically pruning segformer for efficient semantic segmentation
- URL: http://arxiv.org/abs/2111.09499v1
- Date: Thu, 18 Nov 2021 03:34:28 GMT
- Title: Dynamically pruning segformer for efficient semantic segmentation
- Authors: Haoli Bai, Hongda Mao, Dinesh Nair
- Abstract summary: We seek to design a lightweight SegFormer for efficient semantic segmentation.
Based on the observation that neurons in SegFormer layers exhibit large variances across different images, we propose a dynamic gated linear layer.
We also introduce two-stage knowledge distillation to transfer knowledge from the original teacher to the pruned student network.
- Score: 8.29672153078638
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As one of the successful Transformer-based models in computer vision tasks,
SegFormer demonstrates superior performance in semantic segmentation.
Nevertheless, the high computational cost greatly challenges the deployment of
SegFormer on edge devices. In this paper, we seek to design a lightweight
SegFormer for efficient semantic segmentation. Based on the observation that
neurons in SegFormer layers exhibit large variances across different images, we
propose a dynamic gated linear layer, which prunes the most uninformative set
of neurons based on the input instance. To improve the dynamically pruned
SegFormer, we also introduce a two-stage knowledge distillation scheme to transfer
knowledge from the original teacher to the pruned student network.
Experimental results show that our method can significantly reduce the
computation overhead of SegFormer without an apparent performance drop. For
instance, we can achieve 36.9% mIoU with only 3.3G FLOPs on ADE20K, saving more
than 60% of the computation at a drop of only 0.5% in mIoU.
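The input-dependent pruning idea from the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the gate weights `G`, the top-k selection rule, and all shapes are assumptions chosen for demonstration. The key point is that which neurons survive depends on the input instance, so the kept set changes from image to image.

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, keep = 8, 16, 4            # keep 4 of 16 output neurons per input
W = rng.standard_normal((d_in, d_out))  # main linear weights
G = rng.standard_normal((d_in, d_out))  # assumed lightweight gate weights

def dynamic_gated_linear(x):
    """Sketch of a dynamic gated linear layer: score each output neuron
    for this particular input, keep only the top-k, and compute the
    linear projection for the surviving neurons alone."""
    scores = x @ G                        # per-neuron informativeness scores
    kept = np.argsort(scores)[-keep:]     # indices of the top-k neurons
    out = np.zeros(d_out)
    out[kept] = x @ W[:, kept]            # compute only the kept neurons
    return out, kept

x = rng.standard_normal(d_in)
y, kept = dynamic_gated_linear(x)
```

Because only `keep` of the `d_out` columns of `W` are multiplied per input, the per-instance cost of the layer scales with the number of surviving neurons rather than the full width, which is the source of the FLOP savings the abstract reports.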
Related papers
- Dynamic layer selection in decoder-only transformers [21.18795712840146]
We empirically examine two common dynamic inference methods for natural language generation.
We find that a pre-trained decoder-only model is significantly more robust to layer removal via layer skipping.
We also show that dynamic computation allocation on a per-sequence basis holds promise for significant efficiency gains.
arXiv Detail & Related papers (2024-10-26T00:44:11Z)
- No Time to Train: Empowering Non-Parametric Networks for Few-shot 3D Scene Segmentation [40.0506169981233]
We propose a Non-parametric Network for few-shot 3D, Seg-NN, and its Parametric variant, Seg-PN.
Seg-NN extracts dense representations with hand-crafted filters and achieves comparable performance to existing parametric models.
Experiments suggest that Seg-PN outperforms the previous state-of-the-art method by +4.19% and +7.71% mIoU on the S3DIS and ScanNet datasets, respectively.
arXiv Detail & Related papers (2024-04-05T12:09:36Z)
- RTFormer: Efficient Design for Real-Time Semantic Segmentation with Transformer [63.25665813125223]
We propose RTFormer, an efficient dual-resolution transformer for real-time semantic segmentation.
It achieves a better trade-off between performance and efficiency than CNN-based models.
Experiments on mainstream benchmarks demonstrate the effectiveness of our proposed RTFormer.
arXiv Detail & Related papers (2022-10-13T16:03:53Z) - SegNeXt: Rethinking Convolutional Attention Design for Semantic
Segmentation [100.89770978711464]
We present SegNeXt, a simple convolutional network architecture for semantic segmentation.
We show that convolutional attention is a more efficient and effective way to encode contextual information than the self-attention mechanism in transformers.
arXiv Detail & Related papers (2022-09-18T14:33:49Z) - CloudAttention: Efficient Multi-Scale Attention Scheme For 3D Point
Cloud Learning [81.85951026033787]
We adopt transformers in this work and incorporate them into a hierarchical framework for shape classification as well as part and scene segmentation.
We also compute efficient and dynamic global cross attentions by leveraging sampling and grouping at each iteration.
The proposed hierarchical model achieves state-of-the-art shape classification in mean accuracy and yields results on par with the previous segmentation methods.
arXiv Detail & Related papers (2022-07-31T21:39:15Z)
- SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers [79.646577541655]
We present SegFormer, a semantic segmentation framework which unifies Transformers with lightweight multilayer perceptron (MLP) decoders.
SegFormer comprises a novel hierarchically structured encoder that outputs multiscale features.
The proposed decoder aggregates information from different layers, thus combining local and global attention to produce powerful representations.
arXiv Detail & Related papers (2021-05-31T17:59:51Z)
- Scaling Semantic Segmentation Beyond 1K Classes on a Single GPU [87.48110331544885]
We propose a novel training methodology to train and scale the existing semantic segmentation models.
We demonstrate a clear benefit of our approach on a dataset with 1284 classes, bootstrapped from LVIS and COCO annotations, with three times better mIoU than the DeeplabV3+ model.
arXiv Detail & Related papers (2020-12-14T13:12:38Z)
- Unifying Instance and Panoptic Segmentation with Dynamic Rank-1 Convolutions [109.2706837177222]
DR1Mask is the first panoptic segmentation framework that exploits a shared feature map for both instance and semantic segmentation.
As a byproduct, DR1Mask is 10% faster and 1 mAP point more accurate than the previous state-of-the-art instance segmentation network, BlendMask.
arXiv Detail & Related papers (2020-11-19T12:42:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.