Augmentation-Free Dense Contrastive Knowledge Distillation for Efficient
Semantic Segmentation
- URL: http://arxiv.org/abs/2312.04168v1
- Date: Thu, 7 Dec 2023 09:37:28 GMT
- Title: Augmentation-Free Dense Contrastive Knowledge Distillation for Efficient
Semantic Segmentation
- Authors: Jiawei Fan, Chao Li, Xiaolong Liu, Meina Song, Anbang Yao
- Abstract summary: Augmentation-free Dense Contrastive Knowledge Distillation (Af-DCD) is a new contrastive distillation learning paradigm.
Af-DCD trains compact and accurate deep neural networks for semantic segmentation applications.
- Score: 16.957139277317005
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, knowledge distillation methods based on contrastive learning
have achieved promising results on image classification and object detection
tasks. However, semantic segmentation has received less attention in this line
of research. Existing methods rely heavily on data augmentation and memory
buffers, which demand substantial computational resources when applied to
semantic segmentation, a task that must preserve high-resolution feature maps
for dense pixel-wise prediction. To address this problem, we present
Augmentation-free Dense Contrastive
Knowledge Distillation (Af-DCD), a new contrastive distillation learning
paradigm to train compact and accurate deep neural networks for semantic
segmentation applications. Af-DCD leverages a masked feature mimicking
strategy, and formulates a novel contrastive learning loss by exploiting
careful feature partitions across both channel and spatial dimensions.
This allows it to effectively transfer dense and structured local knowledge learnt by
the teacher model to a target student model while maintaining training
efficiency. Extensive experiments on five mainstream benchmarks with various
teacher-student network pairs demonstrate the effectiveness of our approach.
For instance, the DeepLabV3-Res18|DeepLabV3-MBV2 model trained by Af-DCD
reaches 77.03%|76.38% mIOU on the Cityscapes dataset when choosing DeepLabV3-Res101
as the teacher, setting new performance records. Besides that, Af-DCD achieves
an absolute mIOU improvement of 3.26%|3.04%|2.75%|2.30%|1.42% compared with
the individually trained counterpart on Cityscapes|Pascal
VOC|Camvid|ADE20K|COCO-Stuff-164K. Code is available at
https://github.com/OSVAI/Af-DCD
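The abstract mentions a contrastive loss built on feature partitions along the channel dimension. As a rough illustration only, the sketch below shows one way such a loss could look. It is not the authors' implementation (see the repository above for that). The function names `split_channels` and `channel_contrast_loss`, the InfoNCE-style form, the use of cosine similarity, and the parameters `groups` and `tau` are all assumptions made for this example. Each student channel group is pulled toward the teacher group at the same pixel and the same group index, while the teacher's other channel groups at that pixel act as negatives, so no augmented views or memory buffer are needed.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def split_channels(vec, groups):
    """Partition a feature vector into `groups` contiguous channel groups.

    Assumes len(vec) is divisible by `groups` (hypothetical helper).
    """
    size = len(vec) // groups
    return [vec[i * size:(i + 1) * size] for i in range(groups)]


def channel_contrast_loss(student, teacher, groups=2, tau=0.5):
    """InfoNCE-style loss over channel groups, averaged over pixels and groups.

    student, teacher: lists of per-pixel feature vectors with matching shapes.
    Positive pair: student group g vs. teacher group g at the same pixel.
    Negatives: the teacher's other channel groups at that pixel.
    """
    total, count = 0.0, 0
    for s_vec, t_vec in zip(student, teacher):
        s_parts = split_channels(s_vec, groups)
        t_parts = split_channels(t_vec, groups)
        for g, s_part in enumerate(s_parts):
            logits = [cosine(s_part, t_part) / tau for t_part in t_parts]
            # Numerically stable log-sum-exp for the softmax denominator.
            peak = max(logits)
            log_denom = peak + math.log(sum(math.exp(x - peak) for x in logits))
            total += log_denom - logits[g]
            count += 1
    return total / count


if __name__ == "__main__":
    teacher = [[1.0, 0.0, 0.0, 1.0]]          # one pixel, four channels
    aligned = channel_contrast_loss(teacher, teacher)
    swapped = channel_contrast_loss([[0.0, 1.0, 1.0, 0.0]], teacher)
    print(aligned < swapped)                   # aligned features score lower
```

In this toy setup, a student whose channel groups match the teacher's gets a lower loss than one whose groups are swapped. A practical version would run on batched tensors and would also need the spatial partitions and masked feature mimicking that the abstract describes, which this sketch omits.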
Related papers
- I2CKD: Intra- and Inter-Class Knowledge Distillation for Semantic Segmentation [1.433758865948252]
This paper proposes a new knowledge distillation method tailored for image semantic segmentation, termed Intra- and Inter-Class Knowledge Distillation (I2CKD).
The method focuses on capturing and transferring knowledge between the intermediate layers of the teacher (cumbersome model) and the student (compact model).
arXiv Detail & Related papers (2024-03-27T12:05:22Z)
- Class-Imbalanced Semi-Supervised Learning for Large-Scale Point Cloud
Semantic Segmentation via Decoupling Optimization [64.36097398869774]
Semi-supervised learning (SSL) has been an active research topic for large-scale 3D scene understanding.
The existing SSL-based methods suffer from severe training bias due to class imbalance and long-tail distributions of the point cloud data.
We introduce a new decoupling optimization framework, which disentangles feature representation learning and classifier learning through alternating optimization to effectively shift the biased decision boundary.
arXiv Detail & Related papers (2024-01-13T04:16:40Z)
- 2D Feature Distillation for Weakly- and Semi-Supervised 3D Semantic
Segmentation [92.17700318483745]
We propose an image-guidance network (IGNet) which builds upon the idea of distilling high-level feature information from a domain-adapted, synthetically trained 2D semantic segmentation network.
IGNet achieves state-of-the-art results for weakly-supervised LiDAR semantic segmentation on ScribbleKITTI, boasting up to 98% relative performance to fully supervised training with only 8% labeled points.
arXiv Detail & Related papers (2023-11-27T07:57:29Z)
- AICSD: Adaptive Inter-Class Similarity Distillation for Semantic
Segmentation [12.92102548320001]
This paper proposes a novel method called Inter-Class Similarity Distillation (ICSD) for the purpose of knowledge distillation.
The proposed method transfers high-order relations from the teacher network to the student network by independently computing intra-class distributions for each class from network outputs.
Experiments conducted on two well-known datasets for semantic segmentation, Cityscapes and Pascal VOC 2012, validate the effectiveness of the proposed method.
arXiv Detail & Related papers (2023-08-08T13:17:20Z)
- Foundation Model Drives Weakly Incremental Learning for Semantic
Segmentation [12.362400851574872]
Weakly incremental learning for semantic segmentation (WILSS) is a novel and attractive task.
We propose a novel and data-efficient framework for WILSS, named FMWISS.
arXiv Detail & Related papers (2023-02-28T02:21:42Z)
- Point-to-Voxel Knowledge Distillation for LiDAR Semantic Segmentation [74.67594286008317]
This article addresses the problem of distilling knowledge from a large teacher model to a slim student network for LiDAR semantic segmentation.
We propose the Point-to-Voxel Knowledge Distillation (PVD), which transfers the hidden knowledge from both point level and voxel level.
arXiv Detail & Related papers (2022-06-05T05:28:32Z)
- Distilling Inter-Class Distance for Semantic Segmentation [17.76592932725305]
We propose an Inter-class Distance Distillation (IDD) method to transfer the inter-class distance in the feature space from the teacher network to the student network.
Our method helps improve the accuracy of semantic segmentation models and achieves state-of-the-art performance.
arXiv Detail & Related papers (2022-05-07T13:13:55Z)
- G-DetKD: Towards General Distillation Framework for Object Detectors via
Contrastive and Semantic-guided Feature Imitation [49.421099172544196]
We propose a novel semantic-guided feature imitation technique, which automatically performs soft matching between feature pairs across all pyramid levels.
We also introduce contrastive distillation to effectively capture the information encoded in the relationship between different feature regions.
Our method consistently outperforms the existing detection KD techniques, and works when components in the framework are used separately and in conjunction.
arXiv Detail & Related papers (2021-08-17T07:44:27Z)
- Dense Contrastive Learning for Self-Supervised Visual Pre-Training [102.15325936477362]
We present dense contrastive learning, which implements self-supervised learning by optimizing a pairwise contrastive (dis)similarity loss at the pixel level between two views of input images.
Compared to the baseline method MoCo-v2, our method introduces negligible computation overhead (only 1% slower).
arXiv Detail & Related papers (2020-11-18T08:42:32Z)
- Deep Semi-supervised Knowledge Distillation for Overlapping Cervical
Cell Instance Segmentation [54.49894381464853]
We propose to leverage both labeled and unlabeled data for instance segmentation with improved accuracy by knowledge distillation.
We propose a novel Mask-guided Mean Teacher framework with Perturbation-sensitive Sample Mining.
Experiments show that the proposed method improves the performance significantly compared with the supervised method learned from labeled data only.
arXiv Detail & Related papers (2020-07-21T13:27:09Z)