AICSD: Adaptive Inter-Class Similarity Distillation for Semantic
Segmentation
- URL: http://arxiv.org/abs/2308.04243v1
- Date: Tue, 8 Aug 2023 13:17:20 GMT
- Title: AICSD: Adaptive Inter-Class Similarity Distillation for Semantic
Segmentation
- Authors: Amir M. Mansourian, Rozhan Ahmadi, Shohreh Kasaei
- Abstract summary: This paper proposes a novel method called Inter-Class Similarity Distillation (ICSD) for the purpose of knowledge distillation.
The proposed method transfers high-order relations from the teacher network to the student network by independently computing intra-class distributions for each class from network outputs.
Experiments conducted on two well-known datasets for semantic segmentation, Cityscapes and Pascal VOC 2012, validate the effectiveness of the proposed method.
- Score: 12.92102548320001
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, deep neural networks have achieved remarkable accuracy in
computer vision tasks. With inference time being a crucial factor, particularly
in dense prediction tasks such as semantic segmentation, knowledge distillation
has emerged as a successful technique for improving the accuracy of lightweight
student networks. Existing methods often neglect the information in
channels and among different classes. To overcome these limitations, this paper
proposes a novel method called Inter-Class Similarity Distillation (ICSD) for
the purpose of knowledge distillation. The proposed method transfers high-order
relations from the teacher network to the student network by independently
computing intra-class distributions for each class from network outputs. This
is followed by calculating inter-class similarity matrices for distillation
using KL divergence between distributions of each pair of classes. To further
improve the effectiveness of the proposed method, an Adaptive Loss Weighting
(ALW) training strategy is proposed. Unlike existing methods, the ALW strategy
gradually reduces the influence of the teacher network towards the end of the
training process to account for errors in the teacher's predictions. Extensive
experiments conducted on two well-known datasets for semantic segmentation,
Cityscapes and Pascal VOC 2012, validate the effectiveness of the proposed
method in terms of mIoU and pixel accuracy. The proposed method outperforms
most existing knowledge distillation methods, as demonstrated by both
quantitative and qualitative evaluations. Code is available at:
https://github.com/AmirMansurian/AICSD
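
The pipeline described in the abstract (per-class distributions, pairwise KL divergence between classes, then a teacher weight that shrinks during training) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how the per-class distributions are normalized, how the student and teacher similarity matrices are compared, or what schedule ALW follows. Here a spatial softmax over each class channel, a mean squared error between matrices, and a linear decay are assumed, and all function names are hypothetical. The repository linked above is the authoritative reference.

```python
import math


def spatial_softmax(channel):
    """Normalize one class channel (flat list of per-pixel logits) into a
    distribution over pixels. Assumption: the abstract's 'intra-class
    distribution' is obtained this way."""
    m = max(channel)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in channel]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def inter_class_similarity(logits):
    """logits: list of C class channels, each a flat list of H*W values.
    Returns a C x C matrix whose (i, j) entry is KL(P_i || P_j)."""
    dists = [spatial_softmax(ch) for ch in logits]
    return [[kl_divergence(pi, pj) for pj in dists] for pi in dists]


def icsd_loss(student_logits, teacher_logits):
    """Assumed matching term: mean squared difference between the student's
    and the teacher's inter-class similarity matrices."""
    s = inter_class_similarity(student_logits)
    t = inter_class_similarity(teacher_logits)
    n = len(s)
    return sum((s[i][j] - t[i][j]) ** 2
               for i in range(n) for j in range(n)) / (n * n)


def teacher_weight(epoch, total_epochs):
    """Hypothetical ALW schedule: linear decay from 1 at the first epoch to
    0 at the last, so the teacher matters less late in training."""
    return max(0.0, 1.0 - epoch / total_epochs)


def total_loss(task_loss, distill_loss, epoch, total_epochs):
    """Task loss plus the distillation term scaled by the decaying weight."""
    return task_loss + teacher_weight(epoch, total_epochs) * distill_loss


# Toy usage: 3 classes, 4 pixels (a flattened 2x2 prediction map).
teacher = [[0.0, 1.0, 2.0, 3.0], [3.0, 2.0, 1.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
student = [[0.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0], [0.0, 2.0, 0.0, 0.0]]
distill = icsd_loss(student, teacher)
loss_early = total_loss(1.0, distill, epoch=0, total_epochs=10)
loss_late = total_loss(1.0, distill, epoch=9, total_epochs=10)
```

With the same task loss and distillation term, `loss_late` is closer to the plain task loss than `loss_early`, which is the behavior the ALW description calls for. In a real training loop the lists would be replaced by tensors of shape (C, H, W) from a deep learning framework.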
Related papers
- Multi-Granularity Semantic Revision for Large Language Model Distillation [66.03746866578274]
We propose a multi-granularity semantic revision method for LLM distillation.
At the sequence level, we propose a sequence correction and re-generation strategy.
At the token level, we design a distribution adaptive clipping Kullback-Leibler loss as the distillation objective function.
At the span level, we leverage the span priors of a sequence to compute the probability correlations within spans, and constrain the teacher's and student's probability correlations to be consistent.
arXiv Detail & Related papers (2024-07-14T03:51:49Z)
- Contrastive-Adversarial and Diffusion: Exploring pre-training and fine-tuning strategies for sulcal identification [3.0398616939692777]
Techniques like adversarial learning, contrastive learning, diffusion denoising learning, and ordinary reconstruction learning have become standard.
The study aims to elucidate the advantages of pre-training techniques and fine-tuning strategies to enhance the learning process of neural networks.
arXiv Detail & Related papers (2024-05-29T15:44:51Z)
- Uncertainty Estimation by Fisher Information-based Evidential Deep Learning [61.94125052118442]
Uncertainty estimation is a key factor that makes deep learning reliable in practical applications.
We propose a novel method, Fisher Information-based Evidential Deep Learning ($\mathcal{I}$-EDL).
In particular, we introduce Fisher Information Matrix (FIM) to measure the informativeness of evidence carried by each sample, according to which we can dynamically reweight the objective loss terms to make the network more focused on the representation learning of uncertain classes.
arXiv Detail & Related papers (2023-03-03T16:12:59Z)
- Feature-domain Adaptive Contrastive Distillation for Efficient Single Image Super-Resolution [3.2453621806729234]
CNN-based SISR requires numerous parameters and a high computational cost to achieve better performance.
Knowledge Distillation (KD) transfers the teacher's useful knowledge to the student.
We propose a feature-domain adaptive contrastive distillation (FACD) method for efficiently training lightweight student SISR networks.
arXiv Detail & Related papers (2022-11-29T06:24:14Z)
- Interpolation-based Contrastive Learning for Few-Label Semi-Supervised Learning [43.51182049644767]
Semi-supervised learning (SSL) has long been proven to be an effective technique for constructing powerful models with limited labels.
Regularization-based methods, which force perturbed samples to have predictions similar to those of the original ones, have attracted much attention.
We propose a novel contrastive loss to guide the embedding of the learned network to change linearly between samples.
arXiv Detail & Related papers (2022-02-24T06:00:05Z)
- Efficient training of lightweight neural networks using Online Self-Acquired Knowledge Distillation [51.66271681532262]
Online Self-Acquired Knowledge Distillation (OSAKD) is proposed, aiming to improve the performance of any deep neural model in an online manner.
We utilize the k-NN non-parametric density estimation technique to estimate the unknown probability distributions of the data samples in the output feature space.
arXiv Detail & Related papers (2021-08-26T14:01:04Z)
- MCDAL: Maximum Classifier Discrepancy for Active Learning [74.73133545019877]
Recent state-of-the-art active learning methods have mostly leveraged Generative Adversarial Networks (GAN) for sample acquisition.
In this paper, we propose a novel active learning framework that we call Maximum Classifier Discrepancy for Active Learning (MCDAL).
In particular, we utilize two auxiliary classification layers that learn tighter decision boundaries by maximizing the discrepancies among them.
arXiv Detail & Related papers (2021-07-23T06:57:08Z)
- Refine Myself by Teaching Myself: Feature Refinement via Self-Knowledge Distillation [12.097302014936655]
This paper proposes a novel self-knowledge distillation method, Feature Refinement via Self-Knowledge Distillation (FRSKD).
Our proposed method, FRSKD, can utilize both soft label and feature-map distillations for the self-knowledge distillation.
We demonstrate the effectiveness of FRSKD by enumerating its performance improvements in diverse tasks and benchmark datasets.
arXiv Detail & Related papers (2021-03-15T10:59:43Z)
- Multi-head Knowledge Distillation for Model Compression [65.58705111863814]
We propose a simple-to-implement method using auxiliary classifiers at intermediate layers for matching features.
We show that the proposed method outperforms prior relevant approaches presented in the literature.
arXiv Detail & Related papers (2020-12-05T00:49:14Z)
- Deep Semi-supervised Knowledge Distillation for Overlapping Cervical Cell Instance Segmentation [54.49894381464853]
We propose to leverage both labeled and unlabeled data for instance segmentation with improved accuracy by knowledge distillation.
We propose a novel Mask-guided Mean Teacher framework with Perturbation-sensitive Sample Mining.
Experiments show that the proposed method improves the performance significantly compared with the supervised method learned from labeled data only.
arXiv Detail & Related papers (2020-07-21T13:27:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.