Low-complexity deep learning frameworks for acoustic scene
classification using teacher-student scheme and multiple spectrograms
- URL: http://arxiv.org/abs/2305.09463v1
- Date: Tue, 16 May 2023 14:21:45 GMT
- Title: Low-complexity deep learning frameworks for acoustic scene
classification using teacher-student scheme and multiple spectrograms
- Authors: Lam Pham, Dat Ngo, Cam Le, Anahid Jalali, Alexander Schindler
- Abstract summary: The proposed system comprises two main phases: (Phase I) training a teacher network and (Phase II) training a student network using knowledge distilled from the teacher.
Our experiments on the DCASE 2023 Task 1 Development dataset fulfill the low-complexity requirement and achieve a best classification accuracy of 57.4%.
- Score: 59.86658316440461
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this technical report, a low-complexity deep learning system for acoustic
scene classification (ASC) is presented. The proposed system comprises two main
phases: (Phase I) training a teacher network and (Phase II) training a student
network using knowledge distilled from the teacher. In the first phase, the
teacher, a large-footprint model, is trained. After training, the embeddings,
i.e., the feature maps of the teacher's second-to-last layer, are extracted. In
the second phase, the student network, a low-complexity model, is trained with
the embeddings extracted from the teacher. Our experiments on the DCASE 2023
Task 1 Development dataset fulfill the low-complexity requirement and achieve a
best classification accuracy of 57.4%, improving on the DCASE baseline by 14.5%.
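A minimal PyTorch-style sketch of the two-phase scheme described above is given below. The architectures, layer sizes, and loss weighting are illustrative assumptions, not the authors' exact configuration; only the overall flow (train a large teacher, extract its second-to-last-layer embeddings, then train a small student against labels and those embeddings) follows the abstract.

```python
# Sketch of the two-phase teacher-student scheme (architectures and loss are assumptions).
import torch
import torch.nn as nn
import torch.nn.functional as F

class Teacher(nn.Module):
    """Large-footprint model; its second-to-last layer output is the embedding."""
    def __init__(self, n_classes=10, emb_dim=512):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, emb_dim), nn.ReLU(),   # second-to-last layer
        )
        self.head = nn.Linear(emb_dim, n_classes)

    def forward(self, x):
        emb = self.backbone(x)                   # embedding used for distillation
        return self.head(emb), emb

class Student(nn.Module):
    """Low-complexity model trained to match the teacher's embeddings."""
    def __init__(self, n_classes=10, emb_dim=512):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, emb_dim),
        )
        self.head = nn.Linear(emb_dim, n_classes)

    def forward(self, x):
        emb = self.backbone(x)
        return self.head(emb), emb

def student_loss(logits_s, emb_s, emb_t, labels, alpha=0.5):
    """Phase II objective: label loss plus an embedding-matching distillation term."""
    return alpha * F.cross_entropy(logits_s, labels) + (1 - alpha) * F.mse_loss(emb_s, emb_t)

# Phase I: train Teacher on (spectrogram, label) pairs, then freeze it.
# Phase II: per batch, extract teacher embeddings and optimize the student:
#   logits_s, emb_s = student(x)
#   with torch.no_grad():
#       _, emb_t = teacher(x)
#   loss = student_loss(logits_s, emb_s, emb_t, y)
```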
Related papers
- Data Efficient Acoustic Scene Classification using Teacher-Informed Confusing Class Instruction [11.15868814062321]
Three systems are introduced to tackle training splits of different sizes.
For small training splits, we explore reducing the complexity of the provided baseline model by lowering the number of base channels, as sketched below.
For the larger training splits, we use FocusNet to provide confusing class information to an ensemble of multiple Patchout faSt Spectrogram Transformer (PaSST) models and baseline models trained at the original 44.1 kHz sampling rate.
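As a toy illustration of shrinking a baseline by scaling its base channel count, a width-scaled CNN is sketched below; the layer layout and width factor are assumptions, not the submission's actual baseline.

```python
import torch.nn as nn

def make_baseline_cnn(base_channels=16, width=0.5, n_classes=10):
    """Toy baseline CNN whose footprint is controlled by a width factor that
    scales the number of base channels (assumed layout, for illustration only)."""
    c = max(1, int(base_channels * width))
    return nn.Sequential(
        nn.Conv2d(1, c, 3, padding=1), nn.BatchNorm2d(c), nn.ReLU(),
        nn.Conv2d(c, 2 * c, 3, padding=1), nn.BatchNorm2d(2 * c), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(2 * c, n_classes),
    )
```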
arXiv Detail & Related papers (2024-09-18T13:16:00Z) - Distantly-Supervised Named Entity Recognition with Adaptive Teacher
Learning and Fine-grained Student Ensemble [56.705249154629264]
Self-training teacher-student frameworks are proposed to improve the robustness of NER models.
In this paper, we propose an adaptive teacher learning approach composed of two teacher-student networks.
Fine-grained student ensemble updates each fragment of the teacher model with a temporal moving average of the corresponding fragment of the student, which enhances consistent predictions on each model fragment against noise.
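A minimal sketch of such a fragment-wise exponential-moving-average update is shown below; the parameter grouping and decay value are illustrative assumptions rather than the paper's exact scheme.

```python
import torch

@torch.no_grad()
def update_teacher_fragment(teacher_params, student_params, decay=0.999):
    """Update one teacher fragment (e.g., one layer group) as a temporal
    moving average of the corresponding student fragment."""
    for p_t, p_s in zip(teacher_params, student_params):
        p_t.mul_(decay).add_(p_s, alpha=1.0 - decay)

# Usage sketch: split both models into matching fragments (here by submodule)
# and refresh each teacher fragment from its student counterpart after each step:
#   for name in fragment_names:
#       update_teacher_fragment(teacher_frags[name].parameters(),
#                               student_frags[name].parameters())
```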
arXiv Detail & Related papers (2022-12-13T12:14:09Z) - Knowledge Distillation Meets Open-Set Semi-Supervised Learning [69.21139647218456]
We propose a novel method dedicated to distilling representational knowledge semantically from a pretrained teacher to a target student.
At the problem level, this establishes an interesting connection between knowledge distillation and open-set semi-supervised learning (SSL).
Our method significantly outperforms previous state-of-the-art knowledge distillation methods on both coarse object classification and fine face recognition tasks.
arXiv Detail & Related papers (2022-05-13T15:15:27Z) - Weakly Supervised Semantic Segmentation via Alternative Self-Dual
Teaching [82.71578668091914]
This paper establishes a compact learning framework that embeds the classification and mask-refinement components into a unified deep model.
We propose a novel alternative self-dual teaching (ASDT) mechanism to encourage high-quality knowledge interaction.
arXiv Detail & Related papers (2021-12-17T11:56:56Z) - Student Helping Teacher: Teacher Evolution via Self-Knowledge
Distillation [20.17325172100031]
We propose a novel student-helping-teacher formula, Teacher Evolution via Self-Knowledge Distillation (TESKD), where the target teacher is learned with the help of multiple hierarchical students that share its structural backbone (a toy sketch of this shared-backbone idea follows below).
The effectiveness of our proposed framework is demonstrated by extensive experiments with various network settings on two standard benchmarks including CIFAR-100 and ImageNet.
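Below is a toy sketch in the spirit of self-distillation on a shared backbone, with an early "student" exit and a final "teacher" exit trained jointly; the architecture and losses are assumptions for illustration and not the TESKD design itself.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedBackboneNet(nn.Module):
    """Toy shared backbone with an early (student) exit and a final (teacher) exit."""
    def __init__(self, n_classes=10):
        super().__init__()
        self.stage1 = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU())
        self.stage2 = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.pool = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.student_head = nn.Linear(16, n_classes)
        self.teacher_head = nn.Linear(32, n_classes)

    def forward(self, x):
        f1 = self.stage1(x)
        f2 = self.stage2(f1)
        return self.student_head(self.pool(f1)), self.teacher_head(self.pool(f2))

def joint_loss(student_logits, teacher_logits, labels, t=4.0):
    """Both exits see the labels; the student also mimics the teacher, and its
    gradients flow back through the shared backbone, which the teacher reuses."""
    ce = F.cross_entropy(teacher_logits, labels) + F.cross_entropy(student_logits, labels)
    kd = F.kl_div(F.log_softmax(student_logits / t, dim=1),
                  F.softmax(teacher_logits.detach() / t, dim=1),
                  reduction="batchmean") * t * t
    return ce + kd
```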
arXiv Detail & Related papers (2021-10-01T11:46:12Z) - On the Efficiency of Subclass Knowledge Distillation in Classification
Tasks [33.1278647424578]
The Subclass Knowledge Distillation (SKD) framework transfers subclass prediction knowledge from a large teacher model into a smaller student one.
The framework is evaluated in a clinical application, namely binary classification of colorectal polyps.
A lightweight, low-complexity student trained with the proposed framework achieves an F1-score of 85.05%, a gain of 2.14% and 1.49% over students trained without it.
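A minimal sketch of a subclass-level distillation loss in this spirit is given below; the subclass count, temperature, and weighting are assumptions, not the SKD paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def subclass_kd_loss(student_sub_logits, teacher_sub_logits, binary_labels,
                     subclass_to_class, temperature=4.0, alpha=0.5):
    """Distill softened subclass predictions from teacher to student while
    still supervising the student's binary (superclass) decision.

    subclass_to_class: LongTensor mapping each subclass index to its class (0/1).
    """
    t = temperature
    # Distillation term: KL between softened subclass distributions.
    kd = F.kl_div(F.log_softmax(student_sub_logits / t, dim=1),
                  F.softmax(teacher_sub_logits / t, dim=1),
                  reduction="batchmean") * (t * t)

    # Superclass probabilities obtained by summing the student's subclass probabilities.
    sub_probs = F.softmax(student_sub_logits, dim=1)
    class_probs = torch.zeros(sub_probs.size(0), 2, device=sub_probs.device)
    class_probs.index_add_(1, subclass_to_class, sub_probs)
    ce = F.nll_loss(torch.log(class_probs + 1e-8), binary_labels)

    return alpha * ce + (1.0 - alpha) * kd
```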
arXiv Detail & Related papers (2021-09-12T19:04:44Z) - Distilling EEG Representations via Capsules for Affective Computing [14.67085109524245]
We propose a novel knowledge distillation pipeline to distill EEG representations via capsule-based architectures.
Our framework consistently enables student networks with different compression ratios to effectively learn from the teacher.
Our method achieves state-of-the-art results on one of the two datasets.
arXiv Detail & Related papers (2021-04-30T22:04:35Z) - Differentiable Feature Aggregation Search for Knowledge Distillation [47.94874193183427]
We introduce feature aggregation to imitate multi-teacher distillation within a single-teacher distillation framework.
DFA is a two-stage Differentiable Feature Aggregation search method motivated by DARTS in neural architecture search.
Experimental results show that DFA outperforms existing methods on CIFAR-100 and CINIC-10 datasets.
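A minimal sketch of differentiable, softmax-weighted aggregation over several teacher feature maps, in the DARTS-style spirit described above, is shown below; the 1x1 projections and weighting scheme are illustrative assumptions.

```python
import torch
import torch.nn as nn

class FeatureAggregator(nn.Module):
    """Aggregate several intermediate teacher feature maps into one distillation
    target using learnable, softmax-normalized weights (a continuous relaxation
    over candidate features, analogous to DARTS architecture weights)."""

    def __init__(self, in_channels, out_channels):
        super().__init__()
        # 1x1 projections bring every candidate feature map to a common channel count.
        self.proj = nn.ModuleList(
            [nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels]
        )
        self.alpha = nn.Parameter(torch.zeros(len(in_channels)))  # aggregation weights

    def forward(self, features):
        # features: list of teacher feature maps, already resized to one spatial size.
        w = torch.softmax(self.alpha, dim=0)
        return sum(w[i] * self.proj[i](f) for i, f in enumerate(features))
```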
arXiv Detail & Related papers (2020-08-02T15:42:29Z) - Device-Robust Acoustic Scene Classification Based on Two-Stage
Categorization and Data Augmentation [63.98724740606457]
We present a joint effort of four groups, namely GT, USTC, Tencent, and UKE, to tackle Task 1 - Acoustic Scene Classification (ASC) in the DCASE 2020 Challenge.
Task 1a focuses on ASC of audio signals recorded with multiple (real and simulated) devices into ten different fine-grained classes.
Task 1b concerns the classification of data into three higher-level classes using low-complexity solutions.
arXiv Detail & Related papers (2020-07-16T15:07:14Z)