Compressing Quaternion Convolutional Neural Networks for Audio Classification
- URL: http://arxiv.org/abs/2510.21388v1
- Date: Fri, 24 Oct 2025 12:19:19 GMT
- Title: Compressing Quaternion Convolutional Neural Networks for Audio Classification
- Authors: Arshdeep Singh, Vinayak Abrol, Mark D. Plumbley
- Abstract summary: Quaternion Convolutional Neural Networks (QCNNs) capture inter-channel correlations in audio more compactly than conventional CNNs, but at higher computational cost. This study explores knowledge distillation (KD) and pruning to reduce the computational complexity of QCNNs while maintaining performance. Experiments on audio classification reveal that pruning QCNNs achieves similar or superior performance compared to KD while requiring less computational effort.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Conventional Convolutional Neural Networks (CNNs) in the real domain have been widely used for audio classification. However, their convolution operations process multi-channel inputs independently, limiting the ability to capture correlations among channels. This can lead to suboptimal feature learning, particularly for complex audio patterns such as multi-channel spectrogram representations. Quaternion Convolutional Neural Networks (QCNNs) address this limitation by employing quaternion algebra to jointly capture inter-channel dependencies, enabling more compact models with fewer learnable parameters while better exploiting the multi-dimensional nature of audio signals. However, QCNNs exhibit higher computational complexity due to the overhead of quaternion operations, resulting in increased inference latency and reduced efficiency compared to conventional CNNs, posing challenges for deployment on resource-constrained platforms. To address this challenge, this study explores knowledge distillation (KD) and pruning to reduce the computational complexity of QCNNs while maintaining performance. Our experiments on audio classification reveal that pruning QCNNs achieves similar or superior performance compared to KD while requiring less computational effort. Compared to conventional CNNs and Transformer-based architectures, pruned QCNNs achieve competitive performance with a reduced learnable parameter count and computational complexity. On the AudioSet dataset, pruned QCNNs reduce computational cost by 50% and parameter count by 80%, while maintaining performance comparable to conventional CNNs. Furthermore, pruned QCNNs generalize well across multiple audio classification benchmarks, including GTZAN for music genre recognition, ESC-50 for environmental sound classification and RAVDESS for speech emotion recognition.
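The parameter saving attributed to quaternion layers in the abstract follows directly from the algebra: one quaternion weight mixes four real channels at once via the Hamilton product, so mapping 4C real channels to 4C real channels needs a quarter of the free parameters of a real layer. A minimal NumPy sketch (illustrative shapes, not the authors' implementation):

```python
import numpy as np

def hamilton_product(q, p):
    """Hamilton product of quaternion arrays of shape (..., 4),
    components ordered (r, i, j, k)."""
    r1, i1, j1, k1 = np.moveaxis(q, -1, 0)
    r2, i2, j2, k2 = np.moveaxis(p, -1, 0)
    return np.stack([
        r1 * r2 - i1 * i2 - j1 * j2 - k1 * k2,
        r1 * i2 + i1 * r2 + j1 * k2 - k1 * j2,
        r1 * j2 - i1 * k2 + j1 * r2 + k1 * i2,
        r1 * k2 + i1 * j2 - j1 * i2 + k1 * r2,
    ], axis=-1)

def quaternion_linear(x, w):
    """Toy quaternion layer: every output unit mixes all four channels
    of every input unit jointly. x: (in_dim, 4), w: (out_dim, in_dim, 4)."""
    return hamilton_product(w, x[None]).sum(axis=1)

# Sanity check of the algebra: i * j = k.
print(hamilton_product(np.array([0., 1, 0, 0]), np.array([0., 0, 1, 0])))
# -> [0. 0. 0. 1.]

# Parameter count: mapping 4C -> 4C real channels costs (4C)^2 real weights;
# the quaternion version needs C*C quaternions = 4*C^2 real numbers.
C = 8
print(((4 * C) ** 2) // (C * C * 4))  # -> 4
```

The same factor-of-four argument carries over to convolutional layers, where each quaternion kernel is realized as four real kernels shared across the Hamilton-product terms.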
Related papers
- A Comparative Study of Encoding Strategies for Quantum Convolutional Neural Networks
Quantum convolutional neural networks (QCNNs) offer a promising architecture for near-term quantum machine learning. However, any QCNN operating on classical data must rely on an encoding scheme to embed inputs into quantum states. This work presents an implementation-level comparison of three representative encodings: Angle, Amplitude, and a hybrid phase/angle scheme.
arXiv Detail & Related papers (2025-12-14T01:31:16Z)
- Raw Audio Classification with Cosine Convolutional Neural Network (CosCovNN)
This study introduces the Cosine Convolutional Neural Network (CosCovNN), replacing the traditional CNN filters with cosine filters. CosCovNN surpasses the accuracy of equivalent CNN architectures with approximately 77% fewer parameters. The findings show that cosine filters can greatly improve the efficiency and accuracy of CNNs in raw audio classification.
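The saving comes from each filter being described by a few parameters of a cosine function rather than learned tap-by-tap. A sketch with a hypothetical single-frequency parameterization (the exact CosCovNN form may differ):

```python
import numpy as np

def cosine_filter(freq_norm, length):
    """Build a length-`length` filter from one normalized-frequency
    parameter (hypothetical parameterization, for illustration only)."""
    n = np.arange(length)
    return np.cos(np.pi * freq_norm * n)

# A bank of such filters needs one scalar per filter instead of
# `length` free weights per filter.
rng = np.random.default_rng(0)
signal = rng.standard_normal(1024)
out = np.convolve(signal, cosine_filter(0.25, 65), mode="valid")
print(out.shape)  # -> (960,)
```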
arXiv Detail & Related papers (2024-11-30T01:39:16Z)
- Benchmarking Quantum Convolutional Neural Networks for Classification and Data Compression Tasks
Quantum Convolutional Neural Networks (QCNNs) have emerged as promising models for quantum machine learning tasks.
This paper investigates the performance of QCNNs in comparison to the hardware-efficient ansatz (HEA) for classifying the phases of quantum ground states.
arXiv Detail & Related papers (2024-11-20T17:17:09Z)
- Quantum-Trained Convolutional Neural Network for Deepfake Audio Detection
Deepfake technologies pose challenges to privacy, security, and information integrity.
This paper introduces a Quantum-Trained Convolutional Neural Network framework designed to enhance the detection of deepfake audio.
arXiv Detail & Related papers (2024-10-11T20:52:10Z)
- A Quantum Convolutional Neural Network Approach for Object Detection and Classification
The time and accuracy of QCNNs are compared with classical CNNs and ANN models under different conditions.
The analysis shows that QCNNs have the potential to outperform both classical CNNs and ANN models in terms of accuracy and efficiency for certain applications.
arXiv Detail & Related papers (2023-07-17T02:38:04Z)
- Attention-based Feature Compression for CNN Inference Offloading in Edge Computing
This paper studies the computational offloading of CNN inference in device-edge co-inference systems.
We propose a novel autoencoder-based CNN architecture (AECNN) for effective feature extraction at end-device.
Experiments show that AECNN can compress the intermediate data by more than 256x with only about 4% accuracy loss.
arXiv Detail & Related papers (2022-11-24T18:10:01Z)
- Spiking Neural Network Decision Feedback Equalization
We propose an SNN-based equalizer with a feedback structure akin to the decision feedback equalizer (DFE).
We show that our approach clearly outperforms conventional linear equalizers for three different exemplary channels.
The proposed SNN with a decision feedback structure enables the path to competitive energy-efficient transceivers.
arXiv Detail & Related papers (2022-11-09T09:19:15Z)
- Hybrid SNN-ANN: Energy-Efficient Classification and Object Detection for Event-Based Vision
Event-based vision sensors encode local pixel-wise brightness changes in streams of events rather than image frames.
Recent progress in object recognition from event-based sensors has come from conversions of deep neural networks.
We propose a hybrid architecture for end-to-end training of deep neural networks for event-based pattern recognition and object detection.
arXiv Detail & Related papers (2021-12-06T23:45:58Z)
- Learning Frequency-aware Dynamic Network for Efficient Super-Resolution
This paper explores a novel frequency-aware dynamic network for dividing the input into multiple parts according to its coefficients in the discrete cosine transform (DCT) domain.
In practice, the high-frequency part will be processed using expensive operations and the lower-frequency part is assigned with cheap operations to relieve the computation burden.
Experiments conducted on benchmark SISR models and datasets show that the frequency-aware dynamic network can be employed for various SISR neural architectures.
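The split described above can be sketched with a simple index cutoff on DCT coefficients (the actual network's routing criterion may differ; the cutoff rule here is an assumption for illustration):

```python
import numpy as np

def dct2(x):
    """Naive DCT-II of a 1-D signal (O(N^2); fine for a sketch)."""
    n = np.arange(len(x))
    basis = np.cos(np.pi * (n + 0.5) * n[:, None] / len(x))
    return (x * basis).sum(axis=1)

def split_by_frequency(x, cutoff):
    """Route low-index DCT coefficients to a 'cheap' branch and the
    rest to an 'expensive' branch (hypothetical index-cutoff rule)."""
    coeffs = dct2(x)
    return coeffs[:cutoff], coeffs[cutoff:]

# A smooth ramp concentrates its energy in low-frequency coefficients,
# so most of it lands in the cheap branch.
x = np.linspace(0.0, 1.0, 64)
low, high = split_by_frequency(x, cutoff=8)
print((low ** 2).sum() > (high ** 2).sum())  # -> True
```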
arXiv Detail & Related papers (2021-03-15T12:54:26Z)
- Decentralizing Feature Extraction with Quantum Convolutional Neural Network for Automatic Speech Recognition
We build upon a quantum convolutional neural network (QCNN) composed of a quantum circuit encoder for feature extraction.
The input speech is first up-streamed to a quantum computing server to extract the Mel-spectrogram.
The corresponding convolutional features are encoded using a quantum circuit algorithm with random parameters.
The encoded features are then down-streamed to the local RNN model for the final recognition.
arXiv Detail & Related papers (2020-10-26T03:36:01Z)
- Depthwise Separable Convolutions Versus Recurrent Neural Networks for Monaural Singing Voice Separation
We focus on singing voice separation, employing an RNN architecture, and replace the RNNs with depthwise separable (DWS) convolutions (DWS-CNNs).
We conduct an ablation study and examine the effect of the number of channels and layers of DWS-CNNs on the source separation performance.
Our results show that replacing RNNs with DWS-CNNs yields improvements of 1.20, 0.06, and 0.37 dB, respectively, while using only 20.57% of the parameters of the RNN architecture.
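The parameter saving follows from factoring a standard convolution into a per-channel depthwise step and a 1x1 pointwise step. A back-of-the-envelope count (channel sizes are illustrative, not the paper's exact configuration):

```python
def conv_params(c_in, c_out, k):
    """Standard 2-D convolution: every output channel filters every input."""
    return c_in * c_out * k * k

def dws_conv_params(c_in, c_out, k):
    """Depthwise separable: one k x k filter per input channel, then a
    1x1 pointwise convolution to mix channels."""
    return c_in * k * k + c_in * c_out

std = conv_params(64, 128, 3)
dws = dws_conv_params(64, 128, 3)
print(std, dws)            # -> 73728 8768
print(f"{dws / std:.1%}")  # -> 11.9%
```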
arXiv Detail & Related papers (2020-07-06T12:32:34Z)
- Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.