Symbolic Audio Classification via Modal Decision Tree Learning
- URL: http://arxiv.org/abs/2503.17018v1
- Date: Fri, 21 Mar 2025 10:27:16 GMT
- Title: Symbolic Audio Classification via Modal Decision Tree Learning
- Authors: Enrico Marzano, Giovanni Pagliarini, Riccardo Pasini, Guido Sciavicco, Ionel Eduard Stan
- Abstract summary: In this work, we consider several audio tasks, namely, age and gender recognition, emotion classification, and respiratory disease diagnosis. We approach them with a symbolic technique, that is, (modal) decision tree learning. We prove that such tasks can be solved using the same symbolic pipeline, which allows us to extract simple rules with very high accuracy and low complexity.
- Score: 0.5592394503914488
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: The range of potential applications of acoustic analysis is wide. Classification of sounds, in particular, is a typical machine learning task that has received a lot of attention in recent years. The most common approaches to sound classification are sub-symbolic, typically based on neural networks, and result in black-box models with high performance but very low transparency. In this work, we consider several audio tasks, namely, age and gender recognition, emotion classification, and respiratory disease diagnosis, and we approach them with a symbolic technique, that is, (modal) decision tree learning. We prove that such tasks can be solved using the same symbolic pipeline, which allows us to extract simple rules with very high accuracy and low complexity. In principle, all such tasks could be integrated into an autonomous conversation system, which could be useful in different contexts, such as an automatic reservation agent for a hospital or a clinic.
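The abstract does not detail the pipeline itself, so the following is an illustration only: a minimal depth-1 (propositional) decision tree learner that extracts a single human-readable rule from audio-style features. The feature names, the toy data, and the depth-1 simplification are all assumptions for illustration; the paper's actual method uses modal decision trees.

```python
# Illustrative sketch only: a depth-1 (propositional) decision tree learner,
# a simplified stand-in for the modal decision trees used in the paper.
# Feature names and values below are hypothetical, not from the paper's data.

def gini(labels):
    """Gini impurity of a list of binary labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_stump(samples):
    """Return the (feature_index, threshold) split minimising weighted Gini."""
    n_features = len(samples[0][0])
    best = None  # (impurity, feature_index, threshold)
    for f in range(n_features):
        values = sorted({x[f] for x, _ in samples})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for x, y in samples if x[f] <= t]
            right = [y for x, y in samples if x[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(samples)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best[1], best[2]

# Toy dataset: ([mean_mfcc1, zero_crossing_rate], class) -- invented numbers.
data = [([12.0, 0.05], 0), ([11.5, 0.07], 0),
        ([25.0, 0.21], 1), ([27.3, 0.19], 1)]
feature_names = ["mean_mfcc1", "zero_crossing_rate"]

feat, thr = best_stump(data)
left_labels = [y for x, y in data if x[feat] <= thr]
left_class = round(sum(left_labels) / len(left_labels))
rule = f"IF {feature_names[feat]} <= {thr} THEN class {left_class} ELSE class {1 - left_class}"
print(rule)  # a single symbolic rule, in the spirit of the paper
```

The point of the sketch is the output format: unlike a neural network, the learned model is a rule a clinician or operator can read and audit directly.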
Related papers
- Heterogeneous sound classification with the Broad Sound Taxonomy and Dataset [6.91815289914328]
This paper explores methodologies for automatically classifying heterogeneous sounds characterized by high intra-class variability.
We construct a dataset through manual annotation to ensure accuracy, diverse representation within each class and relevance in real-world scenarios.
Experimental results illustrate that audio embeddings encoding acoustic and semantic information achieve higher accuracy in the classification task.
arXiv Detail & Related papers (2024-10-01T18:09:02Z)
- Self-supervised Learning for Acoustic Few-Shot Classification [10.180992026994739]
We introduce and evaluate a new architecture that combines CNN-based preprocessing with feature extraction based on state space models (SSMs).
We pre-train this architecture using contrastive learning on the actual task data and then fine-tune it with an extremely small amount of labelled data.
Our evaluation shows that it outperforms state-of-the-art architectures on the few-shot classification problem.
arXiv Detail & Related papers (2024-09-15T07:45:11Z)
- Influence based explainability of brain tumors segmentation in multimodal Magnetic Resonance Imaging [3.1994667952195273]
We focus on the task of medical image segmentation, where most explainability methods proposed so far provide a visual explanation in terms of an input saliency map.
The aim of this work is to extend, implement and test instead an influence-based explainability algorithm, TracIn, proposed originally for classification tasks.
arXiv Detail & Related papers (2024-04-05T17:07:21Z)
- Show from Tell: Audio-Visual Modelling in Clinical Settings [58.88175583465277]
We consider audio-visual modelling in a clinical setting, providing a solution to learn medical representations without human expert annotation.
A simple yet effective multi-modal self-supervised learning framework is proposed for this purpose.
The proposed approach is able to localise anatomical regions of interest during ultrasound imaging, with only speech audio as a reference.
arXiv Detail & Related papers (2023-10-25T08:55:48Z)
- Deep Feature Learning for Medical Acoustics [78.56998585396421]
The purpose of this paper is to compare different learnables in medical acoustics tasks.
A framework has been implemented to classify human respiratory sounds and heartbeats into two categories, i.e., healthy or affected by pathologies.
arXiv Detail & Related papers (2022-08-05T10:39:37Z)
- Toward a realistic model of speech processing in the brain with self-supervised learning [67.7130239674153]
Self-supervised algorithms trained on the raw waveform constitute a promising candidate.
We show that Wav2Vec 2.0 learns brain-like representations with as little as 600 hours of unlabelled speech.
arXiv Detail & Related papers (2022-06-03T17:01:46Z)
- Interpreting deep urban sound classification using Layer-wise Relevance Propagation [5.177947445379688]
This work focuses on the sensitive application of assisting drivers suffering from hearing loss by constructing a deep neural network for urban sound classification.
We use two different representations of audio signals, i.e. Mel and constant-Q spectrograms, while the decisions made by the deep neural network are explained via layer-wise relevance propagation.
Overall, we present an explainable AI framework for understanding deep urban sound classification.
arXiv Detail & Related papers (2021-11-19T14:15:45Z)
- Preliminary study on using vector quantization latent spaces for TTS/VC systems with consistent performance [55.10864476206503]
We investigate the use of quantized vectors to model the latent linguistic embedding.
By enforcing different policies over the latent spaces in the training, we are able to obtain a latent linguistic embedding.
Our experiments show that the voice cloning system built with vector quantization shows only a small degradation in perceptual evaluations.
arXiv Detail & Related papers (2021-06-25T07:51:35Z)
- Respiratory Sound Classification Using Long-Short Term Memory [62.997667081978825]
This paper examines the difficulties that exist when attempting to perform sound classification as it relates to respiratory disease classification.
An examination on the use of deep learning and long short-term memory networks is performed in order to identify how such a task can be implemented.
arXiv Detail & Related papers (2020-08-06T23:11:57Z)
- Towards Efficient Processing and Learning with Spikes: New Approaches for Multi-Spike Learning [59.249322621035056]
We propose two new multi-spike learning rules which demonstrate better performance over other baselines on various tasks.
In the feature detection task, we re-examine the ability of unsupervised STDP and present its limitations.
Our proposed learning rules can reliably solve the task over a wide range of conditions without specific constraints being applied.
arXiv Detail & Related papers (2020-05-02T06:41:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.