Improving snore detection under limited dataset through harmonic/percussive source separation and convolutional neural networks
- URL: http://arxiv.org/abs/2410.23796v1
- Date: Thu, 31 Oct 2024 10:27:48 GMT
- Title: Improving snore detection under limited dataset through harmonic/percussive source separation and convolutional neural networks
- Authors: F. D. Gonzalez-Martinez, J. J. Carabias-Orti, F. J. Canadas-Quesada, N. Ruiz-Reyes, D. Martinez-Munoz, S. Garcia-Galan
- Abstract summary: Snoring is an acoustic biomarker commonly observed in individuals with Obstructive Sleep Apnoea Syndrome (OSAS).
We propose a novel method to differentiate monaural snoring from non-snoring sounds by analyzing the harmonic content of the input sound.
- Score: 0.0
- Abstract: Snoring, an acoustic biomarker commonly observed in individuals with Obstructive Sleep Apnoea Syndrome (OSAS), holds significant potential for diagnosing and monitoring this recognized clinical disorder. Irrespective of snoring types, most snoring instances exhibit identifiable harmonic patterns manifested through distinctive energy distributions over time. In this work, we propose a novel method to differentiate monaural snoring from non-snoring sounds by analyzing the harmonic content of the input sound using harmonic/percussive sound source separation (HPSS). The resulting feature, based on the harmonic spectrogram from HPSS, is employed as input data for conventional neural network architectures, aiming to enhance snoring detection performance even under a limited data learning framework. To evaluate the performance of our proposal, we studied two different scenarios: 1) using a large dataset of snoring and interfering sounds, and 2) using a reduced training set composed of around 1% of the data material. In the former scenario, the proposed HPSS-based feature provides competitive results compared to other input features from the literature. However, the key advantage of the proposed method lies in the superior performance of the harmonic spectrogram derived from HPSS in a limited data learning context. In this particular scenario, using the proposed harmonic feature significantly enhances the performance of all the studied architectures in comparison to the classical input features documented in the existing literature. This finding clearly demonstrates that incorporating harmonic content enables more reliable learning of the essential time-frequency characteristics that are prevalent in most snoring sounds, even in scenarios where the amount of training data is limited.
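As a rough illustration of the HPSS-based feature described in the abstract, the minimal sketch below uses librosa's median-filtering HPSS to derive a harmonic spectrogram from a monaural recording. The file name, sample rate, and STFT settings are illustrative assumptions, not the authors' actual configuration.

```python
# Minimal sketch (not the authors' code): harmonic spectrogram via HPSS.
import numpy as np
import librosa

# Hypothetical monaural snoring clip; sample rate and STFT parameters are assumptions.
y, sr = librosa.load("snore_clip.wav", sr=16000, mono=True)
S = librosa.stft(y, n_fft=1024, hop_length=256)   # complex STFT
H, P = librosa.decompose.hpss(S)                  # harmonic/percussive separation
harmonic_db = librosa.amplitude_to_db(np.abs(H), ref=np.max)

# harmonic_db (frequency x time) is the harmonic spectrogram; fixed-length
# patches of it could then be fed to a CNN-based snore/non-snore classifier.
print(harmonic_db.shape)
```

In practice, the median-filtering margins, patch length, and network architecture would need to follow whatever the paper actually specifies.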
Related papers
- A Lightweight and Real-Time Binaural Speech Enhancement Model with Spatial Cues Preservation [19.384404014248762]
Binaural speech enhancement aims to improve the speech quality and intelligibility of noisy signals received by hearing devices.
Existing methods often suffer from a compromise between noise reduction (NR) capacity and spatial cues preservation (SCP) accuracy.
We present a learning-based lightweight complex convolutional network (LBCCN) which excels in NR by filtering low-frequency bands and keeping the rest.
arXiv Detail & Related papers (2024-09-19T03:52:50Z) - A Deep Learning Approach to Localizing Multi-level Airway Collapse Based on Snoring Sounds [1.165734481380989]
This study investigates the application of machine/deep learning to classify snoring sounds excited at different levels of the upper airway in patients with obstructive sleep apnea (OSA).
The snoring sounds of 39 subjects were analyzed and labeled according to the Velum, Oropharynx, Tongue Base, and Epiglottis (VOTE) classification system.
The ResNet-50, a convolutional neural network (CNN), showed the best overall performance in classifying snoring acoustics.
arXiv Detail & Related papers (2024-08-28T09:30:20Z) - BTS: Bridging Text and Sound Modalities for Metadata-Aided Respiratory Sound Classification [0.0]
We fine-tune a pretrained text-audio multimodal model using free-text descriptions derived from the sound samples' metadata.
Our method achieves state-of-the-art performance on the ICBHI dataset, surpassing the previous best result by a notable margin of 1.17%.
arXiv Detail & Related papers (2024-06-10T20:49:54Z) - Deep Feature Learning for Medical Acoustics [78.56998585396421]
The purpose of this paper is to compare different learnables in medical acoustics tasks.
A framework has been implemented to classify human respiratory sounds and heartbeats into two categories, i.e., healthy or affected by pathologies.
arXiv Detail & Related papers (2022-08-05T10:39:37Z) - Heart Sound Classification Considering Additive Noise and Convolutional
Distortion [2.63046959939306]
Automatic analysis of heart sounds for abnormality detection is faced with the challenges of additive noise and sensor-dependent degradation.
This paper aims to develop methods to address the cardiac abnormality detection problem when both types of distortions are present in the cardiac auscultation sound.
The proposed method paves the way towards developing computer-aided cardiac auscultation systems in noisy environments using low-cost stethoscopes.
arXiv Detail & Related papers (2021-06-03T14:09:04Z) - Sequence-to-sequence Singing Voice Synthesis with Perceptual Entropy
Loss [49.62291237343537]
We propose a Perceptual Entropy (PE) loss derived from a psycho-acoustic hearing model to regularize the network.
With a one-hour open-source singing voice database, we explore the impact of the PE loss on various mainstream sequence-to-sequence models.
arXiv Detail & Related papers (2020-10-22T20:14:59Z) - Capturing scattered discriminative information using a deep architecture
in acoustic scene classification [49.86640645460706]
In this study, we investigate various methods to capture discriminative information and simultaneously mitigate the overfitting problem.
We adopt a max feature map method to replace conventional non-linear activations in a deep neural network.
Two data augmentation methods and two deep architecture modules are further explored to reduce overfitting and sustain the system's discriminative power.
arXiv Detail & Related papers (2020-07-09T08:32:06Z) - RDP-GAN: A R\'enyi-Differential Privacy based Generative Adversarial
Network [75.81653258081435]
Generative adversarial network (GAN) has attracted increasing attention recently owing to its impressive ability to generate realistic samples with high privacy protection.
However, when GANs are applied to sensitive or private training examples, such as medical or financial records, they may still divulge individuals' sensitive and private information.
We propose a Rényi-differentially private GAN (RDP-GAN), which achieves differential privacy (DP) in a GAN by carefully adding random noise to the value of the loss function during training.
arXiv Detail & Related papers (2020-07-04T09:51:02Z) - Sleep Stage Scoring Using Joint Frequency-Temporal and Unsupervised
Features [5.104181562775778]
A number of Automatic Sleep Stage Recognition (ASSR) methods have been proposed.
Most of these methods use temporal-frequency features that have been extracted from the vital signals.
Recently, some ASSR methods have been proposed which use deep neural networks for unsupervised feature extraction.
In this paper, we propose to combine the two ideas and use both temporal-frequency and unsupervised features at the same time.
arXiv Detail & Related papers (2020-04-10T02:00:29Z) - Simultaneous Denoising and Dereverberation Using Deep Embedding Features [64.58693911070228]
We propose a joint training method for simultaneous speech denoising and dereverberation using deep embedding features.
At the denoising stage, the deep clustering (DC) network is leveraged to extract noise-free deep embedding features.
At the dereverberation stage, instead of using the unsupervised K-means clustering algorithm, another neural network is utilized to estimate the anechoic speech.
arXiv Detail & Related papers (2020-04-06T06:34:01Z) - ADRN: Attention-based Deep Residual Network for Hyperspectral Image
Denoising [52.01041506447195]
We propose an attention-based deep residual network to learn a mapping from noisy HSI to the clean one.
Experimental results demonstrate that our proposed ADRN scheme outperforms the state-of-the-art methods both in quantitative and visual evaluations.
arXiv Detail & Related papers (2020-03-04T08:36:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.