Pediatric Asthma Detection with Google's HeAR Model: An AI-Driven Respiratory Sound Classifier
- URL: http://arxiv.org/abs/2504.20124v1
- Date: Mon, 28 Apr 2025 12:52:17 GMT
- Title: Pediatric Asthma Detection with Google's HeAR Model: An AI-Driven Respiratory Sound Classifier
- Authors: Abul Ehtesham, Saket Kumar, Aditi Singh, Tala Talaei Khoei
- Abstract summary: This work presents an AI-powered diagnostic pipeline to detect early signs of asthma from pediatric respiratory sounds. The SPRSound dataset is used to extract 2-second audio segments labeled as wheeze, crackle, rhonchi, stridor, or normal. The system achieves over 91% accuracy, with strong performance on precision-recall metrics for positive cases.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Early detection of asthma in children is crucial to prevent long-term respiratory complications and reduce emergency interventions. This work presents an AI-powered diagnostic pipeline that leverages Google's Health Acoustic Representations (HeAR) model to detect early signs of asthma from pediatric respiratory sounds. The SPRSound dataset, the first open-access collection of annotated respiratory sounds in children aged 1 month to 18 years, is used to extract 2-second audio segments labeled as wheeze, crackle, rhonchi, stridor, or normal. Each segment is embedded into a 512-dimensional representation using HeAR, a foundation model pretrained on 300 million health-related audio clips, including 100 million cough sounds. Multiple classifiers, including SVM, Random Forest, and MLP, are trained on these embeddings to distinguish between asthma-indicative and normal sounds. The system achieves over 91% accuracy, with strong performance on precision-recall metrics for positive cases. In addition to classification, learned embeddings are visualized using PCA, misclassifications are analyzed through waveform playback, and ROC and confusion matrix insights are provided. This method demonstrates that short, low-resource pediatric recordings, when powered by foundation audio models, can enable fast, noninvasive asthma screening. The approach is especially promising for digital diagnostics in remote or underserved healthcare settings.
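The embedding-to-classifier stage described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: random 512-dimensional vectors stand in for HeAR embeddings (neither the HeAR model nor the SPRSound data is reproduced here), and separability is injected artificially so the classifiers have something to learn.

```python
# Sketch: train SVM, Random Forest, and MLP classifiers on 512-d
# embeddings (one per 2-second segment) to separate asthma-indicative
# from normal sounds. Synthetic vectors stand in for HeAR outputs.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n, dim = 400, 512                      # 512-d embedding per segment
X = rng.normal(size=(n, dim))
y = rng.integers(0, 2, size=n)         # 1 = asthma-indicative, 0 = normal
X[y == 1] += 0.5                       # inject separability for the demo

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

for name, clf in [("SVM", SVC()),
                  ("RandomForest", RandomForestClassifier(random_state=0)),
                  ("MLP", MLPClassifier(max_iter=500, random_state=0))]:
    clf.fit(X_tr, y_tr)
    acc = clf.score(X_te, y_te)
    print(f"{name}: accuracy={acc:.2f}")
```

With real HeAR embeddings the loop would be identical; only the feature matrix changes, which is the appeal of a frozen foundation model as a feature extractor.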
Related papers
- Respiratory Inhaler Sound Event Classification Using Self-Supervised Learning [43.83039192442981]
Asthma is a chronic respiratory condition that affects millions of people worldwide. We adapted the wav2vec 2.0 self-supervised learning model for inhaler sound classification by pre-training and fine-tuning this model on inhaler sounds. The proposed model shows a balanced accuracy of 98% on a dataset collected using a dry powder inhaler and smartwatch device.
arXiv Detail & Related papers (2025-04-15T14:44:47Z)
- Pre-Trained Foundation Model representations to uncover Breathing patterns in Speech [2.935056044470713]
Respiratory rate (RR) is a vital metric that is used to assess the overall health, fitness, and general well-being of an individual.
Existing approaches to measure RR are performed using specialized equipment or training.
Studies have demonstrated that machine learning algorithms can be used to estimate RR using bio-sensor signals as input.
arXiv Detail & Related papers (2024-07-17T21:57:18Z)
- Rene: A Pre-trained Multi-modal Architecture for Auscultation of Respiratory Diseases [5.810320353233697]
We introduce Rene, a pioneering large-scale model tailored for respiratory sound recognition.
Our innovative approach applies a pre-trained speech recognition model to process respiratory sounds.
We have developed a real-time respiratory sound discrimination system utilizing the Rene architecture.
arXiv Detail & Related papers (2024-05-13T03:00:28Z)
- COVID-19 Detection System: A Comparative Analysis of System Performance Based on Acoustic Features of Cough Audio Signals [0.6963971634605796]
This research aims to explore various acoustic features that enhance the performance of machine learning (ML) models in detecting COVID-19 from cough signals.
It investigates the efficacy of three feature extraction techniques, including Mel Frequency Cepstral Coefficients (MFCC), Chroma, and Spectral Contrast features, when applied to two machine learning algorithms, Support Vector Machine (SVM) and Multilayer Perceptron (MLP).
The proposed system provides a practical solution and demonstrates state-of-the-art classification performance, with an AUC of 0.843 on the COUGHVID dataset and 0.953 on the Virufy dataset.
arXiv Detail & Related papers (2023-09-08T08:33:24Z)
- Patch-Mix Contrastive Learning with Audio Spectrogram Transformer on Respiratory Sound Classification [18.56326840619165]
We introduce a novel and effective Patch-Mix Contrastive Learning to distinguish the mixed representations in the latent space. Our method achieves state-of-the-art performance on the ICBHI dataset, outperforming the prior leading score by 4.08%.
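The patch-mixing idea can be illustrated on raw spectrograms. Note this is a simplified sketch under stated assumptions: the actual Patch-Mix method operates on patch embeddings inside an Audio Spectrogram Transformer with a contrastive loss, whereas here random spectrogram patches of one sample are simply swapped with another's.

```python
# Hypothetical sketch of patch mixing: replace a random fraction of
# non-overlapping patches of spectrogram `a` with the matching patches
# of spectrogram `b`. (The real Patch-Mix mixes ViT patch embeddings.)
import numpy as np

rng = np.random.default_rng(0)

def patch_mix(a, b, patch=4, ratio=0.3):
    """Return a copy of `a` with ~`ratio` of its patches taken from `b`."""
    out = a.copy()
    h, w = a.shape
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            if rng.random() < ratio:
                out[i:i + patch, j:j + patch] = b[i:i + patch, j:j + patch]
    return out

spec_a = rng.normal(size=(16, 16))   # toy mel-spectrograms
spec_b = rng.normal(size=(16, 16))
mixed = patch_mix(spec_a, spec_b)
print(mixed.shape)
```

The contrastive objective then has to tell apart which patches came from which source, which is what makes the mixed representations useful for pre-training.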
arXiv Detail & Related papers (2023-05-23T13:04:07Z)
- Deep Feature Learning for Medical Acoustics [78.56998585396421]
The purpose of this paper is to compare different learnables in medical acoustics tasks.
A framework has been implemented to classify human respiratory sounds and heartbeats in two categories, i.e. healthy or affected by pathologies.
arXiv Detail & Related papers (2022-08-05T10:39:37Z)
- Preservation of High Frequency Content for Deep Learning-Based Medical Image Classification [74.84221280249876]
An efficient analysis of large amounts of chest radiographs can aid physicians and radiologists.
We propose a novel Discrete Wavelet Transform (DWT)-based method for the efficient identification and encoding of visual information.
arXiv Detail & Related papers (2022-05-08T15:29:54Z)
- Lung Cancer Lesion Detection in Histopathology Images Using Graph-Based Sparse PCA Network [93.22587316229954]
We propose a graph-based sparse principal component analysis (GS-PCA) network for automated detection of cancerous lesions on histological lung slides stained with hematoxylin and eosin (H&E).
We evaluate the performance of the proposed algorithm on H&E slides obtained from a K-rasG12D lung cancer mouse model using precision/recall rates, F-score, Tanimoto coefficient, and the area under the curve (AUC) of the receiver operating characteristic (ROC).
arXiv Detail & Related papers (2021-10-27T19:28:36Z)
- Detecting COVID-19 from Breathing and Coughing Sounds using Deep Neural Networks [68.8204255655161]
We adapt an ensemble of Convolutional Neural Networks to classify if a speaker is infected with COVID-19 or not.
The ensemble achieves an Unweighted Average Recall (UAR) of 74.9% and an Area Under the ROC Curve (AUC) of 80.7%.
arXiv Detail & Related papers (2020-12-29T01:14:17Z)
- Identification of deep breath while moving forward based on multiple body regions and graph signal analysis [45.62293065676075]
This paper presents an unobtrusive solution that can automatically identify deep breath when a person is walking past the global depth camera.
In validation experiments, the proposed approach outperforms the comparative methods with accuracy, precision, recall, and F1 of 75.5%, 76.2%, 75.0%, and 75.2%, respectively.
arXiv Detail & Related papers (2020-10-20T08:26:50Z)
- Respiratory Sound Classification Using Long-Short Term Memory [62.997667081978825]
This paper examines the difficulties that exist when attempting to perform sound classification as it relates to respiratory disease classification.
An examination on the use of deep learning and long short-term memory networks is performed in order to identify how such a task can be implemented.
arXiv Detail & Related papers (2020-08-06T23:11:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.