Investigation into respiratory sound classification for an imbalanced data set using hybrid LSTM-KAN architectures
- URL: http://arxiv.org/abs/2601.03610v1
- Date: Wed, 07 Jan 2026 05:37:57 GMT
- Title: Investigation into respiratory sound classification for an imbalanced data set using hybrid LSTM-KAN architectures
- Authors: Nithinkumar K., Anand R
- Abstract summary: This study investigates respiratory sound classification with a focus on mitigating pronounced class imbalance. We propose a hybrid deep learning model that combines a Long Short-Term Memory (LSTM) network for sequential feature encoding with a Kolmogorov-Arnold Network (KAN) for classification.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Respiratory sounds captured via auscultation contain critical clues for diagnosing pulmonary conditions. Automated classification of these sounds faces challenges due to subtle acoustic differences and severe class imbalance in clinical datasets. This study investigates respiratory sound classification with a focus on mitigating pronounced class imbalance. We propose a hybrid deep learning model that combines a Long Short-Term Memory (LSTM) network for sequential feature encoding with a Kolmogorov-Arnold Network (KAN) for classification. The model is integrated with a comprehensive feature extraction pipeline and targeted imbalance mitigation strategies. Experiments were conducted on a public respiratory sound database comprising six classes with a highly skewed distribution. Techniques such as focal loss, class-specific data augmentation, and Synthetic Minority Over-sampling Technique (SMOTE) were employed to enhance minority class recognition. The proposed Hybrid LSTM-KAN model achieves an overall accuracy of 94.6 percent and a macro-averaged F1 score of 0.703, despite the dominant COPD class accounting for over 86 percent of the data. Improved detection performance is observed for minority classes compared to baseline approaches, demonstrating the effectiveness of the proposed architecture for imbalanced respiratory sound classification.
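The abstract names focal loss and the macro-averaged F1 score without defining them. As a minimal sketch, assuming the standard focal loss formulation FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t) and the usual unweighted mean of per-class F1 scores, the pure-Python functions below illustrate why accuracy and macro F1 can diverge on skewed data. The function names, the toy labels, and the default gamma of 2.0 are illustrative assumptions, not the authors' implementation or settings.

```python
import math


def focal_loss(probs, target, gamma=2.0, alpha=None):
    """Focal loss for one sample: -alpha_t * (1 - p_t)**gamma * log(p_t).

    probs  : predicted class probabilities (assumed to sum to 1)
    target : index of the true class
    gamma  : focusing parameter; gamma = 0 reduces to cross-entropy
    alpha  : optional per-class weights (hypothetical values, chosen by the user)
    """
    p_t = max(probs[target], 1e-12)  # clamp to avoid log(0)
    weight = 1.0 if alpha is None else alpha[target]
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of per-class F1, so every class counts equally."""
    scores = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # A class that is never predicted and never present scores 0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / num_classes


# Toy data: nine majority-class samples, one minority-class sample,
# and a degenerate model that always predicts the majority class.
y_true = [0] * 9 + [1]
y_pred = [0] * 10
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print("accuracy:", accuracy)
print("macro F1:", macro_f1(y_true, y_pred, num_classes=2))

# Focal loss down-weights a confidently correct prediction
# relative to plain cross-entropy (gamma = 0).
print("cross-entropy:", focal_loss([0.9, 0.1], 0, gamma=0.0))
print("focal loss   :", focal_loss([0.9, 0.1], 0, gamma=2.0))
```

In this toy case accuracy is 0.9 while macro F1 is roughly 0.47, the same kind of gap as the one between the 94.6 percent accuracy and the 0.703 macro F1 reported in the abstract.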
Related papers
- Synthetic Data Augmentation for Medical Audio Classification: A Preliminary Evaluation [0.0]
Medical audio classification remains challenging due to low signal-to-noise ratios, subtle discriminative features, and substantial intra-class variability. Synthetic data augmentation has been proposed as a potential strategy to mitigate these constraints. In this study, we explore the impact of synthetic augmentation on respiratory sound classification using a baseline deep convolutional neural network trained on a moderately imbalanced dataset.
arXiv Detail & Related papers (2026-02-03T00:52:49Z)
- Explainable Multi-Modal Deep Learning for Automatic Detection of Lung Diseases from Respiratory Audio Signals [0.49581497240446293]
This study presents an explainable multimodal deep learning framework for automatic lung-disease detection using respiratory audio signals. The framework incorporates Grad-CAM, Integrated Gradients, and SHAP, generating interpretable spectral, temporal, and feature-level explanations. The findings demonstrate the framework's potential for telemedicine, point-of-care diagnostics, and real-world respiratory screening.
arXiv Detail & Related papers (2025-11-29T17:15:58Z)
- Benchmarking Foundation Models and Parameter-Efficient Fine-Tuning for Prognosis Prediction in Medical Imaging [40.35825564674249]
This study introduces the first structured benchmark to assess the robustness and efficiency of transfer learning strategies for Foundation Models. Four publicly available COVID-19 chest X-ray datasets were used, covering mortality, severity, and admission. CNNs pretrained on ImageNet and FMs pretrained on general or biomedical datasets were adapted using full finetuning, linear probing, and parameter-efficient methods.
arXiv Detail & Related papers (2025-06-23T09:16:04Z)
- CycleGuardian: A Framework for Automatic RespiratorySound classification Based on Improved Deep clustering and Contrastive Learning [9.215130010602634]
Auscultation plays a pivotal role in early respiratory and pulmonary disease diagnosis. Existing state-of-the-art models suffer from excessive parameter size, impeding deployment on resource-constrained mobile platforms. We propose a framework based on improved deep clustering and contrastive learning. We deploy the network on Android devices, showcasing a comprehensive intelligent respiratory sound auscultation system.
arXiv Detail & Related papers (2025-02-02T09:56:47Z)
- Respiratory Disease Classification and Biometric Analysis Using Biosignals from Digital Stethoscopes [3.2458203725405976]
This work presents a novel approach leveraging digital stethoscope technology for automatic respiratory disease classification and biometric analysis.
By leveraging one of the largest publicly available medical databases of respiratory sounds, we train machine learning models to classify various respiratory health conditions.
Our approach achieves high accuracy in both binary classification (89% balanced accuracy for healthy vs. diseased) and multi-class classification (72% balanced accuracy for specific diseases like pneumonia and COPD).
arXiv Detail & Related papers (2023-09-12T23:54:00Z)
- Patch-Mix Contrastive Learning with Audio Spectrogram Transformer on Respiratory Sound Classification [18.56326840619165]
We introduce a novel and effective Patch-Mix Contrastive Learning to distinguish the mixed representations in the latent space. Our method achieves state-of-the-art performance on the ICBHI dataset, improving on the prior leading score by 4.08%.
arXiv Detail & Related papers (2023-05-23T13:04:07Z)
- Successive Subspace Learning for Cardiac Disease Classification with Two-phase Deformation Fields from Cine MRI [36.044984400761535]
This work proposes a lightweight successive subspace learning framework for CVD classification.
It is based on an interpretable feedforward design, in conjunction with a cardiac atlas.
Compared with 3D CNN-based approaches, our framework achieves superior classification performance with 140$\times$ fewer parameters.
arXiv Detail & Related papers (2023-01-21T15:00:59Z)
- Low-complexity deep learning frameworks for acoustic scene classification [64.22762153453175]
We present low-complexity deep learning frameworks for acoustic scene classification (ASC).
The proposed frameworks can be separated into four main steps: front-end spectrogram extraction, online data augmentation, back-end classification, and late fusion of predicted probabilities.
Our experiments on the DCASE 2022 Task 1 Development dataset fulfilled the low-complexity requirement and achieved a best classification accuracy of 60.1%.
arXiv Detail & Related papers (2022-06-13T11:41:39Z)
- Multiple Time Series Fusion Based on LSTM An Application to CAP A Phase Classification Using EEG [56.155331323304]
This work carries out deep-learning-based, feature-level fusion of electroencephalogram channels.
Channel selection, fusion, and classification procedures were optimized by two optimization algorithms.
arXiv Detail & Related papers (2021-12-18T14:17:49Z)
- A Lottery Ticket Hypothesis Framework for Low-Complexity Device-Robust Neural Acoustic Scene Classification [78.04177357888284]
We propose a novel neural model compression strategy combining data augmentation, knowledge transfer, pruning, and quantization for device-robust acoustic scene classification (ASC).
We report an efficient joint framework for low-complexity multi-device ASC, called Acoustic Lottery.
arXiv Detail & Related papers (2021-07-03T16:25:24Z)
- Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
- Capturing scattered discriminative information using a deep architecture in acoustic scene classification [49.86640645460706]
In this study, we investigate various methods to capture discriminative information and simultaneously mitigate the overfitting problem.
We adopt a max feature map method to replace conventional non-linear activations in a deep neural network.
Two data augmentation methods and two deep architecture modules are further explored to reduce overfitting and sustain the system's discriminative power.
arXiv Detail & Related papers (2020-07-09T08:32:06Z)
- Predictive Modeling of ICU Healthcare-Associated Infections from Imbalanced Data. Using Ensembles and a Clustering-Based Undersampling Approach [55.41644538483948]
This work focuses on both the identification of risk factors and the prediction of healthcare-associated infections in intensive-care units.
The aim is to support decision making aimed at reducing the incidence rate of infections.
arXiv Detail & Related papers (2020-05-07T16:13:12Z)
- CNN-MoE based framework for classification of respiratory anomalies and lung disease detection [33.45087488971683]
This paper presents and explores a robust deep learning framework for auscultation analysis.
It aims to classify anomalies in respiratory cycles and detect disease from respiratory sound recordings.
arXiv Detail & Related papers (2020-04-04T21:45:06Z)