Ear-Keeper: Real-time Diagnosis of Ear Lesions Utilizing Ultralight-Ultrafast ConvNet and Large-scale Ear Endoscopic Dataset
- URL: http://arxiv.org/abs/2308.10610v4
- Date: Wed, 10 Apr 2024 08:16:18 GMT
- Title: Ear-Keeper: Real-time Diagnosis of Ear Lesions Utilizing Ultralight-Ultrafast ConvNet and Large-scale Ear Endoscopic Dataset
- Authors: Yubiao Yue, Xinyu Zeng, Xiaoqiang Shi, Meiping Zhang, Fan Zhang, Yunxin Liang, Yan Liu, Zhenzhang Li, Yang Li
- Abstract summary: We propose Best-EarNet, an ultrafast and ultralight network enabling real-time ear disease diagnosis.
With only 0.77M parameters, Best-EarNet achieves an accuracy of 95.23% on an internal test set (22,581 images) and 92.14% on an external test set (1,652 images).
Ear-Keeper, an intelligent diagnosis system based on Best-EarNet, was successfully developed and deployed on common electronic devices.
- Score: 7.5179664143779075
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning-based ear disease diagnosis technology has proven effective and affordable. However, due to the lack of diverse ear endoscope datasets, the practical potential of deep learning models has not been thoroughly studied. Moreover, existing research has failed to achieve a good trade-off between model inference speed and parameter size, rendering models inapplicable in real-world settings. To address these challenges, we constructed the first large-scale ear endoscopic dataset, comprising eight types of ear diseases and disease-free samples from two institutions. Inspired by ShuffleNetV2, we proposed Best-EarNet, an ultrafast and ultralight network enabling real-time ear disease diagnosis. Best-EarNet incorporates a novel Local-Global Spatial Feature Fusion Module and a multi-scale supervision strategy, which help the model focus on global-local information within feature maps at various levels. Utilizing transfer learning, Best-EarNet with only 0.77M parameters achieves accuracies of 95.23% (internal, 22,581 images) and 92.14% (external, 1,652 images). In particular, it achieves an average of 80 frames per second on the CPU. From the perspective of practicality, the proposed Best-EarNet is superior to state-of-the-art backbone models in ear lesion detection tasks. Most importantly, Ear-Keeper, an intelligent diagnosis system based on Best-EarNet, was successfully developed and deployed on common electronic devices (smartphones, tablet computers, and personal computers). In the future, Ear-Keeper has the potential to assist the public and healthcare providers in performing comprehensive, real-time video scanning and diagnosis of the ear canal, thereby promptly detecting ear lesions.
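The abstract does not describe the internals of the Local-Global Spatial Feature Fusion Module. Purely to illustrate the general idea of fusing a pooled global-context signal with locally filtered features, a toy NumPy sketch might look like the following. All function names, the mean-filter local branch, and the sigmoid-gated residual fusion are assumptions for illustration, not the authors' actual architecture.

```python
import numpy as np

def global_branch(x):
    # Global average pooling over spatial dims, broadcast back: (C, H, W) -> (C, H, W)
    return x.mean(axis=(1, 2), keepdims=True) * np.ones_like(x)

def local_branch(x, k=3):
    # Simple k x k mean filter per channel, a crude stand-in for a depthwise conv
    c, h, w = x.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)), mode="edge")
    out = np.zeros_like(x)
    for i in range(h):
        for j in range(w):
            out[:, i, j] = xp[:, i:i + k, j:j + k].mean(axis=(1, 2))
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def local_global_fusion(x):
    # Gate the local features with a global context signal, then add a residual
    gate = sigmoid(global_branch(x))
    return x + gate * local_branch(x)

feat = np.random.default_rng(0).standard_normal((8, 16, 16))
fused = local_global_fusion(feat)
print(fused.shape)  # (8, 16, 16): fusion preserves the feature-map shape
```

Because the fused map has the same shape as its input, a module of this kind can be dropped between stages of a backbone at multiple feature-map levels, which is consistent with the multi-scale supervision the abstract describes.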
Related papers
- A Fully Open and Generalizable Foundation Model for Ultrasound Clinical Applications [77.3888788549565]
We present EchoCare, a novel ultrasound foundation model for generalist clinical use. We developed EchoCare via self-supervised learning on our curated, publicly available, large-scale dataset EchoCareData. With minimal training, EchoCare outperforms state-of-the-art comparison models across 10 representative ultrasound benchmarks.
arXiv Detail & Related papers (2025-09-15T10:05:31Z)
- Unified Multi-task Learning for Voice-Based Detection of Diverse Clinical Conditions [14.745982411183766]
We present MARVEL, a privacy-conscious multitask learning framework that simultaneously detects nine distinct neurological, respiratory, and voice disorders. Our framework consistently outperforms single-modal baselines by 5-19% and surpasses state-of-the-art self-supervised models on 7 of 9 tasks.
arXiv Detail & Related papers (2025-08-28T12:37:25Z)
- UltraEar: a multicentric, large-scale database combining ultra-high-resolution computed tomography and clinical data for ear diseases [28.75872046719716]
UltraEar recruits patients from 11 tertiary hospitals between October 2020 and October 2035. A broad spectrum of otologic disorders is covered, such as otitis media, cholesteatoma, ossicular chain malformation, temporal bone fracture, inner ear malformation, cochlear aperture stenosis, enlarged vestibular aqueduct, and sigmoid sinus bony deficiency.
arXiv Detail & Related papers (2025-08-27T05:56:17Z)
- Deploying and Evaluating Multiple Deep Learning Models on Edge Devices for Diabetic Retinopathy Detection [0.0]
Diabetic Retinopathy (DR) affects approximately 34.6% of diabetes patients globally, with the number of cases projected to reach 242 million by 2045. Traditional DR diagnosis relies on the manual examination of retinal fundus images, which is both time-consuming and resource-intensive. This study presents a novel solution using Edge Impulse to deploy multiple deep learning models for real-time DR detection on edge devices.
arXiv Detail & Related papers (2025-06-14T13:53:45Z)
- Detection of Disease on Nasal Breath Sound by New Lightweight Architecture: Using COVID-19 as An Example [4.618578603062536]
Infectious diseases, particularly COVID-19, continue to be a significant global health issue.
This study aims to develop a novel, lightweight deep neural network for efficient, accurate, and cost-effective detection of COVID-19 using nasal breathing audio data collected via smartphones.
arXiv Detail & Related papers (2025-04-01T12:41:53Z)
- Autonomous AI for Multi-Pathology Detection in Chest X-Rays: A Multi-Site Study in the Indian Healthcare System [0.0]
The study outlines the development of an autonomous AI system for chest X-ray (CXR) interpretation. The system integrates advanced architectures including Vision Transformers, Faster R-CNN, and various U-Net models. It was deployed in 17 major healthcare systems in India, including diagnostic centers, large hospitals, and government hospitals.
arXiv Detail & Related papers (2025-03-28T09:07:17Z)
- Congenital Heart Disease Classification Using Phonocardiograms: A Scalable Screening Tool for Diverse Environments [34.10187730651477]
Congenital heart disease (CHD) is a critical condition that demands early detection. This study presents a deep learning model designed to detect CHD using phonocardiogram (PCG) signals. We evaluated our model on several datasets, including the primary dataset from Bangladesh.
arXiv Detail & Related papers (2025-03-28T05:47:44Z)
- AI-Driven MRI Spine Pathology Detection: A Comprehensive Deep Learning Approach for Automated Diagnosis in Diverse Clinical Settings [0.0]
This study presents the development of an autonomous AI system for MRI spine pathology detection, trained on a dataset of 2 million MRI spine scans. The dataset is balanced across age groups, genders, and scanner manufacturers to ensure robustness and adaptability. The system was deployed across 13 major healthcare enterprises in India, encompassing diagnostic centers, large hospitals, and government facilities.
arXiv Detail & Related papers (2025-03-26T08:33:03Z)
- GONet: A Generalizable Deep Learning Model for Glaucoma Detection [2.0521974107551535]
Glaucomatous optic neuropathy (GON) is a prevalent ocular disease that can lead to irreversible vision loss if not detected early and treated.
Recent deep learning models for automating GON detection from digital fundus images have shown promise but often suffer from limited generalizability.
We introduce GONet, a robust deep learning model developed using seven independent datasets.
arXiv Detail & Related papers (2025-02-26T19:28:09Z)
- Privacy-Preserving Federated Foundation Model for Generalist Ultrasound Artificial Intelligence [83.02106623401885]
We present UltraFedFM, an innovative privacy-preserving ultrasound foundation model.
UltraFedFM is collaboratively pre-trained using federated learning across 16 distributed medical institutions in 9 countries.
It achieves an average area under the receiver operating characteristic curve of 0.927 for disease diagnosis and a dice similarity coefficient of 0.878 for lesion segmentation.
arXiv Detail & Related papers (2024-11-25T13:40:11Z)
- MADE-for-ASD: A Multi-Atlas Deep Ensemble Network for Diagnosing Autism Spectrum Disorder [4.7377709803078325]
This paper bridges the gap between traditional, time-consuming diagnostic methods and potential automated solutions.
We propose a multi-atlas deep ensemble network, MADE-for-ASD, that integrates multiple atlases of the brain's functional magnetic resonance imaging (fMRI) data.
Our approach integrates demographic information into the prediction workflow, which enhances ASD diagnosis performance.
arXiv Detail & Related papers (2024-07-09T17:49:23Z)
- Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation [113.5002649181103]
We train open-source small multimodal models (SMMs) to bridge competency gaps for unmet clinical needs in radiology.
For training, we assemble a large dataset of over 697 thousand radiology image-text pairs.
For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation.
Inference with LLaVA-Rad is fast and can be performed on a single V100 GPU in private settings, offering a promising state-of-the-art tool for real-world clinical applications.
arXiv Detail & Related papers (2024-03-12T18:12:02Z)
- Detecting Speech Abnormalities with a Perceiver-based Sequence Classifier that Leverages a Universal Speech Model [4.503292461488901]
We propose a Perceiver-based sequence classifier to detect abnormalities in speech reflective of several neurological disorders.
We combine this sequence with a Universal Speech Model (USM) that is trained (unsupervised) on 12 million hours of diverse audio recordings.
Our model outperforms standard transformer (80.9%) and perceiver (81.8%) models and achieves an average accuracy of 83.1%.
arXiv Detail & Related papers (2023-10-16T21:07:12Z)
- UniBrain: Universal Brain MRI Diagnosis with Hierarchical Knowledge-enhanced Pre-training [66.16134293168535]
We propose a hierarchical knowledge-enhanced pre-training framework for universal brain MRI diagnosis, termed UniBrain.
Specifically, UniBrain leverages a large-scale dataset of 24,770 imaging-report pairs from routine diagnostics.
arXiv Detail & Related papers (2023-09-13T09:22:49Z)
- The role of noise in denoising models for anomaly detection in medical images [62.0532151156057]
Pathological brain lesions exhibit diverse appearance in brain images.
Unsupervised anomaly detection approaches have been proposed using only normal data for training.
We show that optimization of the spatial resolution and magnitude of the noise improves the performance of different model training regimes.
arXiv Detail & Related papers (2023-01-19T21:39:38Z)
- A Meta-GNN approach to personalized seizure detection and classification [53.906130332172324]
We propose a personalized seizure detection and classification framework that quickly adapts to a specific patient from limited seizure samples.
We train a Meta-GNN based classifier that learns a global model from a set of training patients.
We show that our method outperforms the baselines by reaching 82.7% on accuracy and 82.08% on F1 score after only 20 iterations on new unseen patients.
arXiv Detail & Related papers (2022-11-01T14:12:58Z)
- Exploring linguistic feature and model combination for speech recognition based automatic AD detection [61.91708957996086]
Speech based automatic AD screening systems provide a non-intrusive and more scalable alternative to other clinical screening techniques.
Scarcity of specialist data leads to uncertainty in both model selection and feature learning when developing such systems.
This paper investigates the use of feature and model combination approaches to improve the robustness of domain fine-tuning of BERT and RoBERTa pre-trained text encoders.
arXiv Detail & Related papers (2022-06-28T05:09:01Z)
- Exploiting Cross-domain And Cross-Lingual Ultrasound Tongue Imaging Features For Elderly And Dysarthric Speech Recognition [55.25565305101314]
Articulatory features are invariant to acoustic signal distortion and have been successfully incorporated into automatic speech recognition systems.
This paper presents a cross-domain and cross-lingual A2A inversion approach that utilizes the parallel audio and ultrasound tongue imaging (UTI) data of the 24-hour TaL corpus in A2A model pre-training.
Experiments conducted on three tasks suggested incorporating the generated articulatory features consistently outperformed the baseline TDNN and Conformer ASR systems.
arXiv Detail & Related papers (2022-06-15T07:20:28Z)
- Side-aware Meta-Learning for Cross-Dataset Listener Diagnosis with Subjective Tinnitus [38.66127142638335]
This paper proposes a side-aware meta-learning approach for cross-dataset tinnitus diagnosis.
Our method achieves a high accuracy of 73.8% in the cross-dataset classification.
arXiv Detail & Related papers (2022-05-03T03:17:44Z)
- Multiple Time Series Fusion Based on LSTM: An Application to CAP A Phase Classification Using EEG [56.155331323304]
Deep learning based electroencephalogram channels' feature level fusion is carried out in this work.
Channel selection, fusion, and classification procedures were optimized by two optimization algorithms.
arXiv Detail & Related papers (2021-12-18T14:17:49Z)
- Novel EEG based Schizophrenia Detection with IoMT Framework for Smart Healthcare [0.0]
Schizophrenia(Sz) is a brain disorder that severely affects the thinking, behaviour, and feelings of people all around the world.
EEG is a non-linear time-series signal, and its non-linear structure makes it crucial yet challenging to analyze.
This paper aims to improve the performance of EEG based Sz detection using a deep learning approach.
arXiv Detail & Related papers (2021-11-19T18:21:20Z)
- Detecting COVID-19 from Breathing and Coughing Sounds using Deep Neural Networks [68.8204255655161]
We adapt an ensemble of Convolutional Neural Networks to classify if a speaker is infected with COVID-19 or not.
Ultimately, it achieves an Unweighted Average Recall (UAR) of 74.9%, or an Area Under ROC Curve (AUC) of 80.7% by ensembling neural networks.
arXiv Detail & Related papers (2020-12-29T01:14:17Z)
- UESegNet: Context Aware Unconstrained ROI Segmentation Networks for Ear Biometric [8.187718963808484]
Ear biometrics pose a great level of difficulty in unconstrained environments.
To address the problem of ear localization in the wild, we have proposed two high-performance region of interest (ROI) segmentation models UESegNet-1 and UESegNet-2.
To test the model's generalization, they are evaluated on six different benchmark datasets.
arXiv Detail & Related papers (2020-10-08T14:05:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.