MMDS: A Multimodal Medical Diagnosis System Integrating Image Analysis and Knowledge-based Departmental Consultation
- URL: http://arxiv.org/abs/2410.15403v2
- Date: Mon, 25 Nov 2024 12:48:23 GMT
- Title: MMDS: A Multimodal Medical Diagnosis System Integrating Image Analysis and Knowledge-based Departmental Consultation
- Authors: Yi Ren, HanZhi Zhang, Weibin Li, Jun Fu, Diandong Liu, Tianyi Zhang, Jie He, Licheng Jiao
- Abstract summary: MMDS is a system capable of recognizing medical images and patient facial details.
The first component is the analysis of medical images and videos.
The second component is the generation of professional medical responses.
- Score: 44.16528320070089
- Abstract: We present MMDS, a system capable of recognizing medical images and patient facial details, and providing professional medical diagnoses. The system consists of two core components. The first component is the analysis of medical images and videos. We trained a specialized multimodal medical model capable of interpreting medical images and accurately analyzing patients' facial emotions and facial paralysis conditions. The model achieved an accuracy of 72.59% on the FER2013 facial emotion recognition dataset, with a 91.1% accuracy in recognizing the "happy" emotion. In facial paralysis recognition, the model reached an accuracy of 92%, which is 30% higher than that of GPT-4o. Based on this model, we developed a parser for analyzing facial movement videos of patients with facial paralysis, achieving precise grading of the paralysis severity. In tests on 30 videos of facial paralysis patients, the system demonstrated a grading accuracy of 83.3%. The second component is the generation of professional medical responses. We employed a large language model, integrated with a medical knowledge base, to generate professional diagnoses based on the analysis of medical images or videos. The core innovation lies in our development of a department-specific knowledge base routing management mechanism, in which the large language model categorizes data by medical departments and, during the retrieval process, determines the appropriate knowledge base to query. This significantly improves retrieval accuracy in the RAG (retrieval-augmented generation) process.
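The department-specific routing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the paper uses a large language model to choose the department, whereas this sketch substitutes a simple keyword match so it stays self-contained. The names `KnowledgeBase`, `DEPARTMENT_KEYWORDS`, `route_department`, and `retrieve`, as well as the department labels and keywords, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    """Toy per-department document store (hypothetical, for illustration only)."""
    department: str
    documents: list[str] = field(default_factory=list)

    def search(self, query: str, top_k: int = 2) -> list[str]:
        # Rank documents by how many words they share with the query.
        query_words = set(query.lower().split())
        scored = [
            (len(query_words & set(doc.lower().split())), doc)
            for doc in self.documents
        ]
        ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
        return [doc for score, doc in ranked[:top_k] if score > 0]


# Assumption: keyword sets stand in for the LLM-based department classifier.
DEPARTMENT_KEYWORDS = {
    "neurology": {"facial", "paralysis", "nerve"},
    "ophthalmology": {"retina", "fundus", "vision"},
}


def route_department(query: str) -> str:
    """Pick the department whose keywords overlap most with the query."""
    words = set(query.lower().split())
    hits = {dept: len(words & kws) for dept, kws in DEPARTMENT_KEYWORDS.items()}
    best = max(hits, key=hits.get)
    return best if hits[best] > 0 else "general"


def retrieve(query: str, knowledge_bases: dict[str, KnowledgeBase]) -> tuple[str, list[str]]:
    """Route the query first, then search only the selected knowledge base."""
    department = route_department(query)
    kb = knowledge_bases.get(department)
    return department, (kb.search(query) if kb else [])
```

Routing before retrieval means each query is matched only against documents from one department, which is the property the abstract credits for the improved retrieval accuracy.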
Related papers
- Deep Learning Applications in Medical Image Analysis: Advancements, Challenges, and Future Directions [0.0]
Recent breakthroughs in deep learning, a subset of artificial intelligence, have markedly changed the analysis of medical images.
CNNs have demonstrated remarkable proficiency in autonomously learning features from multidimensional medical images.
These models have been utilized across multiple medical disciplines, including pathology, radiology, ophthalmology, and cardiology.
arXiv Detail & Related papers (2024-10-18T02:57:14Z)
- Liver Cancer Knowledge Graph Construction based on dynamic entity replacement and masking strategies RoBERTa-BiLSTM-CRF model [12.467967838229452]
Liver cancer ranks as the fifth most common malignant tumor and the second most fatal in our country.
Early diagnosis is crucial, necessitating that physicians identify liver cancer in patients at the earliest possible stage.
arXiv Detail & Related papers (2024-10-08T07:57:29Z)
- Automated facial recognition system using deep learning for pain assessment in adults with cerebral palsy [0.5242869847419834]
Existing measures, relying on direct observation by caregivers, lack sensitivity and specificity.
Ten neural networks were trained on three pain image databases.
InceptionV3 exhibited promising performance on the CP-PAIN dataset.
arXiv Detail & Related papers (2024-01-22T17:55:16Z)
- A Transformer-based representation-learning model with unified processing of multimodal input for clinical diagnostics [63.106382317917344]
We report a Transformer-based representation-learning model as a clinical diagnostic aid that processes multimodal input in a unified manner.
The unified model outperformed an image-only model and non-unified multimodal diagnosis models in the identification of pulmonary diseases.
arXiv Detail & Related papers (2023-06-01T16:23:47Z)
- Unlocking the Potential of Medical Imaging with ChatGPT's Intelligent Diagnostics [2.8484009470171943]
This article aims to design a decision support system to assist healthcare providers and patients in making decisions about diagnosing, treating, and managing health conditions.
The proposed architecture contains three stages: 1) data collection and labeling, 2) model training, and 3) diagnosis report generation.
The proposed system has the potential to enhance decision-making, reduce costs, and improve the capabilities of healthcare providers.
arXiv Detail & Related papers (2023-05-12T12:52:14Z)
- Segment Anything in Medical Images [21.43661408153244]
We present MedSAM, a foundation model designed for enabling universal medical image segmentation.
The model is developed on a large-scale medical image dataset with 1,570,263 image-mask pairs, covering 10 imaging modalities and over 30 cancer types.
arXiv Detail & Related papers (2023-04-24T17:56:12Z)
- Automated SSIM Regression for Detection and Quantification of Motion Artefacts in Brain MR Images [54.739076152240024]
Motion artefacts in magnetic resonance brain images are a crucial issue.
The assessment of MR image quality is fundamental before proceeding with the clinical diagnosis.
An automated image quality assessment based on the structural similarity index (SSIM) regression has been proposed here.
arXiv Detail & Related papers (2022-06-14T10:16:54Z)
- MIMO: Mutual Integration of Patient Journey and Medical Ontology for Healthcare Representation Learning [49.57261599776167]
We propose an end-to-end robust Transformer-based solution, Mutual Integration of patient journey and Medical Ontology (MIMO) for healthcare representation learning and predictive analytics.
arXiv Detail & Related papers (2021-07-20T07:04:52Z)
- Variational Knowledge Distillation for Disease Classification in Chest X-Rays [102.04931207504173]
We propose variational knowledge distillation (VKD), which is a new probabilistic inference framework for disease classification based on X-rays.
We demonstrate the effectiveness of our method on three public benchmark datasets with paired X-ray images and EHRs.
arXiv Detail & Related papers (2021-03-19T14:13:56Z)
- An Interpretable Multiple-Instance Approach for the Detection of referable Diabetic Retinopathy from Fundus Images [72.94446225783697]
We propose a machine learning system for the detection of referable Diabetic Retinopathy in fundus images.
By extracting local information from image patches and combining it efficiently through an attention mechanism, our system is able to achieve high classification accuracy.
We evaluate our approach on publicly available retinal image datasets, in which it exhibits near state-of-the-art performance.
arXiv Detail & Related papers (2021-03-02T13:14:15Z)