An Ensemble Classification Approach in A Multi-Layered Large Language Model Framework for Disease Prediction
- URL: http://arxiv.org/abs/2509.02446v1
- Date: Tue, 02 Sep 2025 15:53:51 GMT
- Title: An Ensemble Classification Approach in A Multi-Layered Large Language Model Framework for Disease Prediction
- Authors: Ali Hamdi, Malak Mohamed, Rokaia Emad, Khaled Shaban
- Abstract summary: Social telehealth has made remarkable progress in healthcare by allowing patients to post symptoms and participate in medical consultations remotely. Users frequently post symptoms on social media and online health platforms, creating a huge repository of medical data. Large language models (LLMs) have demonstrated strong capabilities in processing complex medical text.
- Score: 0.4666493857924357
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Social telehealth has made remarkable progress in healthcare by allowing patients to post symptoms and participate in medical consultations remotely. Users frequently post symptoms on social media and online health platforms, creating a huge repository of medical data that can be leveraged for disease classification. Large language models (LLMs) such as LLAMA3 and GPT-3.5, along with transformer-based models like BERT, have demonstrated strong capabilities in processing complex medical text. In this study, we evaluate three Arabic medical text preprocessing methods: summarization, refinement, and Named Entity Recognition (NER), before applying fine-tuned Arabic transformer models (CAMeLBERT, AraBERT, and AsafayaBERT). To enhance robustness, we adopt a majority voting ensemble that combines predictions from original and preprocessed text representations. This approach achieved the best classification accuracy of 80.56%, showing its effectiveness in leveraging various text representations and model predictions to improve the understanding of medical texts. To the best of our knowledge, this is the first work that integrates LLM-based preprocessing with fine-tuned Arabic transformer models and ensemble learning for disease classification in Arabic social telehealth data.
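The majority voting ensemble described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each text representation (original, summarized, refined, NER-extracted) yields one predicted label per sample, and the final label is the most frequent vote.

```python
from collections import Counter

def majority_vote(predictions_per_view):
    """Combine per-view label predictions by majority vote.

    predictions_per_view: list of lists, one inner list of predicted
    labels per text representation (e.g. original, summarized,
    refined, NER-extracted); all inner lists have the same length.
    """
    n_samples = len(predictions_per_view[0])
    final = []
    for i in range(n_samples):
        votes = Counter(view[i] for view in predictions_per_view)
        # most_common(1) breaks ties by first-seen insertion order
        final.append(votes.most_common(1)[0][0])
    return final

# Hypothetical example: three representations voting over four samples
orig_preds    = ["flu", "cold", "flu", "covid"]
summary_preds = ["flu", "flu",  "flu", "covid"]
ner_preds     = ["cold", "cold", "flu", "flu"]
print(majority_vote([orig_preds, summary_preds, ner_preds]))
# -> ['flu', 'cold', 'flu', 'covid']
```

In the paper's setup, each view's predictions would come from a fine-tuned Arabic transformer (CAMeLBERT, AraBERT, or AsafayaBERT) applied to a differently preprocessed version of the same input text.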
Related papers
- Arabic Large Language Models for Medical Text Generation [0.5483130283061118]
This study proposes an approach that fine-tunes large language models (LLMs) for Arabic medical text generation. The system is designed to assist patients by providing accurate medical advice, diagnoses, drug recommendations, and treatment plans based on user input.
arXiv Detail & Related papers (2025-09-12T09:37:26Z) - MedGemma Technical Report [75.88152277443179]
We introduce MedGemma, a collection of medical vision-language foundation models based on Gemma 3 4B and 27B. MedGemma demonstrates advanced medical understanding and reasoning on images and text. We additionally introduce MedSigLIP, a medically-tuned vision encoder derived from SigLIP.
arXiv Detail & Related papers (2025-07-07T17:01:44Z) - Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning [57.873833577058]
We build a multimodal dataset enriched with extensive medical knowledge. We then introduce our medical-specialized MLLM: Lingshu. Lingshu undergoes multi-stage training to embed medical expertise and enhance its task-solving capabilities.
arXiv Detail & Related papers (2025-06-08T08:47:30Z) - Leveraging large language models and traditional machine learning ensembles for ADHD detection from narrative transcripts [6.55440666066668]
We introduce an ensemble framework for automatically classifying Attention-Deficit/Hyperactivity Disorder (ADHD) diagnosis (binary) using narrative transcripts. Our approach integrates three complementary models: LLaMA3, RoBERTa, and a Support Vector Machine (SVM). Empirical results show that the ensemble outperforms individual models.
arXiv Detail & Related papers (2025-05-27T15:22:01Z) - A Multi-Layered Large Language Model Framework for Disease Prediction [0.0]
Large language models (LLMs) process complex medical data to enhance disease classification. This study explores three Arabic medical text preprocessing techniques, evaluating CAMeL-BERT, AraBERT, and Asafaya-BERT with LoRA.
arXiv Detail & Related papers (2025-01-30T18:53:50Z) - Uncertainty-aware Medical Diagnostic Phrase Identification and Grounding [72.18719355481052]
We introduce a novel task called Medical Report Grounding (MRG). MRG aims to directly identify diagnostic phrases and their corresponding grounding boxes from medical reports in an end-to-end manner. We propose uMedGround, a robust and reliable framework that leverages a multimodal large language model to predict diagnostic phrases.
arXiv Detail & Related papers (2024-04-10T07:41:35Z) - FaMeSumm: Investigating and Improving Faithfulness of Medical Summarization [20.7585913214759]
Current summarization models often produce unfaithful outputs for medical input text.
FaMeSumm is a framework to improve faithfulness by fine-tuning pre-trained language models based on medical knowledge.
arXiv Detail & Related papers (2023-11-03T23:25:53Z) - Customizing General-Purpose Foundation Models for Medical Report Generation [64.31265734687182]
The scarcity of labelled medical image-report pairs presents great challenges in the development of deep and large-scale neural networks.
We propose customizing off-the-shelf general-purpose large-scale pre-trained models, i.e., foundation models (FMs) in computer vision and natural language processing.
arXiv Detail & Related papers (2023-06-09T03:02:36Z) - PMC-LLaMA: Towards Building Open-source Language Models for Medicine [62.39105735933138]
Large Language Models (LLMs) have showcased remarkable capabilities in natural language understanding.
LLMs struggle in domains that require precision, such as medical applications, due to their lack of domain-specific knowledge.
We describe the procedure for building a powerful, open-source language model specifically designed for medical applications, termed PMC-LLaMA.
arXiv Detail & Related papers (2023-04-27T18:29:05Z) - Multi-Modal Perceiver Language Model for Outcome Prediction in Emergency Department [0.03088120935391119]
We are interested in outcome prediction and patient triage in the hospital emergency department based on text information in chief complaints and vital signs recorded at triage.
We adapt Perceiver - a modality-agnostic transformer-based model that has shown promising results in several applications.
In the experimental analysis, we show that multi-modality improves the prediction performance compared with models trained solely on text or vital signs.
arXiv Detail & Related papers (2023-04-03T06:32:00Z) - MedKLIP: Medical Knowledge Enhanced Language-Image Pre-Training in Radiology [40.52487429030841]
We consider enhancing medical visual-language pre-training with domain-specific knowledge, by exploiting the paired image-text reports from daily radiological practice.
First, unlike existing works that directly process the raw reports, we adopt a novel triplet extraction module to extract the medical-related information.
Second, we propose a novel triplet encoding module with entity translation by querying a knowledge base, to exploit the rich domain knowledge in the medical field.
Third, we propose to use a Transformer-based fusion model for spatially aligning the entity description with visual signals at the image patch level, enabling medical diagnosis.
arXiv Detail & Related papers (2023-01-05T18:55:09Z) - Predicting Clinical Diagnosis from Patients Electronic Health Records Using BERT-based Neural Networks [62.9447303059342]
We show the importance of this problem in the medical community.
We present a modification of the Bidirectional Encoder Representations from Transformers (BERT) model for sequence classification.
We use a large-scale Russian EHR dataset consisting of about 4 million unique patient visits.
arXiv Detail & Related papers (2020-07-15T09:22:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.