Advancing Complex Medical Communication in Arabic with Sporo AraSum: Surpassing Existing Large Language Models
- URL: http://arxiv.org/abs/2411.13518v1
- Date: Wed, 20 Nov 2024 18:10:19 GMT
- Title: Advancing Complex Medical Communication in Arabic with Sporo AraSum: Surpassing Existing Large Language Models
- Authors: Chanseo Lee, Sonu Kumar, Kimon A. Vogt, Sam Meraj, Antonia Vogt
- Abstract summary: This case study evaluates Sporo AraSum, a language model tailored for Arabic clinical documentation, against JAIS, the leading Arabic NLP model.
Results indicate that Sporo AraSum significantly outperforms JAIS in AI-centric quantitative metrics and all qualitative attributes measured in our modified version of the PDQI-9.
- Abstract: The increasing demand for multilingual capabilities in healthcare underscores the need for AI models adept at processing diverse languages, particularly in clinical documentation and decision-making. Arabic, with its complex morphology, syntax, and diglossia, poses unique challenges for natural language processing (NLP) in medical contexts. This case study evaluates Sporo AraSum, a language model tailored for Arabic clinical documentation, against JAIS, the leading Arabic NLP model. The evaluation used synthetic datasets and PDQI-9 metrics that we modified to assess model performance in a language other than English. The study assessed the models' performance in summarizing patient-physician interactions, focusing on accuracy, comprehensiveness, clinical utility, and linguistic-cultural competence. Results indicate that Sporo AraSum significantly outperforms JAIS in AI-centric quantitative metrics and all qualitative attributes measured in our modified version of the PDQI-9. AraSum's architecture enables precise and culturally sensitive documentation, addressing the linguistic nuances of Arabic while mitigating risks of AI hallucinations. These findings suggest that Sporo AraSum is better suited to meet the demands of Arabic-speaking healthcare environments, offering a transformative solution for multilingual clinical workflows. Future research should incorporate real-world data to further validate these findings and explore broader integration into healthcare systems.
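As a rough illustration of how a PDQI-9-style evaluation can be aggregated, the sketch below averages per-attribute reviewer ratings and combines them into an overall score per model. The attribute names and all ratings are illustrative placeholders, not the study's actual rubric or data.

```python
# Minimal sketch of aggregating modified PDQI-9 ratings for one model.
# Attribute names and scores are hypothetical examples only.
from statistics import mean

PDQI9_ATTRIBUTES = [
    "up_to_date", "accurate", "thorough", "useful", "organized",
    "comprehensible", "succinct", "synthesized", "internally_consistent",
]

def pdqi9_summary(ratings: dict) -> dict:
    """Average the 1-5 rating each attribute received across reviewers."""
    return {attr: mean(scores) for attr, scores in ratings.items()}

def overall_score(summary: dict) -> float:
    """Mean of the per-attribute averages, one simple way to rank models."""
    return mean(summary.values())

# Example with made-up ratings from three reviewers:
ratings = {attr: [4, 5, 4] for attr in PDQI9_ATTRIBUTES}
summary = pdqi9_summary(ratings)
print(round(overall_score(summary), 2))  # prints 4.33
```

Averaging per attribute first keeps the summary interpretable: a model can be compared attribute by attribute (e.g. accuracy vs. succinctness) before collapsing to a single number.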
Related papers
- ELMTEX: Fine-Tuning Large Language Models for Structured Clinical Information Extraction. A Case Study on Clinical Reports [3.0363830583066713]
This paper presents the results of our project, which aims to leverage Large Language Models (LLMs) to extract structured information from unstructured clinical reports.
We developed a workflow with a user interface and evaluated LLMs of varying sizes through prompting strategies and fine-tuning.
Our results show that fine-tuned smaller models match or surpass larger counterparts in performance, offering efficiency for resource-limited settings.
arXiv Detail & Related papers (2025-02-08T16:44:56Z) - Conversation AI Dialog for Medicare powered by Finetuning and Retrieval Augmented Generation [0.0]
Large language models (LLMs) have shown impressive capabilities in natural language processing tasks, including dialogue generation.
This research aims to conduct a novel comparative analysis of two prominent techniques, fine-tuning with LoRA and the Retrieval-Augmented Generation framework.
arXiv Detail & Related papers (2025-02-04T11:50:40Z) - Bridging Language Barriers in Healthcare: A Study on Arabic LLMs [1.2006896500048552]
This paper investigates the challenges of developing large language models proficient in both multilingual understanding and medical knowledge.
We find that larger models with carefully calibrated language ratios achieve superior performance on native-language clinical tasks.
arXiv Detail & Related papers (2025-01-16T20:24:56Z) - A Comprehensive Evaluation of Large Language Models on Mental Illnesses in Arabic Context [0.9074663948713616]
Mental health disorders pose a growing public health concern in the Arab world.
This study comprehensively evaluates 8 large language models (LLMs) on diverse mental health datasets.
arXiv Detail & Related papers (2025-01-12T16:17:25Z) - LLMs-in-the-loop Part-1: Expert Small AI Models for Bio-Medical Text Translation [0.0]
This study introduces a novel "LLMs-in-the-loop" approach to develop supervised neural machine translation models optimized for medical texts.
Custom parallel corpora in six languages were compiled from scientific articles, synthetically generated clinical documents, and medical texts.
Our MarianMT-based models outperform Google Translate, DeepL, and GPT-4-Turbo.
arXiv Detail & Related papers (2024-07-16T19:32:23Z) - Medical Vision-Language Pre-Training for Brain Abnormalities [96.1408455065347]
We show how to automatically collect medical image-text aligned data for pretraining from public resources such as PubMed.
In particular, we present a pipeline that streamlines the pre-training process by initially collecting a large brain image-text dataset.
We also investigate the unique challenge of mapping subfigures to subcaptions in the medical domain.
arXiv Detail & Related papers (2024-04-27T05:03:42Z) - AI Hospital: Benchmarking Large Language Models in a Multi-agent Medical Interaction Simulator [69.51568871044454]
We introduce AI Hospital, a framework simulating dynamic medical interactions between a Doctor, as the player, and NPCs.
This setup allows for realistic assessments of LLMs in clinical scenarios.
We develop the Multi-View Medical Evaluation benchmark, utilizing high-quality Chinese medical records and NPCs.
arXiv Detail & Related papers (2024-02-15T06:46:48Z) - PMC-LLaMA: Towards Building Open-source Language Models for Medicine [62.39105735933138]
Large Language Models (LLMs) have showcased remarkable capabilities in natural language understanding.
LLMs struggle in domains that require precision, such as medical applications, due to their lack of domain-specific knowledge.
We describe the procedure for building a powerful, open-source language model specifically designed for medical applications, termed PMC-LLaMA.
arXiv Detail & Related papers (2023-04-27T18:29:05Z) - CBLUE: A Chinese Biomedical Language Understanding Evaluation Benchmark [51.38557174322772]
We present the first Chinese Biomedical Language Understanding Evaluation benchmark.
It is a collection of natural language understanding tasks including named entity recognition, information extraction, clinical diagnosis normalization, and single-sentence/sentence-pair classification.
We report empirical results for 11 current pre-trained Chinese models; state-of-the-art neural models still perform far below the human ceiling.
arXiv Detail & Related papers (2021-06-15T12:25:30Z) - Benchmarking Automated Clinical Language Simplification: Dataset, Algorithm, and Evaluation [48.87254340298189]
We construct a new dataset named MedLane to support the development and evaluation of automated clinical language simplification approaches.
We propose a new model called DECLARE that follows the human annotation procedure and achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-12-04T06:09:02Z) - Data Augmentation for Spoken Language Understanding via Pretrained Language Models [113.56329266325902]
Training of spoken language understanding (SLU) models often faces the problem of data scarcity.
We put forward a data augmentation method using pretrained language models to boost the variability and accuracy of generated utterances.
arXiv Detail & Related papers (2020-04-29T04:07:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.