Related papers: Participatory Assessment of Large Language Model Applications in an Academic Medical Center

URL: http://arxiv.org/abs/2501.10366v1
Date: Mon, 09 Dec 2024 21:45:35 GMT
Title: Participatory Assessment of Large Language Model Applications in an Academic Medical Center
Authors: Giorgia Carra, Bogdan Kulynych, François Bastardot, Daniel E. Kaufmann, Noémie Boillat-Blanco, Jean Louis Raisaro
Abstract summary: Large Language Models (LLMs) have shown promising performance in healthcare-related applications. Their deployment in the medical domain poses unique challenges of ethical, regulatory, and technical nature.
Score: 1.244412242301951
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Although Large Language Models (LLMs) have shown promising performance in healthcare-related applications, their deployment in the medical domain poses unique challenges of ethical, regulatory, and technical nature. In this study, we employ a systematic participatory approach to investigate the needs and expectations regarding clinical applications of LLMs at Lausanne University Hospital, an academic medical center in Switzerland. Having identified potential LLM use-cases in collaboration with thirty stakeholders, including clinical staff across 11 departments as well as nursing and patient representatives, we assess the current feasibility of these use-cases taking into account the regulatory frameworks, data protection regulation, bias, hallucinations, and deployment constraints. This study provides a framework for a participatory approach to identifying institutional needs with respect to introducing advanced technologies into healthcare practice, and a realistic analysis of the technology readiness level of LLMs for medical applications, highlighting the issues that would need to be overcome for LLMs in healthcare to be ethical and regulatory compliant.

Related papers

Medical Red Teaming Protocol of Language Models: On the Importance of User Perspectives in Healthcare Settings [51.73411055162861]
We introduce a safety evaluation protocol tailored to the medical domain in both patient user and clinician user perspectives. This is the first work to define safety evaluation criteria for medical LLMs through targeted red-teaming taking three different points of view.
arXiv Detail & Related papers (2025-07-09T19:38:58Z)
Med-CoDE: Medical Critique based Disagreement Evaluation Framework [72.42301910238861]
The reliability and accuracy of large language models (LLMs) in medical contexts remain critical concerns. Current evaluation methods often lack robustness and fail to provide a comprehensive assessment of LLM performance. We propose Med-CoDE, a specifically designed evaluation framework for medical LLMs to address these challenges.
arXiv Detail & Related papers (2025-04-21T16:51:11Z)
A Survey of LLM-based Agents in Medicine: How far are we from Baymax? [44.97640611811786]
Large Language Models (LLMs) are transforming healthcare through the development of LLM-based agents. This survey provides a comprehensive review of LLM-based agents in medicine. We analyze the key components of medical agent systems, including system profiles, clinical planning mechanisms, medical reasoning frameworks, and external capacity enhancement.
arXiv Detail & Related papers (2025-02-16T17:21:05Z)
Large Language Models in Healthcare [4.119811542729794]
Large language models (LLMs) hold promise for transforming healthcare. Their successful integration requires rigorous development, adaptation, and evaluation strategies tailored to clinical needs.
arXiv Detail & Related papers (2025-02-06T20:53:33Z)
Demystifying Large Language Models for Medicine: A Primer [50.83806796466396]
Large language models (LLMs) represent a transformative class of AI tools capable of revolutionizing various aspects of healthcare. This tutorial aims to equip healthcare professionals with the tools necessary to effectively integrate LLMs into clinical practice.
arXiv Detail & Related papers (2024-10-24T15:41:56Z)
CliMedBench: A Large-Scale Chinese Benchmark for Evaluating Medical Large Language Models in Clinical Scenarios [50.032101237019205]
CliMedBench is a comprehensive benchmark with 14 expert-guided core clinical scenarios. The reliability of this benchmark has been confirmed in several ways.
arXiv Detail & Related papers (2024-10-04T15:15:36Z)
Evaluating large language models in medical applications: a survey [1.5923327069574245]
Large language models (LLMs) have emerged as powerful tools with transformative potential across numerous domains. Evaluating the performance of LLMs in medical contexts presents unique challenges due to the complex and critical nature of medical information.
arXiv Detail & Related papers (2024-05-13T05:08:33Z)
Does Biomedical Training Lead to Better Medical Performance? [2.3814275542331385]
Large Language Models (LLMs) are expected to significantly contribute to patient care, diagnostics, and administrative processes. This study investigates the effect of biomedical training in the context of six practical medical tasks evaluating 25 models.
arXiv Detail & Related papers (2024-04-05T12:51:37Z)
Guiding Clinical Reasoning with Large Language Models via Knowledge Seeds [32.99251005719732]
Clinical reasoning refers to the cognitive process that physicians employ in evaluating and managing patients. In this study, we introduce a novel framework, In-Context Padding (ICP), designed to enhance LLMs with medical knowledge.
arXiv Detail & Related papers (2024-03-11T10:53:20Z)
AI Hospital: Benchmarking Large Language Models in a Multi-agent Medical Interaction Simulator [69.51568871044454]
We introduce AI Hospital, a framework simulating dynamic medical interactions between Doctor as player and NPCs. This setup allows for realistic assessments of LLMs in clinical scenarios. We develop the Multi-View Medical Evaluation benchmark, utilizing high-quality Chinese medical records and NPCs.
arXiv Detail & Related papers (2024-02-15T06:46:48Z)
MedBench: A Large-Scale Chinese Benchmark for Evaluating Medical Large Language Models [56.36916128631784]
We introduce MedBench, a comprehensive benchmark for the Chinese medical domain. This benchmark is composed of four key components: the Chinese Medical Licensing Examination, the Resident Standardization Training Examination, the Doctor In-Charge Qualification Examination, and real-world clinic cases. We perform extensive experiments and conduct an in-depth analysis from diverse perspectives.
arXiv Detail & Related papers (2023-12-20T07:01:49Z)
Large Language Models Illuminate a Progressive Pathway to Artificial Healthcare Assistant: A Review [16.008511195589925]
Large language models (LLMs) have shown promising capabilities in mimicking human-level language comprehension and reasoning. This paper provides a comprehensive review on the applications and implications of LLMs in medicine.
arXiv Detail & Related papers (2023-11-03T13:51:36Z)

This list is automatically generated from the titles and abstracts of the papers on this site.