The Multicultural Medical Assistant: Can LLMs Improve Medical ASR Errors Across Borders?
- URL: http://arxiv.org/abs/2501.15310v1
- Date: Sat, 25 Jan 2025 19:40:26 GMT
- Title: The Multicultural Medical Assistant: Can LLMs Improve Medical ASR Errors Across Borders?
- Authors: Ayo Adedeji, Mardhiyah Sanni, Emmanuel Ayodele, Sarita Joshi, Tobi Olatunji
- Abstract summary: This study investigates the prevalence and impact of ASR errors in medical transcription in Nigeria, the United Kingdom, and the United States.
We assess the potential and limitations of Large Language Models to address challenges related to accents and medical terminology in ASR.
- Abstract: The global adoption of Large Language Models (LLMs) in healthcare shows promise to enhance clinical workflows and improve patient outcomes. However, Automatic Speech Recognition (ASR) errors in critical medical terms remain a significant challenge. These errors can compromise patient care and safety if not detected. This study investigates the prevalence and impact of ASR errors in medical transcription in Nigeria, the United Kingdom, and the United States. By evaluating raw and LLM-corrected transcriptions of accented English in these regions, we assess the potential and limitations of LLMs to address challenges related to accents and medical terminology in ASR. Our findings highlight significant disparities in ASR accuracy across regions and identify specific conditions under which LLM corrections are most effective.
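The core intervention evaluated here, LLM post-correction of raw ASR output, can be sketched in a few lines. The following is a minimal illustration, not the authors' pipeline: the OpenAI client, the model name, and the prompt wording are our assumptions, and the drug-name confusion in the usage example is invented for demonstration.
```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

SYSTEM_PROMPT = (
    "You are a medical transcription assistant. Correct likely ASR errors "
    "in the transcript, especially misrecognized drug names and clinical "
    "terms. Preserve the speaker's wording otherwise. Return only the "
    "corrected transcript."
)

def correct_transcript(raw_asr_text: str, model: str = "gpt-4o") -> str:
    """Ask an LLM to post-correct a raw ASR transcript (illustrative only)."""
    response = client.chat.completions.create(
        model=model,
        temperature=0,  # deterministic corrections
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": raw_asr_text},
        ],
    )
    return response.choices[0].message.content.strip()

# Example: a plausible accent-driven ASR confusion ("Flagyl" -> "fragile").
print(correct_transcript("Start the patient on fragile 500 milligrams twice daily."))
```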
Related papers
- Fact or Guesswork? Evaluating Large Language Model's Medical Knowledge with Structured One-Hop Judgment [108.55277188617035]
Large language models (LLMs) have been widely adopted in various downstream task domains, but their ability to directly recall and apply factual medical knowledge remains under-explored.
Most existing medical QA benchmarks assess complex reasoning or multi-hop inference, making it difficult to isolate LLMs' inherent medical knowledge from their reasoning capabilities.
We introduce the Medical Knowledge Judgment, a dataset specifically designed to measure LLMs' one-hop factual medical knowledge.
arXiv Detail & Related papers (2025-02-20T05:27:51Z)
- LlaMADRS: Prompting Large Language Models for Interview-Based Depression Assessment [75.44934940580112]
This study introduces LlaMADRS, a novel framework leveraging open-source Large Language Models (LLMs) to automate depression severity assessment.
We employ a zero-shot prompting strategy with carefully designed cues to guide the model in interpreting and scoring transcribed clinical interviews.
Our approach, tested on 236 real-world interviews, demonstrates strong correlations with clinician assessments.
arXiv Detail & Related papers (2025-01-07T08:49:04Z)
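A hedged sketch of what zero-shot severity scoring with designed cues might look like: the rubric text and JSON output format below are our illustrative assumptions, not the LlaMADRS prompts, and `generate` stands in for any text-in/text-out LLM call.
```python
import json
import re

# Illustrative rubric cue for a single MADRS-style item; the wording is an
# assumption, not the prompt used by LlaMADRS.
ITEM_PROMPT = (
    'You are rating the MADRS item "Reported Sadness" on a 0-6 scale '
    "(0 = occasional sadness, 6 = continuous, unrelieved despair).\n"
    'Reply with JSON only: {"score": <0-6>, "evidence": "<short quote>"}.\n\n'
    "Interview excerpt:\n"
)

def score_item(excerpt: str, generate) -> dict:
    """Zero-shot scoring; `generate` is any text-in/text-out LLM callable."""
    raw = generate(ITEM_PROMPT + excerpt)
    match = re.search(r"\{.*\}", raw, flags=re.DOTALL)  # tolerate stray prose
    return json.loads(match.group(0)) if match else {"score": None, "evidence": raw}
```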
- Multi-OphthaLingua: A Multilingual Benchmark for Assessing and Debiasing LLM Ophthalmological QA in LMICs [3.1894617416005855]
Large language models (LLMs) present a promising solution to automate various ophthalmology procedures.
LLMs have demonstrated significantly varied performance across different languages in natural language question-answering tasks.
This study introduces the first multilingual ophthalmological question-answering benchmark with manually curated questions parallel across languages.
arXiv Detail & Related papers (2024-12-18T20:18:03Z)
- Mitigating the Risk of Health Inequity Exacerbated by Large Language Models [5.02540629164568]
We show that incorporating non-decisive sociodemographic factors into the input of large language models can lead to incorrect and harmful outputs.
We introduce EquityGuard, a novel framework designed to detect and mitigate the risk of health inequities in LLM-based medical applications.
arXiv Detail & Related papers (2024-10-07T16:40:21Z)
- Performant ASR Models for Medical Entities in Accented Speech [0.9346027495459037]
We rigorously evaluate multiple ASR models on a clinical English dataset of 93 African accents.
Our analysis reveals that despite some models achieving low overall word error rates (WER), errors in clinical entities are higher, potentially posing substantial risks to patient safety.
arXiv Detail & Related papers (2024-06-18T08:19:48Z)
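Since this entry, and the main paper, lean on word error rate and its medical-concept variant, here is a self-contained sketch of both metrics. The entity function is a deliberately crude stand-in for a medical-concept error rate, assuming a user-supplied set of medical terms; the published MC-WER metric involves transcript alignment that we do not reproduce.
```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate via Levenshtein distance over whitespace tokens."""
    ref, hyp = reference.lower().split(), hypothesis.lower().split()
    # DP table: d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

def medical_entity_error_rate(reference: str, hypothesis: str, entities: set) -> float:
    """Fraction of reference medical terms missing from the hypothesis.
    A crude stand-in for MC-WER; the real metric aligns transcripts first."""
    ref_terms = [t for t in reference.lower().split() if t in entities]
    hyp_tokens = set(hypothesis.lower().split())
    if not ref_terms:
        return 0.0
    return sum(t not in hyp_tokens for t in ref_terms) / len(ref_terms)

ref = "start metronidazole 500 milligrams twice daily"
hyp = "start metro nidazole 500 milligrams twice daily"
print(wer(ref, hyp))                                            # ~0.33 overall
print(medical_entity_error_rate(ref, hyp, {"metronidazole"}))   # 1.0: drug lost
```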
- AI Hospital: Benchmarking Large Language Models in a Multi-agent Medical Interaction Simulator [69.51568871044454]
We introduce AI Hospital, a framework simulating dynamic medical interactions between a Doctor (the player) and NPCs.
This setup allows for realistic assessments of LLMs in clinical scenarios.
We develop the Multi-View Medical Evaluation benchmark, utilizing high-quality Chinese medical records and NPCs.
arXiv Detail & Related papers (2024-02-15T06:46:48Z)
- The Sound of Healthcare: Improving Medical Transcription ASR Accuracy with Large Language Models [0.0]
Large Language Models (LLMs) can enhance the accuracy of Automatic Speech Recognition (ASR) systems in medical transcription.
Our research focuses on improvements in Word Error Rate (WER), Medical Concept WER (MC-WER) for the accurate transcription of essential medical terms, and speaker diarization accuracy.
arXiv Detail & Related papers (2024-02-12T14:01:12Z)
- Evaluating Large Language Models for Radiology Natural Language Processing [68.98847776913381]
The rise of large language models (LLMs) has marked a pivotal shift in the field of natural language processing (NLP).
This study seeks to bridge this gap by critically evaluating thirty-two LLMs in interpreting radiology reports.
arXiv Detail & Related papers (2023-07-25T17:57:18Z)
- Self-Verification Improves Few-Shot Clinical Information Extraction [73.6905567014859]
Large language models (LLMs) have shown the potential to accelerate clinical curation via few-shot in-context learning.
However, they still struggle with accuracy and interpretability, especially in mission-critical domains such as health.
Here, we explore a general mitigation framework using self-verification, which leverages the LLM to provide provenance for its own extraction and check its own outputs.
arXiv Detail & Related papers (2023-05-30T22:05:11Z)
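A minimal sketch of the self-verification idea: extract, then ask the same model to ground each item in the source. The prompts and the UNSUPPORTED sentinel are our illustrative assumptions, not the paper's protocol, and `generate` again stands in for any text-in/text-out LLM call.
```python
import json

def extract_with_self_verification(note: str, generate) -> list:
    """Two-pass sketch: extract entities, then ask the same LLM to verify
    each item against the source note, keeping only grounded results."""
    items = json.loads(generate(
        "List the medications mentioned in this clinical note as a JSON "
        "array of strings.\n\nNote:\n" + note
    ))
    verified = []
    for item in items:
        answer = generate(
            f'Quote the exact sentence from the note that mentions "{item}". '
            "If there is none, reply UNSUPPORTED.\n\nNote:\n" + note
        )
        if "UNSUPPORTED" not in answer:
            verified.append({"entity": item, "evidence": answer.strip()})
    return verified
```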
- Are Large Language Models Ready for Healthcare? A Comparative Study on Clinical Language Understanding [12.128991867050487]
Large language models (LLMs) have made significant progress in various domains, including healthcare.
In this study, we evaluate state-of-the-art LLMs within the realm of clinical language understanding tasks.
arXiv Detail & Related papers (2023-04-09T16:31:47Z)
- SPeC: A Soft Prompt-Based Calibration on Performance Variability of Large Language Model in Clinical Notes Summarization [50.01382938451978]
We introduce a model-agnostic pipeline that employs soft prompts to diminish variance while preserving the advantages of prompt-based summarization.
Experimental findings indicate that our method not only bolsters performance but also effectively curbs variance for various language models.
arXiv Detail & Related papers (2023-03-23T04:47:46Z)
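For readers unfamiliar with soft prompts, the sketch below shows the generic mechanism, learnable vectors prepended to a frozen model's input embeddings, in PyTorch. It does not reproduce SPeC's calibration objective, and the dimensions are arbitrary.
```python
import torch
import torch.nn as nn

class SoftPrompt(nn.Module):
    """Learnable prompt vectors prepended to a frozen LM's input embeddings.
    A generic sketch of soft prompting, not the SPeC pipeline itself."""

    def __init__(self, n_tokens: int, embed_dim: int):
        super().__init__()
        # Small random init; SPeC's calibration objective is not reproduced here.
        self.prompt = nn.Parameter(torch.randn(n_tokens, embed_dim) * 0.02)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # input_embeds: (batch, seq_len, embed_dim)
        batch = input_embeds.size(0)
        prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([prompt, input_embeds], dim=1)

# Usage: embed tokens with the frozen model, prepend the soft prompt, and
# train only `soft_prompt.prompt` on the downstream summarization loss.
soft_prompt = SoftPrompt(n_tokens=20, embed_dim=768)
dummy = torch.zeros(2, 10, 768)
print(soft_prompt(dummy).shape)  # torch.Size([2, 30, 768])
```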
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.