Related papers: From Staff Messages to Actionable Insights: A Multi-Stage LLM Classification Framework for Healthcare Analytics

URL: http://arxiv.org/abs/2509.05484v1
Date: Fri, 05 Sep 2025 20:15:52 GMT
Title: From Staff Messages to Actionable Insights: A Multi-Stage LLM Classification Framework for Healthcare Analytics
Authors: Hajar Sakai, Yi-En Tseng, Mohammadsadegh Mikaeili, Joshua Bosire, Franziska Jovin,
Abstract summary: This paper presents a framework that identifies staff message topics and classifies messages by their reasons in a multi-class fashion. The best-performing model was o3, achieving 78.4% weighted F1-score and 79.2% accuracy. The proposed methodology incorporates data security measures and HIPAA compliance requirements essential for healthcare environments.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Hospital call centers serve as the primary contact point for patients within a hospital system. They also generate substantial volumes of staff messages as navigators process patient requests and communicate with the hospital offices following established protocol restrictions and guidelines. This continuously accumulating body of text data can be mined and processed to retrieve insights; however, traditional supervised learning approaches require annotated data, extensive training, and model tuning. Large Language Models (LLMs) offer a paradigm shift toward more computationally efficient methodologies for healthcare analytics. This paper presents a multi-stage LLM-based framework that identifies staff message topics and classifies messages by their reasons in a multi-class fashion. In the process, multiple LLM types, including reasoning, general-purpose, and lightweight models, were evaluated. The best-performing model was o3, achieving 78.4% weighted F1-score and 79.2% accuracy, followed closely by gpt-5 (75.3% weighted F1-score and 76.2% accuracy). The proposed methodology incorporates data security measures and HIPAA compliance requirements essential for healthcare environments. The processed LLM outputs are integrated into a visualization decision support tool that transforms the staff messages into actionable insights accessible to healthcare professionals. This approach enables more efficient utilization of the collected staff messaging data, identifies navigator training opportunities, and supports improved patient experience and care quality.
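The abstract reports weighted F1-score and accuracy for a multi-class task. As a rough illustration of how those two metrics relate, the sketch below computes both from lists of true and predicted labels. The function name and the example labels ("billing", "refill", "appointment") are hypothetical and do not come from the paper; the paper's own evaluation code is not shown in this listing.

```python
from collections import Counter


def weighted_f1_and_accuracy(y_true, y_pred):
    """Return (weighted F1, accuracy) for two equal-length label lists.

    Weighted F1 averages the per-class F1 scores, weighting each class
    by its share of the true labels (its support).
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    total = len(y_true)
    support = Counter(y_true)    # number of true instances per class
    predicted = Counter(y_pred)  # number of predictions per class
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    weighted_f1 = 0.0
    for label, n_true in support.items():
        precision = true_pos[label] / predicted[label] if predicted[label] else 0.0
        recall = true_pos[label] / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        weighted_f1 += f1 * n_true / total

    accuracy = sum(true_pos.values()) / total
    return weighted_f1, accuracy


if __name__ == "__main__":
    # Hypothetical message-reason labels, for illustration only.
    y_true = ["billing", "billing", "refill", "appointment"]
    y_pred = ["billing", "refill", "refill", "appointment"]
    f1, acc = weighted_f1_and_accuracy(y_true, y_pred)
    print(f"weighted F1 = {f1:.3f}, accuracy = {acc:.3f}")
```

Because weighted F1 weights each class by its support, it can differ from accuracy when precision and recall diverge on frequent classes, which is consistent with the abstract reporting the two numbers separately.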

Related papers

A Federated and Parameter-Efficient Framework for Large Language Model Training in Medicine [59.78991974851707]
Large language models (LLMs) have demonstrated strong performance on medical benchmarks, including question answering and diagnosis. Most medical LLMs are trained on data from a single institution, which limits their generalizability and safety in heterogeneous systems. We introduce a model-agnostic and parameter-efficient federated learning framework for adapting LLMs to medical applications.
arXiv Detail & Related papers (2026-01-29T18:48:21Z)
WiNGPT-3.0 Technical Report [8.679917766554723]
Current Large Language Models (LLMs) exhibit significant limitations, notably in structured, interpretable, and verifiable medical reasoning. This report focuses on the development of WiNGPT-3.0, a 32-billion-parameter LLM engineered to enhance its capacity for medical reasoning.
arXiv Detail & Related papers (2025-05-23T01:53:04Z)
Performance of Large Language Models in Supporting Medical Diagnosis and Treatment [0.0]
AI-driven systems can analyze vast datasets, assisting clinicians in identifying diseases, recommending treatments, and predicting patient outcomes. This study evaluates the performance of a range of contemporary LLMs, including both open-source and closed-source models, on the 2024 Portuguese National Exam for medical specialty access.
arXiv Detail & Related papers (2025-04-14T16:53:59Z)
Structured Outputs Enable General-Purpose LLMs to be Medical Experts [50.02627258858336]
Large language models (LLMs) often struggle with open-ended medical questions. We propose a novel approach utilizing structured medical reasoning. Our approach achieves the highest Factuality Score of 85.8, surpassing fine-tuned models.
arXiv Detail & Related papers (2025-03-05T05:24:55Z)
Demystifying Large Language Models for Medicine: A Primer [50.83806796466396]
Large language models (LLMs) represent a transformative class of AI tools capable of revolutionizing various aspects of healthcare. This tutorial aims to equip healthcare professionals with the tools necessary to effectively integrate LLMs into clinical practice.
arXiv Detail & Related papers (2024-10-24T15:41:56Z)
Developing Healthcare Language Model Embedding Spaces [0.20971479389679337]
Pre-trained Large Language Models (LLMs) often struggle on out-of-domain datasets such as healthcare-focused text. Three methods are assessed: traditional masked language modeling, Deep Contrastive Learning for Unsupervised Textual Representations (DeCLUTR), and a novel pre-training objective utilizing metadata categories from the healthcare settings. Contrastively trained models outperform other approaches on the classification tasks, delivering strong performance from limited labeled data and with fewer model parameter updates required.
arXiv Detail & Related papers (2024-03-28T19:31:32Z)
LLMs Accelerate Annotation for Medical Information Extraction [7.743388571513413]
We propose an approach that combines Large Language Models (LLMs) with human expertise to create an efficient method for generating ground truth labels for medical text annotation. We rigorously evaluate our method on a medical information extraction task, demonstrating that our approach not only substantially cuts down on human intervention but also maintains high accuracy.
arXiv Detail & Related papers (2023-12-04T19:26:13Z)
Adapted Large Language Models Can Outperform Medical Experts in Clinical Text Summarization [8.456700096020601]
Large language models (LLMs) have shown promise in natural language processing (NLP), but their effectiveness on a diverse range of clinical summarization tasks remains unproven. In this study, we apply adaptation methods to eight LLMs, spanning four distinct clinical summarization tasks. A clinical reader study with ten physicians evaluates summary completeness, correctness, and conciseness; in a majority of cases, summaries from our best adapted LLMs are either equivalent (45%) or superior (36%) compared to summaries from medical experts.
arXiv Detail & Related papers (2023-09-14T05:15:01Z)
MedAlign: A Clinician-Generated Dataset for Instruction Following with Electronic Medical Records [60.35217378132709]
Large language models (LLMs) can follow natural language instructions with human-level fluency. However, evaluating LLMs on realistic text generation tasks for healthcare remains challenging. We introduce MedAlign, a benchmark dataset of 983 natural language instructions for EHR data.
arXiv Detail & Related papers (2023-08-27T12:24:39Z)
Large Language Models for Healthcare Data Augmentation: An Example on Patient-Trial Matching [49.78442796596806]
We propose an innovative privacy-aware data augmentation approach for patient-trial matching (LLM-PTM). Our experiments demonstrate a 7.32% average improvement in performance using the proposed LLM-PTM method, and the generalizability to new data is improved by 12.12%.
arXiv Detail & Related papers (2023-03-24T03:14:00Z)
Active learning for medical code assignment [55.99831806138029]
We demonstrate the effectiveness of Active Learning (AL) in multi-label text classification in the clinical domain. We apply a set of well-known AL methods to help automatically assign ICD-9 codes on the MIMIC-III dataset. Our results show that the selection of informative instances provides satisfactory classification with a significantly reduced training set.
arXiv Detail & Related papers (2021-04-12T18:11:17Z)
