HealthSLM-Bench: Benchmarking Small Language Models for Mobile and Wearable Healthcare Monitoring
- URL: http://arxiv.org/abs/2509.07260v4
- Date: Tue, 30 Sep 2025 12:04:14 GMT
- Title: HealthSLM-Bench: Benchmarking Small Language Models for Mobile and Wearable Healthcare Monitoring
- Authors: Xin Wang, Ting Dang, Xinyu Zhang, Vassilis Kostakos, Michael J. Witbrock, Hong Jia
- Abstract summary: Small Language Models (SLMs) are lightweight and designed to run locally and efficiently on mobile and wearable devices. We evaluate SLMs on health prediction tasks using zero-shot, few-shot, and instruction fine-tuning approaches. Our results show that SLMs can achieve performance comparable to large language models while offering substantial gains in efficiency and privacy.
- Score: 18.403597949089317
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Mobile and wearable healthcare monitoring play a vital role in facilitating timely interventions, managing chronic health conditions, and ultimately improving individuals' quality of life. Previous studies on large language models (LLMs) have highlighted their impressive generalization abilities and effectiveness in healthcare prediction tasks. However, most LLM-based healthcare solutions are cloud-based, which raises significant privacy concerns and results in increased memory usage and latency. To address these challenges, there is growing interest in compact models, Small Language Models (SLMs), which are lightweight and designed to run locally and efficiently on mobile and wearable devices. Nevertheless, how well these models perform in healthcare prediction remains largely unexplored. We systematically evaluated SLMs on health prediction tasks using zero-shot, few-shot, and instruction fine-tuning approaches, and deployed the best performing fine-tuned SLMs on mobile devices to evaluate their real-world efficiency and predictive performance in practical healthcare scenarios. Our results show that SLMs can achieve performance comparable to LLMs while offering substantial gains in efficiency and privacy. However, challenges remain, particularly in handling class imbalance and few-shot scenarios. These findings highlight SLMs, though imperfect in their current form, as a promising solution for next-generation, privacy-preserving healthcare monitoring.
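The abstract's evaluation setup (zero-shot vs. few-shot prompting of an SLM on wearable data) can be illustrated with a minimal, hypothetical sketch. The field names, label set, and prompt wording below are illustrative assumptions, not taken from the paper; a real deployment would feed the resulting prompt to an on-device SLM.

```python
# Hypothetical sketch: building zero-shot and few-shot prompts for an on-device
# SLM doing a health prediction task (e.g., stress classification from daily
# wearable summaries). All field names and labels are illustrative.

def format_record(record):
    """Render one day of wearable data as a plain-text feature summary."""
    return (f"steps={record['steps']}, sleep_minutes={record['sleep_minutes']}, "
            f"resting_hr={record['resting_hr']}")

def build_prompt(query, examples=None):
    """Build an SLM prompt; with examples it is few-shot, otherwise zero-shot."""
    lines = ["Task: classify the user's stress level as 'low' or 'high'."]
    for ex in examples or []:
        lines.append(f"Input: {format_record(ex)}")
        lines.append(f"Answer: {ex['label']}")
    lines.append(f"Input: {format_record(query)}")
    lines.append("Answer:")
    return "\n".join(lines)

# One labeled example for the few-shot condition, and an unlabeled query day.
shots = [{"steps": 11200, "sleep_minutes": 440, "resting_hr": 58, "label": "low"}]
query = {"steps": 2100, "sleep_minutes": 290, "resting_hr": 74}

zero_shot = build_prompt(query)
few_shot = build_prompt(query, examples=shots)
print(few_shot)
```

Instruction fine-tuning, the paper's third setting, would instead train the SLM on many such (prompt, answer) pairs rather than supplying examples at inference time.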
Related papers
- Medicine on the Edge: Comparative Performance Analysis of On-Device LLMs for Clinical Reasoning [1.6010529993238123]
We benchmark publicly available on-device Large Language Models (LLMs) using the AMEGA dataset. Our results indicate that compact general-purpose models like Phi-3 Mini achieve a strong balance between speed and accuracy. We emphasize the need for more efficient inference and models tailored to real-world clinical reasoning.
arXiv Detail & Related papers (2025-02-13T04:35:55Z)
- Benchmarking LLMs and SLMs for patient reported outcomes [0.0]
This study benchmarks several SLMs against LLMs for summarizing patient-reported Q&A forms in the context of radiotherapy. Using various metrics, we evaluate their precision and reliability. The findings highlight both the promise and limitations of SLMs for high-stakes medical tasks, fostering more efficient and privacy-preserving AI-driven healthcare solutions.
arXiv Detail & Related papers (2024-12-20T19:01:25Z)
- Unveiling Performance Challenges of Large Language Models in Low-Resource Healthcare: A Demographic Fairness Perspective [7.1047384702030625]
We evaluate state-of-the-art large language models (LLMs) with three prevalent learning frameworks across six diverse healthcare tasks. We find significant challenges in applying LLMs to real-world healthcare tasks and persistent fairness issues across demographic groups.
arXiv Detail & Related papers (2024-11-30T18:52:30Z)
- Efficient and Personalized Mobile Health Event Prediction via Small Language Models [14.032049217103024]
Small Language Models (SLMs) are potential candidates to solve privacy and computational issues.
This paper examines the capability of SLMs to accurately analyze health data, such as steps, calories, sleep minutes, and other vital statistics.
Our results indicate that SLMs could potentially be deployed on wearable or mobile devices for real-time health monitoring.
arXiv Detail & Related papers (2024-09-17T01:57:57Z)
- Large Language Model Distilling Medication Recommendation Model [58.94186280631342]
We harness the powerful semantic comprehension and input-agnostic characteristics of Large Language Models (LLMs). Our research aims to transform existing medication recommendation methodologies using LLMs. To mitigate this, we have developed a feature-level knowledge distillation technique, which transfers the LLM's proficiency to a more compact model.
arXiv Detail & Related papers (2024-02-05T08:25:22Z)
- Deep Reinforcement Learning Empowered Activity-Aware Dynamic Health Monitoring Systems [69.41229290253605]
Existing monitoring approaches were designed on the premise that medical devices track several health metrics concurrently.
This means that they report all relevant health values within that scope, which can result in excess resource use and the gathering of extraneous data.
We propose Dynamic Activity-Aware Health Monitoring strategy (DActAHM) for striking a balance between optimal monitoring performance and cost efficiency.
arXiv Detail & Related papers (2024-01-19T16:26:35Z)
- Machine Vision Therapy: Multimodal Large Language Models Can Enhance Visual Robustness via Denoising In-Context Learning [67.0609518552321]
We propose to conduct Machine Vision Therapy which aims to rectify the noisy predictions from vision models.
By fine-tuning with the denoised labels, the learning model performance can be boosted in an unsupervised manner.
arXiv Detail & Related papers (2023-12-05T07:29:14Z)
- Redefining Digital Health Interfaces with Large Language Models [69.02059202720073]
Large Language Models (LLMs) have emerged as general-purpose models with the ability to process complex information.
We show how LLMs can provide a novel interface between clinicians and digital technologies.
We develop a new prognostic tool using automated machine learning.
arXiv Detail & Related papers (2023-10-05T14:18:40Z)
- Enhancing Small Medical Learners with Privacy-preserving Contextual Prompting [24.201549275369487]
We present a method that harnesses large language models' medical expertise to boost SLM performance in medical tasks under privacy-restricted scenarios.
Specifically, we mitigate patient privacy issues by extracting keywords from medical data and prompting the LLM to generate a medical knowledge-intensive context.
Our method significantly enhances performance in both few-shot and full training settings across three medical knowledge-intensive tasks.
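The pipeline this summary describes (extract keywords on-device, send only those to a cloud LLM for background context, then combine that context with the raw record locally) can be sketched as follows. The keyword extractor is a toy frequency-based stand-in and `cloud_llm_context` is a placeholder for a real LLM call; both are assumptions for illustration, not the paper's implementation.

```python
# Hypothetical sketch of privacy-preserving contextual prompting: only
# de-identified keywords leave the device; the raw clinical note never does.
import re
from collections import Counter

STOPWORDS = {"the", "a", "of", "and", "with", "for", "in", "on", "patient"}

def extract_keywords(text, k=3):
    """Toy keyword extractor: keep the k most frequent non-stopword terms."""
    tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]
    return [word for word, _ in Counter(tokens).most_common(k)]

def cloud_llm_context(keywords):
    """Placeholder for a remote LLM that expands keywords into background knowledge."""
    return f"Background on: {', '.join(keywords)}."

note = "Patient with hypertension and diabetes; hypertension poorly controlled."
keywords = extract_keywords(note)          # only these would be sent off-device
context = cloud_llm_context(keywords)      # generic medical context comes back
# The small local model sees both the returned context and the full record.
local_prompt = f"{context}\nRecord: {note}\nQuestion: suggest next monitoring step."
```

The privacy property rests on the split: the cloud side sees only generic terms, while identifiable details stay in the locally assembled prompt.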
arXiv Detail & Related papers (2023-05-22T05:14:38Z)
- Privacy-preserving machine learning for healthcare: open challenges and future perspectives [72.43506759789861]
We conduct a review of recent literature concerning Privacy-Preserving Machine Learning (PPML) for healthcare.
We primarily focus on privacy-preserving training and inference-as-a-service.
The aim of this review is to guide the development of private and efficient ML models in healthcare.
arXiv Detail & Related papers (2023-03-27T19:20:51Z)
- Large Language Models for Healthcare Data Augmentation: An Example on Patient-Trial Matching [49.78442796596806]
We propose an innovative privacy-aware data augmentation approach for patient-trial matching (LLM-PTM).
Our experiments demonstrate a 7.32% average improvement in performance using the proposed LLM-PTM method, and the generalizability to new data is improved by 12.12%.
arXiv Detail & Related papers (2023-03-24T03:14:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.