LENS: LLM-Enabled Narrative Synthesis for Mental Health by Aligning Multimodal Sensing with Language Models
- URL: http://arxiv.org/abs/2512.23025v1
- Date: Sun, 28 Dec 2025 18:00:57 GMT
- Title: LENS: LLM-Enabled Narrative Synthesis for Mental Health by Aligning Multimodal Sensing with Language Models
- Authors: Wenxuan Xu, Arvind Pillai, Subigya Nepal, Amanda C Collins, Daniel M Mackin, Michael V Heinz, Tess Z Griffin, Nicholas C Jacobson, Andrew Campbell
- Abstract summary: We introduce LENS, a framework that aligns multimodal sensing data with language models to generate mental-health narratives. LENS constructs a large-scale dataset by transforming responses related to depression and anxiety symptoms into natural-language descriptions. Our results show that LENS outperforms strong baselines on standard NLP metrics and task-specific measures of symptom-severity accuracy.
- Score: 5.041844772782674
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multimodal health sensing offers rich behavioral signals for assessing mental health, yet translating these numerical time-series measurements into natural language remains challenging. Current LLMs cannot natively ingest long-duration sensor streams, and paired sensor-text datasets are scarce. To address these challenges, we introduce LENS, a framework that aligns multimodal sensing data with language models to generate clinically grounded mental-health narratives. LENS first constructs a large-scale dataset by transforming Ecological Momentary Assessment (EMA) responses related to depression and anxiety symptoms into natural-language descriptions, yielding over 100,000 sensor-text QA pairs from 258 participants. To enable native time-series integration, we train a patch-level encoder that projects raw sensor signals directly into an LLM's representation space. Our results show that LENS outperforms strong baselines on standard NLP metrics and task-specific measures of symptom-severity accuracy. A user study with 13 mental-health professionals further indicates that LENS-produced narratives are comprehensive and clinically meaningful. Ultimately, our approach advances LLMs as interfaces for health sensing, providing a scalable path toward models that can reason over raw behavioral signals and support downstream clinical decision-making.
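The abstract leaves the patch-level encoder unspecified beyond its role, so the following is a minimal sketch of the general idea under stated assumptions: cut the raw multichannel stream into fixed-length patches and linearly project each patch into the LLM's embedding width so the result can sit alongside text-token embeddings. Class name, patch length, channel count, and dimensions are all illustrative, not the LENS architecture.

```python
import torch
import torch.nn as nn

class PatchSensorEncoder(nn.Module):
    """Sketch: fixed-length patches of a raw sensor stream, each projected
    into the LLM's token-embedding space. Dimensions are hypothetical."""

    def __init__(self, n_channels: int = 6, patch_len: int = 60, llm_dim: int = 4096):
        super().__init__()
        self.patch_len = patch_len
        # One linear projection shared across patches: (patch_len * channels) -> llm_dim.
        self.proj = nn.Linear(n_channels * patch_len, llm_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, channels); truncate so time divides evenly into patches.
        b, t, c = x.shape
        t = (t // self.patch_len) * self.patch_len
        patches = x[:, :t].reshape(b, t // self.patch_len, self.patch_len * c)
        return self.proj(patches)  # (batch, n_patches, llm_dim) pseudo-token embeddings

# These pseudo-tokens would be concatenated with the embedded question text
# before the LLM's forward pass.
encoder = PatchSensorEncoder()
stream = torch.randn(2, 1440, 6)  # e.g. one day of per-minute, 6-channel data
print(encoder(stream).shape)      # torch.Size([2, 24, 4096])
```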
Related papers
- E^2-LLM: Bridging Neural Signals and Interpretable Affective Analysis [54.763420895859035]
We present E^2-LLM (EEG-to-Emotion Large Language Model), the first MLLM framework for interpretable emotion analysis from EEG. E^2-LLM integrates a pretrained EEG encoder with Q-based LLMs through learnable projection layers, employing a multi-stage training pipeline. Experiments on the dataset, spanning seven emotion categories, demonstrate that E^2-LLM achieves excellent performance on emotion classification.
arXiv Detail & Related papers (2026-01-11T13:21:20Z)
- Visualizing token importance for black-box language models [48.747801442240565]
We consider the problem of auditing black-box large language models (LLMs) to ensure they behave reliably when deployed in production settings. We propose Distribution-Based Sensitivity Analysis (DBSA) to evaluate the sensitivity of a language model's output to each input token.
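This summary does not spell out the DBSA procedure itself; the sketch below shows one generic way to score per-token sensitivity against a black box: delete a token, re-query, and measure how far the output distribution moves (total-variation distance here). The `toy_model` stand-in and the deletion perturbation are illustrative assumptions, not the paper's method.

```python
from typing import Callable, Dict, List

def token_sensitivity(
    model: Callable[[List[str]], Dict[str, float]],
    tokens: List[str],
) -> List[float]:
    """Score each token by how much deleting it shifts the model's
    output distribution (total-variation distance)."""
    base = model(tokens)
    scores = []
    for i in range(len(tokens)):
        perturbed = model(tokens[:i] + tokens[i + 1:])
        vocab = set(base) | set(perturbed)
        tv = 0.5 * sum(abs(base.get(w, 0.0) - perturbed.get(w, 0.0)) for w in vocab)
        scores.append(tv)
    return scores

# Toy black box: the output distribution flips when "not" is present.
def toy_model(tokens: List[str]) -> Dict[str, float]:
    return {"neg": 0.9, "pos": 0.1} if "not" in tokens else {"neg": 0.2, "pos": 0.8}

print(token_sensitivity(toy_model, ["this", "is", "not", "helpful"]))
# -> [0.0, 0.0, 0.7, 0.0]: only "not" matters to this toy model.
```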
arXiv Detail & Related papers (2025-12-12T14:01:43Z)
- Exploring LLM-based Frameworks for Fault Diagnosis [2.2562573557834686]
Large Language Model (LLM)-based systems present new opportunities for autonomous health monitoring in sensor-rich industrial environments. This study explores the potential of LLMs to detect and classify faults directly from sensor data, while producing inherently explainable outputs through natural-language reasoning.
arXiv Detail & Related papers (2025-09-27T04:53:15Z)
- SensorLM: Learning the Language of Wearable Sensors [50.95988682423808]
We present SensorLM, a family of sensor-language foundation models that enable wearable sensor data understanding with natural language. We introduce a hierarchical caption generation pipeline designed to capture statistical, structural, and semantic information from sensor data. This approach enabled the curation of the largest sensor-language dataset to date, comprising over 59.7 million hours of data from more than 103,000 people.
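As a purely illustrative taste of the statistical tier of such a captioning pipeline, the sketch below renders a window of per-minute heart-rate values as a templated sentence; SensorLM's actual templates, and its structural and semantic tiers, are not described in this summary.

```python
import numpy as np

def statistical_caption(hr: np.ndarray) -> str:
    """Hypothetical statistical caption for one heart-rate window."""
    trend = "trending up" if hr[-1] > hr[0] else "trending down or flat"
    return (f"Heart rate averaged {hr.mean():.0f} bpm over the window "
            f"(min {hr.min():.0f}, max {hr.max():.0f}), {trend}.")

rng = np.random.default_rng(0)
window = 70 + 5 * rng.standard_normal(60)  # made-up per-minute heart rate
print(statistical_caption(window))
```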
arXiv Detail & Related papers (2025-06-10T17:13:09Z)
- PhysLLM: Harnessing Large Language Models for Cross-Modal Remote Physiological Sensing [49.243031514520794]
Large Language Models (LLMs) excel at capturing long-range signals due to their text-centric design. PhysLLM achieves state-of-the-art accuracy and robustness, demonstrating superior generalization across lighting variations and motion scenarios.
arXiv Detail & Related papers (2025-05-06T15:18:38Z)
- MuRAL: A Multi-Resident Ambient Sensor Dataset Annotated with Natural Language for Activities of Daily Living [4.187145402358247]
We introduce MuRAL, the first Multi-Resident Ambient sensor dataset with natural Language. MuRAL is annotated with fine-grained natural language descriptions, resident identities, and high-level activity labels. We benchmark MuRAL using state-of-the-art LLMs for three core tasks: subject assignment, action description, and activity classification.
arXiv Detail & Related papers (2025-04-29T07:46:14Z)
- Scaling Wearable Foundation Models [54.93979158708164]
We investigate the scaling properties of sensor foundation models across compute, data, and model size.
Using a dataset of up to 40 million hours of in-situ heart rate, heart rate variability, electrodermal activity, accelerometer, skin temperature, and altimeter per-minute data from over 165,000 people, we create LSM.
Our results establish the scaling laws of LSM for tasks such as imputation and extrapolation, both across time and sensor modalities.
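Scaling laws of this kind are usually estimated by fitting a power law on log-log axes; the sketch below does that for made-up (compute, loss) points, so the fitted exponent is illustrative rather than anything LSM reports.

```python
import numpy as np

# Fit loss ~ a * compute^b by linear regression in log-log space.
compute = np.array([1e18, 1e19, 1e20, 1e21])  # hypothetical training compute
loss = np.array([0.90, 0.72, 0.58, 0.47])     # hypothetical eval loss

b, log_a = np.polyfit(np.log(compute), np.log(loss), 1)
print(f"loss ~ {np.exp(log_a):.3g} * compute^({b:.3f})")  # b < 0: loss falls with scale
```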
arXiv Detail & Related papers (2024-10-17T15:08:21Z)
- SensorLLM: Aligning Large Language Models with Motion Sensors for Human Activity Recognition [13.796942110105313]
We introduce SensorLLM, a framework that enables Large Language Models (LLMs) to perform human activity recognition (HAR) from sensor time-series data. SensorLLM bridges raw time series and text through a Sensor-Language Alignment stage, in which the model aligns sensor inputs with trend descriptions. In the subsequent Task-Aware Tuning stage, we refine the model for HAR classification, achieving performance that matches or surpasses state-of-the-art methods.
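The two-stage recipe is named but not detailed here; a common way to wire such a schedule, assuming (not confirmed by this summary) that the LLM stays frozen during alignment and is unfrozen for task-aware tuning, looks like the sketch below. Every module and size is a placeholder.

```python
import torch.nn as nn

# Placeholder modules: a tiny "LLM", a sensor-to-LLM-space projector, a HAR head.
llm = nn.TransformerEncoder(nn.TransformerEncoderLayer(256, 4, batch_first=True), 2)
align = nn.Linear(64, 256)     # hypothetical sensor-feature -> LLM-space projection
har_head = nn.Linear(256, 10)  # hypothetical 10-class activity head

def stage_parameters(stage: int) -> list:
    """Stage 1: train only the alignment projector (LLM frozen).
    Stage 2: unfreeze the LLM and add the classification head."""
    for p in llm.parameters():
        p.requires_grad = (stage == 2)
    trainable = list(align.parameters())
    if stage == 2:
        trainable += list(llm.parameters()) + list(har_head.parameters())
    return trainable  # hand this list to the optimizer for the current stage

print(sum(p.numel() for p in stage_parameters(1)), "trainable params in stage 1")
```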
arXiv Detail & Related papers (2024-10-14T15:30:41Z)
- LLMSense: Harnessing LLMs for High-level Reasoning Over Spatiotemporal Sensor Traces [1.1137304094345333]
We design an effective prompting framework for Large Language Models (LLMs) on high-level reasoning tasks.
We also design two strategies to enhance performance with long sensor traces, including summarization before reasoning and selective inclusion of historical traces.
Our framework can be implemented in an edge-cloud setup, running small LLMs on the edge for data summarization and performing high-level reasoning on the cloud for privacy preservation.
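A minimal sketch of that edge-cloud split follows, with a per-chunk averaging function standing in for the small edge LLM and a prompt-building stub standing in for the cloud call; none of the names or prompts are from LLMSense.

```python
from typing import List

def edge_summarize(trace: List[float], chunk: int = 60) -> str:
    # Edge step (stand-in for a small LLM): compress the raw trace to text.
    means = [sum(trace[i:i + chunk]) / len(trace[i:i + chunk])
             for i in range(0, len(trace), chunk)]
    return "Hourly activity means: " + ", ".join(f"{m:.1f}" for m in means)

def cloud_reason(summary: str) -> str:
    # Cloud step (stub): only the text summary, never the raw trace, leaves
    # the device, which is where the privacy benefit comes from.
    return f"[cloud LLM prompt] Given: {summary}\nAssess the subject's routine."

trace = [i % 7 for i in range(240)]  # toy 4-hour, per-minute activity trace
print(cloud_reason(edge_summarize(trace)))
```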
arXiv Detail & Related papers (2024-03-28T22:06:04Z)
- MuSR: Testing the Limits of Chain-of-thought with Multistep Soft Reasoning [63.80739044622555]
We introduce MuSR, a dataset for evaluating language models on soft reasoning tasks specified in a natural language narrative.
This dataset has two crucial features. First, it is created through a novel neurosymbolic synthetic-to-natural generation algorithm.
Second, our dataset instances are free-text narratives corresponding to real-world domains of reasoning.
arXiv Detail & Related papers (2023-10-24T17:59:20Z)