PersonaDrift: A Benchmark for Temporal Anomaly Detection in Language-Based Dementia Monitoring
- URL: http://arxiv.org/abs/2511.16445v1
- Date: Thu, 20 Nov 2025 15:15:00 GMT
- Title: PersonaDrift: A Benchmark for Temporal Anomaly Detection in Language-Based Dementia Monitoring
- Authors: Joy Lai, Alex Mihailidis
- Abstract summary: PersonaDrift is a benchmark designed to evaluate machine learning and statistical methods for detecting progressive changes in daily communication. The benchmark focuses on two forms of longitudinal change that caregivers highlighted as particularly salient: flattened sentiment and off-topic replies. Preliminary results show that flattened sentiment can often be detected with simple statistical models in users with low baseline variability.
- Score: 0.9668407688201359
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: People living with dementia (PLwD) often show gradual shifts in how they communicate, becoming less expressive, more repetitive, or drifting off-topic in subtle ways. While caregivers may notice these changes informally, most computational tools are not designed to track such behavioral drift over time. This paper introduces PersonaDrift, a synthetic benchmark designed to evaluate machine learning and statistical methods for detecting progressive changes in daily communication, focusing on user responses to a digital reminder system. PersonaDrift simulates 60-day interaction logs for synthetic users modeled after real PLwD, based on interviews with caregivers. These caregiver-informed personas vary in tone, modality, and communication habits, enabling realistic diversity in behavior. The benchmark focuses on two forms of longitudinal change that caregivers highlighted as particularly salient: flattened sentiment (reduced emotional tone and verbosity) and off-topic replies (semantic drift). These changes are injected progressively at different rates to emulate naturalistic cognitive trajectories, and the framework is designed to be extensible to additional behaviors in future use cases. To explore this novel application space, we evaluate several anomaly detection approaches: unsupervised statistical methods (CUSUM, EWMA, One-Class SVM), sequence models using contextual embeddings (GRU + BERT), and supervised classifiers in both generalized and personalized settings. Preliminary results show that flattened sentiment can often be detected with simple statistical models in users with low baseline variability, while detecting semantic drift requires temporal modeling and personalized baselines. Across both tasks, personalized classifiers consistently outperform generalized ones, highlighting the importance of individual behavioral context.
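To make the statistical baselines named in the abstract concrete, here is a minimal sketch of one-sided CUSUM and EWMA detectors applied to a daily sentiment score. It is not the benchmark's implementation. The 14-day personal baseline window, the thresholds (`k`, `h`, `lam`, `width_mult`), and the deterministic `toy_series` helper are all assumptions made for illustration.

```python
from statistics import mean, stdev


def cusum_lower(series, baseline_days=14, k=0.5, h=4.0):
    """One-sided (downward) CUSUM on z-scores against a personal baseline.

    Returns the index of the first day the statistic exceeds h, else None.
    """
    base = series[:baseline_days]
    mu, sigma = mean(base), stdev(base)
    s = 0.0
    for day in range(baseline_days, len(series)):
        z = (series[day] - mu) / sigma
        # Accumulate evidence of a downward shift; k is the slack allowance.
        s = max(0.0, s - z - k)
        if s > h:
            return day
    return None


def ewma_lower(series, baseline_days=14, lam=0.2, width_mult=3.0):
    """EWMA control chart with a time-varying lower control limit.

    Returns the index of the first day the EWMA falls below the limit.
    """
    base = series[:baseline_days]
    mu, sigma = mean(base), stdev(base)
    ewma = mu
    for i, day in enumerate(range(baseline_days, len(series)), start=1):
        ewma = lam * series[day] + (1 - lam) * ewma
        var_factor = lam / (2 - lam) * (1 - (1 - lam) ** (2 * i))
        if ewma < mu - width_mult * sigma * var_factor ** 0.5:
            return day
    return None


def toy_series(days=60, onset=30, drop_per_day=0.02):
    """Hypothetical sentiment scores: a steady +/-0.1 wobble around 0.6,
    with a gradual decline injected from `onset` (None means no drift)."""
    out = []
    for d in range(days):
        wobble = 0.1 if d % 2 else -0.1
        drift = 0.0 if onset is None else drop_per_day * max(0, d - onset)
        out.append(0.6 + wobble - drift)
    return out


if __name__ == "__main__":
    drifting = toy_series()
    print("CUSUM alarm day:", cusum_lower(drifting))
    print("EWMA alarm day:", ewma_lower(drifting))
```

Both detectors standardize against the user's own early days. That choice mirrors the abstract's point that low baseline variability makes flattened sentiment easier to catch: a smaller baseline standard deviation makes the same drop a larger z-score.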
Related papers
- When Personalization Tricks Detectors: The Feature-Inversion Trap in Machine-Generated Text Detection [64.23509202768945]
We introduce dataset, the first benchmark for evaluating detector robustness in personalized settings. Our experimental results demonstrate large performance gaps across detectors in personalized settings. We propose method, a simple and reliable way to predict detector performance changes in personalized settings.
arXiv Detail & Related papers (2025-10-14T13:10:23Z) - Unobtrusive In-Situ Measurement of Behavior Change by Deep Metric Similarity Learning of Motion Patterns [25.896192890947216]
This paper introduces an in-situ measurement method to detect user behavior changes during arbitrary exposures in XR systems. We present a biometric user model based on deep metric similarity learning, which uses high-dimensional embeddings as reference vectors.
arXiv Detail & Related papers (2025-09-04T12:46:18Z) - The Personality Illusion: Revealing Dissociation Between Self-Reports & Behavior in LLMs [60.15472325639723]
Personality traits have long been studied as predictors of human behavior. Recent advances in Large Language Models (LLMs) suggest similar patterns may emerge in artificial systems.
arXiv Detail & Related papers (2025-09-03T21:27:10Z) - Personalized Counterfactual Framework: Generating Potential Outcomes from Wearable Data [1.7396556690675233]
This paper introduces a framework to learn personalized counterfactual models from wearable data. We first augment individual datasets with data from similar patients via multi-modal similarity analysis. We then use a temporal PC (Peter-Clark) algorithm adaptation to discover predictive relationships. Gradient Boosting Machines are trained on these relationships to quantify individual-specific effects.
arXiv Detail & Related papers (2025-08-20T05:04:17Z) - Two-Stage Representation Learning for Analyzing Movement Behavior Dynamics in People Living with Dementia [44.39545678576284]
This study analyzes home activity data from individuals living with dementia by proposing a two-stage, self-supervised learning approach. The first stage converts time-series activities into text sequences encoded by a pre-trained language model. This PageRank vector captures latent state transitions, effectively compressing complex behaviour data into a succinct form.
arXiv Detail & Related papers (2025-02-13T10:57:25Z) - Modeling Attention during Dimensional Shifts with Counterfactual and Delayed Feedback [0.4915744683251151]
We compare two methods for modeling how humans attend to specific features of decision making tasks. We find that calculating an information theoretic metric over a history of experiences is best able to account for human-like behavior.
arXiv Detail & Related papers (2025-01-19T20:26:34Z) - Unsupervised Model Diagnosis [49.36194740479798]
This paper proposes Unsupervised Model Diagnosis (UMO) to produce semantic counterfactual explanations without any user guidance.
Our approach identifies and visualizes changes in semantics, and then matches these changes to attributes from wide-ranging text sources.
arXiv Detail & Related papers (2024-10-08T17:59:03Z) - Modeling User Preferences via Brain-Computer Interfacing [54.3727087164445]
We use Brain-Computer Interfacing technology to infer users' preferences, their attentional correlates towards visual content, and their associations with affective experience.
We link these to relevant applications, such as information retrieval, personalized steering of generative models, and crowdsourcing population estimates of affective experiences.
arXiv Detail & Related papers (2024-05-15T20:41:46Z) - A Novel Loss Function Utilizing Wasserstein Distance to Reduce Subject-Dependent Noise for Generalizable Models in Affective Computing [0.4818210066519976]
Emotions are an essential part of human behavior that can impact thinking, decision-making, and communication skills.
The ability to accurately monitor and identify emotions can be useful in many human-centered applications such as behavioral training, tracking emotional well-being, and development of human-computer interfaces.
arXiv Detail & Related papers (2023-08-17T01:15:26Z) - Bring Your Own Data! Self-Supervised Evaluation for Large Language Models [52.15056231665816]
We propose a framework for self-supervised evaluation of Large Language Models (LLMs).
We demonstrate self-supervised evaluation strategies for measuring closed-book knowledge, toxicity, and long-range context dependence.
We find strong correlations between self-supervised and human-supervised evaluations.
arXiv Detail & Related papers (2023-06-23T17:59:09Z) - A comprehensive comparative evaluation and analysis of Distributional Semantic Models [61.41800660636555]
We perform a comprehensive evaluation of type distributional vectors, either produced by static DSMs or obtained by averaging the contextualized vectors generated by BERT.
The results show that the alleged superiority of predict based models is more apparent than real, and surely not ubiquitous.
We borrow from cognitive neuroscience the methodology of Representational Similarity Analysis (RSA) to inspect the semantic spaces generated by distributional models.
arXiv Detail & Related papers (2021-05-20T15:18:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.