MAQuA: Adaptive Question-Asking for Multidimensional Mental Health Screening using Item Response Theory
- URL: http://arxiv.org/abs/2508.07279v2
- Date: Mon, 01 Sep 2025 03:34:39 GMT
- Title: MAQuA: Adaptive Question-Asking for Multidimensional Mental Health Screening using Item Response Theory
- Authors: Vasudha Varadarajan, Hui Xu, Rebecca Astrid Boehme, Mariam Marlan Mirstrom, Sverker Sikstrom, H. Andrew Schwartz,
- Abstract summary: We introduce MAQuA, an adaptive question-asking framework for simultaneous, multidimensional mental health screening.<n>We show MAQuA reduces the number of assessment questions required for score stabilization by 50-87% compared to random ordering.<n> MAQuA demonstrates robust performance across both internalizing (depression, anxiety) and externalizing (substance use, eating disorder) domains.
- Score: 9.779175352907162
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in large language models (LLMs) offer new opportunities for scalable, interactive mental health assessment, but excessive querying by LLMs burdens users and is inefficient for real-world screening across transdiagnostic symptom profiles. We introduce MAQuA, an adaptive question-asking framework for simultaneous, multidimensional mental health screening. Combining multi-outcome modeling on language responses with item response theory (IRT) and factor analysis, MAQuA selects the questions with most informative responses across multiple dimensions at each turn to optimize diagnostic information, improving accuracy and potentially reducing response burden. Empirical results on a novel dataset reveal that MAQuA reduces the number of assessment questions required for score stabilization by 50-87% compared to random ordering (e.g., achieving stable depression scores with 71% fewer questions and eating disorder scores with 85% fewer questions). MAQuA demonstrates robust performance across both internalizing (depression, anxiety) and externalizing (substance use, eating disorder) domains, with early stopping strategies further reducing patient time and burden. These findings position MAQuA as a powerful and efficient tool for scalable, nuanced, and interactive mental health screening, advancing the integration of LLM-based agents into real-world clinical workflows.
Related papers
- MindEval: Benchmarking Language Models on Multi-turn Mental Health Support [10.524387723320432]
MindEval is a framework for automatically evaluating language models in realistic, multi-turn mental health therapy conversations.<n>We quantitatively validate the realism of our simulated patients against human-generated text and by demonstrating strong correlations between automatic and human expert judgments.<n>We evaluate 12 state-of-the-art LLMs and show that all models struggle, scoring below 4 out of 6 on average, with particular weaknesses in problematic AI-specific patterns of communication.
arXiv Detail & Related papers (2025-11-23T15:19:29Z) - GEMeX-RMCoT: An Enhanced Med-VQA Dataset for Region-Aware Multimodal Chain-of-Thought Reasoning [60.03671205298294]
Medical visual question answering aims to support clinical decision-making by enabling models to answer natural language questions based on medical images.<n>Current methods still suffer from limited answer reliability and poor interpretability.<n>This work first proposes a Region-Aware Multimodal Chain-of-Thought dataset, in which the process of producing an answer is preceded by a sequence of intermediate reasoning steps.
arXiv Detail & Related papers (2025-06-22T08:09:58Z) - TestAgent: An Adaptive and Intelligent Expert for Human Assessment [62.060118490577366]
We propose TestAgent, a large language model (LLM)-powered agent designed to enhance adaptive testing through interactive engagement.<n>TestAgent supports personalized question selection, captures test-takers' responses and anomalies, and provides precise outcomes through dynamic, conversational interactions.
arXiv Detail & Related papers (2025-06-03T16:07:54Z) - Reinforcing Question Answering Agents with Minimalist Policy Gradient Optimization [80.09112808413133]
Mujica is a planner that decomposes questions into acyclic graph of subquestions and a worker that resolves questions via retrieval and reasoning.<n>MyGO is a novel reinforcement learning method that replaces traditional policy updates with gradient Likelihood Maximum Estimation.<n> Empirical results across multiple datasets demonstrate the effectiveness of MujicaMyGO in enhancing multi-hop QA performance.
arXiv Detail & Related papers (2025-05-20T18:33:03Z) - Enhancing Depression Detection via Question-wise Modality Fusion [47.45016610508853]
Depression is a highly prevalent and disabling condition that incurs substantial personal and societal costs.<n>We propose a novel Question-wise Modality Fusion framework trained with a novel Imbalanced Ordinal Log-Loss function.
arXiv Detail & Related papers (2025-03-26T12:34:34Z) - Structured Outputs Enable General-Purpose LLMs to be Medical Experts [50.02627258858336]
Large language models (LLMs) often struggle with open-ended medical questions.<n>We propose a novel approach utilizing structured medical reasoning.<n>Our approach achieves the highest Factuality Score of 85.8, surpassing fine-tuned models.
arXiv Detail & Related papers (2025-03-05T05:24:55Z) - LlaMADRS: Prompting Large Language Models for Interview-Based Depression Assessment [75.44934940580112]
This study introduces LlaMADRS, a novel framework leveraging open-source Large Language Models (LLMs) to automate depression severity assessment.<n>We employ a zero-shot prompting strategy with carefully designed cues to guide the model in interpreting and scoring transcribed clinical interviews.<n>Our approach, tested on 236 real-world interviews, demonstrates strong correlations with clinician assessments.
arXiv Detail & Related papers (2025-01-07T08:49:04Z) - Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances Information Seeking in Large Language Models [73.79091519226026]
Uncertainty of Thoughts (UoT) is an algorithm to augment large language models with the ability to actively seek information by asking effective questions.
In experiments on medical diagnosis, troubleshooting, and the 20 Questions game, UoT achieves an average performance improvement of 38.1% in the rate of successful task completion.
arXiv Detail & Related papers (2024-02-05T18:28:44Z) - Large Language Models Encode Clinical Knowledge [21.630872464930587]
Large language models (LLMs) have demonstrated impressive capabilities in natural language understanding and generation.
We propose a framework for human evaluation of model answers along multiple axes including factuality, precision, possible harm, and bias.
We show that comprehension, recall of knowledge, and medical reasoning improve with model scale and instruction prompt tuning.
arXiv Detail & Related papers (2022-12-26T14:28:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.