ALIGNS: Unlocking nomological networks in psychological measurement through a large language model
- URL: http://arxiv.org/abs/2509.09723v2
- Date: Thu, 18 Sep 2025 16:46:59 GMT
- Authors: Kai R. Larsen, Sen Yan, Roland M. Mueller, Lan Sang, Mikko Rönkkö, Ravi Starzl, Donald Edmondson
- Abstract summary: We introduce Analysis of Latent Indicators to Generate Nomological Structures (ALIGNS), a large language model-based system trained with validated questionnaire measures. ALIGNS provides three comprehensive nomological networks containing over 550,000 indicators across psychology, medicine, social policy, and other fields. This represents the first application of large language models to solve a foundational problem in measurement validation.
- Score: 0.9696659544494058
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Psychological measurement is critical to many disciplines. Despite advances in measurement, building nomological networks (theoretical maps of how concepts and measures relate, used to establish validity) remains a challenge 70 years after Cronbach and Meehl proposed them as fundamental to validation. This limitation has practical consequences: clinical trials may fail to detect treatment effects, and public policy may target the wrong outcomes. We introduce Analysis of Latent Indicators to Generate Nomological Structures (ALIGNS), a large language model-based system trained with validated questionnaire measures. ALIGNS provides three comprehensive nomological networks containing over 550,000 indicators across psychology, medicine, social policy, and other fields. This represents the first application of large language models to solve a foundational problem in measurement validation. We report classification accuracy tests used to develop the model, as well as three evaluations. In the first evaluation, the widely used NIH PROMIS anxiety and depression instruments are shown to converge into a single dimension of emotional distress. The second evaluation examines child temperament measures, identifies four potential dimensions not captured by current frameworks, and questions one existing dimension. The third evaluation, an applicability check, engages expert psychometricians who assess the system's importance, accessibility, and suitability. ALIGNS is freely available at nomologicalnetwork.org, complementing traditional validation methods with large-scale nomological analysis.
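The first evaluation's claim, that two instruments collapse into a single dimension of emotional distress, is the kind of question classically probed by inspecting the eigenvalue spectrum of the item correlation matrix. As a hedged, purely illustrative sketch (synthetic data, not the PROMIS items or the ALIGNS method), the following simulates two scales that share one latent factor and checks whether the first eigenvalue dominates:

```python
import numpy as np

# Illustrative sketch only: simulate two 4-item scales (stand-ins for an
# "anxiety" and a "depression" instrument) whose items all load on one
# shared latent "distress" factor, then examine the eigenvalues of the
# pooled item correlation matrix. A dominant first eigenvalue is a common
# heuristic for unidimensionality.
rng = np.random.default_rng(0)
n_respondents, items_per_scale, loading = 500, 4, 0.9

latent = rng.normal(size=n_respondents)  # shared latent distress factor

def simulate_scale():
    """Items = loading * latent + unit-variance unique noise."""
    noise = rng.normal(size=(n_respondents, items_per_scale))
    return loading * latent[:, None] + noise

responses = np.hstack([simulate_scale(), simulate_scale()])  # 8 items
corr = np.corrcoef(responses, rowvar=False)
eigvals = np.sort(np.linalg.eigvalsh(corr))[::-1]

explained = eigvals[0] / eigvals.sum()
print(f"first eigenvalue explains {explained:.0%} of item variance")
```

With these loadings, roughly half the item variance falls on the first component and the remaining eigenvalues stay near or below one, so a Kaiser-style or scree inspection would treat the pooled items as one dimension. Real validation work would use confirmatory factor models rather than this eigenvalue heuristic.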
Related papers
- Benchmarking Egocentric Clinical Intent Understanding Capability for Medical Multimodal Large Language Models [48.95516224614331]
We introduce MedGaze-Bench, the first benchmark leveraging clinician gaze as a Cognitive Cursor to assess intent understanding across surgery, emergency simulation, and diagnostic interpretation. Our benchmark addresses three fundamental challenges: visual homogeneity of anatomical structures, strict temporal-causal dependencies in clinical settings, and implicit adherence to safety protocols.
arXiv Detail & Related papers (2026-01-11T02:20:40Z) - Quantifying Data Contamination in Psychometric Evaluations of LLMs [13.528776782604107]
We propose a framework to measure data contamination in psychometric evaluations of Large Language Models (LLMs). Applying this framework to 21 models from major families and four widely used psychometric inventories, we provide evidence that popular inventories exhibit strong contamination.
arXiv Detail & Related papers (2025-10-08T16:16:20Z) - AI Models for Depressive Disorder Detection and Diagnosis: A Review [0.9012198585960441]
Major Depressive Disorder is one of the leading causes of disability worldwide, yet its diagnosis still depends largely on subjective clinical assessments. In this paper, we present a comprehensive survey of state-of-the-art AI methods for detection and diagnosis, based on a systematic review of 55 key studies.
arXiv Detail & Related papers (2025-08-16T11:46:48Z) - Medical Reasoning in the Era of LLMs: A Systematic Review of Enhancement Techniques and Applications [59.721265428780946]
Large Language Models (LLMs) in medicine have enabled impressive capabilities, yet a critical gap remains in their ability to perform systematic, transparent, and verifiable reasoning. This paper provides the first systematic review of this emerging field. We propose a taxonomy of reasoning enhancement techniques, categorized into training-time strategies and test-time mechanisms.
arXiv Detail & Related papers (2025-08-01T14:41:31Z) - A validity-guided workflow for robust large language model research in psychology [0.0]
Large language models (LLMs) are rapidly being integrated into psychological research as research tools, evaluation targets, human simulators, and cognitive models. These "measurement phantoms" -- statistical artifacts masquerading as psychological phenomena -- threaten the validity of a growing body of research. Guided by the dual-validity framework that integrates psychometrics with causal inference, we present a six-stage workflow that scales validity requirements to research ambition.
arXiv Detail & Related papers (2025-07-06T18:06:12Z) - From Prompts to Constructs: A Dual-Validity Framework for LLM Research in Psychology [0.0]
We argue that building a robust science of AI psychology requires integrating the principles of reliable measurement and the standards for sound causal inference. We present a dual-validity framework to guide this integration, which clarifies how the evidence needed to support a claim scales with its scientific ambition.
arXiv Detail & Related papers (2025-06-20T02:38:42Z) - Measuring Mental Health Variables in Computational Research: Toward Validated, Dimensional, and Transdiagnostic Approaches [6.796386356785538]
Computational mental health research develops models to predict and understand psychological phenomena, but often relies on inappropriate measures of psychopathology constructs. We identify three key issues: (1) reliance on unvalidated measures over validated ones; (2) treating mental health constructs as categorical rather than dimensional; and (3) focusing on disorder-specific constructs instead of transdiagnostic ones.
arXiv Detail & Related papers (2025-04-04T21:11:41Z) - Evaluating Generative AI Systems is a Social Science Measurement Challenge [78.35388859345056]
We present a framework for measuring concepts related to the capabilities, impacts, opportunities, and risks of GenAI systems.
The framework distinguishes between four levels: the background concept, the systematized concept, the measurement instrument(s), and the instance-level measurements themselves.
arXiv Detail & Related papers (2024-11-17T02:35:30Z) - Long-Range Biometric Identification in Real World Scenarios: A Comprehensive Evaluation Framework Based on Missions [11.557368031775717]
This paper evaluates research solutions for identifying individuals at ranges and altitudes.
By fusing face and body features, we propose developing robust biometric systems for effective long-range identification.
arXiv Detail & Related papers (2024-09-03T02:17:36Z) - Evaluating Large Language Models with Psychometrics [59.821829073478376]
This paper offers a comprehensive benchmark for quantifying psychological constructs of Large Language Models (LLMs). Our work identifies five key psychological constructs -- personality, values, emotional intelligence, theory of mind, and self-efficacy -- assessed through a suite of 13 datasets. We uncover significant discrepancies between LLMs' self-reported traits and their response patterns in real-world scenarios, revealing complexities in their behaviors.
arXiv Detail & Related papers (2024-06-25T16:09:08Z) - Geometric Deep Learning for Structure-Based Drug Design: A Survey [83.87489798671155]
Structure-based drug design (SBDD) leverages the three-dimensional geometry of proteins to identify potential drug candidates.
Recent advancements in geometric deep learning, which effectively integrate and process 3D geometric data, have significantly propelled the field forward.
arXiv Detail & Related papers (2023-06-20T14:21:58Z) - Position: AI Evaluation Should Learn from How We Test Humans [65.36614996495983]
We argue that psychometrics, a theory originating in the 20th century for human assessment, could be a powerful solution to the challenges in today's AI evaluations.
arXiv Detail & Related papers (2023-06-18T09:54:33Z) - Affect Analysis in-the-wild: Valence-Arousal, Expressions, Action Units and a Unified Framework [83.21732533130846]
The paper focuses on large in-the-wild databases, i.e., Aff-Wild and Aff-Wild2.
It presents the design of two classes of deep neural networks trained with these databases.
A novel multi-task and holistic framework is presented which is able to jointly learn and effectively generalize and perform affect recognition.
arXiv Detail & Related papers (2021-03-29T17:36:20Z) - Pose-based Body Language Recognition for Emotion and Psychiatric Symptom Interpretation [75.3147962600095]
We propose an automated framework for body language based emotion recognition starting from regular RGB videos.
In collaboration with psychologists, we extend the framework for psychiatric symptom prediction.
Because a specific application domain of the proposed framework may only supply a limited amount of data, the framework is designed to work on a small training set.
arXiv Detail & Related papers (2020-10-30T18:45:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.