Related papers: Psychometrics in Behavioral Software Engineering: A Methodological Introduction with Guidelines

URL: http://arxiv.org/abs/2005.09959v4
Date: Tue, 8 Jun 2021 13:10:37 GMT
Title: Psychometrics in Behavioral Software Engineering: A Methodological Introduction with Guidelines
Authors: Daniel Graziotin, Per Lenberg, Robert Feldt, Stefan Wagner
Abstract summary: We provide an introduction to psychometric theory for the evaluation of measurement instruments for software engineering researchers. We detail activities used when operationalizing new psychological constructs, such as item pooling, item review, pilot testing, item analysis, factor analysis, statistical property of items, reliability, validity, and fairness in testing and test bias. We hope to encourage a culture change in SE research towards the adoption of established methods from psychology.
Score: 19.40714760075466
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: A meaningful and deep understanding of the human aspects of software engineering (SE) requires psychological constructs to be considered. Psychology theory can facilitate the systematic and sound development as well as the adoption of instruments (e.g., psychological tests, questionnaires) to assess these constructs. In particular, to ensure high quality, the psychometric properties of instruments need evaluation. In this paper, we provide an introduction to psychometric theory for the evaluation of measurement instruments for SE researchers. We present guidelines that enable using existing instruments and developing new ones adequately. We conducted a comprehensive review of the psychology literature framed by the Standards for Educational and Psychological Testing. We detail activities used when operationalizing new psychological constructs, such as item pooling, item review, pilot testing, item analysis, factor analysis, statistical property of items, reliability, validity, and fairness in testing and test bias. We provide an openly available example of a psychometric evaluation based on our guideline. We hope to encourage a culture change in SE research towards the adoption of established methods from psychology. To improve the quality of behavioral research in SE, studies focusing on introducing, validating, and then using psychometric instruments need to be more common.
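As a small illustration of the reliability step named in the abstract, the sketch below computes Cronbach's alpha, a commonly used internal-consistency coefficient. It is not taken from the paper or its openly available example: the function name `cronbach_alpha`, the data layout (one row per respondent, one column per item), and the sample scores are all assumptions made for illustration.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Return Cronbach's alpha for a respondents-by-items score table.

    `scores` is a list of rows; each row holds one respondent's numeric
    answers to every item of the scale (hypothetical layout).
    """
    if len(scores) < 2:
        raise ValueError("need at least two respondents")
    n_items = len(scores[0])
    if n_items < 2:
        raise ValueError("need at least two items")
    if any(len(row) != n_items for row in scores):
        raise ValueError("every respondent must answer every item")

    # Sample variance of each item (column) across respondents.
    item_variance_sum = sum(variance(column) for column in zip(*scores))

    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    # alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    return n_items / (n_items - 1) * (1 - item_variance_sum / total_variance)


# Made-up responses from five participants to a four-item Likert scale.
example_scores = [
    [4, 5, 4, 4],
    [2, 3, 2, 3],
    [5, 5, 4, 5],
    [3, 3, 3, 2],
    [1, 2, 2, 1],
]
print(round(cronbach_alpha(example_scores), 3))
```

Alpha alone does not establish validity or unidimensionality, which is why the abstract lists it next to factor analysis and validity evidence.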

Related papers

TestAgent: An Adaptive and Intelligent Expert for Human Assessment [62.060118490577366]
We propose TestAgent, a large language model (LLM)-powered agent designed to enhance adaptive testing through interactive engagement. TestAgent supports personalized question selection, captures test-takers' responses and anomalies, and provides precise outcomes through dynamic, conversational interactions.
arXiv Detail & Related papers (2025-06-03T16:07:54Z)
Large Language Model Psychometrics: A Systematic Review of Evaluation, Validation, and Enhancement [16.608577295968942]
This review paper introduces and synthesizes the emerging interdisciplinary field of LLM Psychometrics. Psychometrics quantifies intangible aspects of human psychology, such as personality, values, and intelligence. Ultimately, the review provides actionable insights for developing future evaluation paradigms that align with human-level AI.
arXiv Detail & Related papers (2025-05-13T05:47:51Z)
Measuring Mental Health Variables in Computational Research: Toward Validated, Dimensional, and Transdiagnostic Approaches [6.796386356785538]
Computational mental health research develops models to predict and understand psychological phenomena, but often relies on inappropriate measures of psychopathology constructs. We identify three key issues: (1) reliance on unvalidated measures over validated ones; (2) treating mental health constructs as categorical rather than dimensional; and (3) focusing on disorder-specific constructs instead of transdiagnostic ones.
arXiv Detail & Related papers (2025-04-04T21:11:41Z)
Measuring the Mental Health of Content Reviewers, a Systematic Review [50.06646946044604]
Many workers report long-term, potentially irreversible psychological harm. This work is similar to activities that cause psychological harm to other kinds of helping professionals even after small doses of exposure. This systematic review summarizes psychological measures from other professions and relates them to the experiences of content reviewers.
arXiv Detail & Related papers (2025-02-01T00:50:15Z)
Are LLMs effective psychological assessors? Leveraging adaptive RAG for interpretable mental health screening through psychometric practice [2.9775344067885974]
We propose a novel adaptive Retrieval-Augmented Generation (RAG) approach that completes psychological questionnaires by analyzing social media posts. Our method retrieves the most relevant user posts for each question in a psychological survey and uses Large Language Models (LLMs) to predict questionnaire scores in a zero-shot setting.
arXiv Detail & Related papers (2025-01-02T00:01:54Z)
Assessment and manipulation of latent constructs in pre-trained language models using psychometric scales [4.805861461250903]
We show how standard psychological questionnaires can be reformulated into natural language inference prompts. We demonstrate, using a sample of 88 publicly available models, the existence of human-like mental health-related constructs.
arXiv Detail & Related papers (2024-09-29T11:00:41Z)
Quantifying AI Psychology: A Psychometrics Benchmark for Large Language Models [57.518784855080334]
Large Language Models (LLMs) have demonstrated exceptional task-solving capabilities, increasingly adopting roles akin to human-like assistants. This paper presents a framework for investigating psychological dimensions in LLMs, including psychological identification, assessment dataset curation, and assessment with results validation. We introduce a comprehensive psychometrics benchmark for LLMs that covers six psychological dimensions: personality, values, emotion, theory of mind, motivation, and intelligence.
arXiv Detail & Related papers (2024-06-25T16:09:08Z)
PsychoGAT: A Novel Psychological Measurement Paradigm through Interactive Fiction Games with LLM Agents [68.50571379012621]
Psychological measurement is essential for mental health, self-understanding, and personal development. PsychoGAT (Psychological Game AgenTs) achieves statistically significant excellence in psychometric metrics such as reliability, convergent validity, and discriminant validity.
arXiv Detail & Related papers (2024-02-19T18:00:30Z)
Precision psychiatry: predicting predictability [0.0]
I review ten challenges in the field of precision psychiatry. These include the need for studies on real-world populations and realistic clinical outcome definitions, and the need to consider treatment-related factors such as placebo effects and non-adherence to prescriptions.
arXiv Detail & Related papers (2023-06-21T13:10:46Z)
From Static Benchmarks to Adaptive Testing: Psychometrics in AI Evaluation [60.14902811624433]
We discuss a paradigm shift from static evaluation methods to adaptive testing. This involves estimating the characteristics and value of each test item in the benchmark and dynamically adjusting items in real time. We analyze the current approaches, advantages, and underlying reasons for adopting psychometrics in AI evaluation.
arXiv Detail & Related papers (2023-06-18T09:54:33Z)
Concepts and Experiments on Psychoanalysis Driven Computing [0.0]
This research investigates the effective incorporation of the human factor and user perception in text-based interactive media. We use the notion of Lacanian discourse types to capture and deeply understand real characteristics, qualities and contents of texts. This is the first time computational methods are systematically combined with psychoanalysis.
arXiv Detail & Related papers (2022-09-29T19:27:22Z)
Interpretability by design using computer vision for behavioral sensing in child and adolescent psychiatry [3.975358343371988]
We use machine learning to derive behavioral codes or concepts of a gold standard behavioral rating system. Our ratings were comparable to human expert ratings for negative emotions, activity-level/arousal and anxiety.
arXiv Detail & Related papers (2022-07-11T09:07:08Z)
Evaluating and Inducing Personality in Pre-trained Language Models [78.19379997967191]
We draw inspiration from psychometric studies by leveraging human personality theory as a tool for studying machine behaviors. We introduce the Machine Personality Inventory (MPI), which follows standardized personality tests and is built upon the Big Five Personality Factors (Big Five) theory and personality assessment inventories. We devise a Personality Prompting (P2) method to induce specific personalities in LLMs in a controllable way.
arXiv Detail & Related papers (2022-05-20T07:32:57Z)
AGENT: A Benchmark for Core Psychological Reasoning [60.35621718321559]
Intuitive psychology is the ability to reason about hidden mental variables that drive observable actions. Despite recent interest in machine agents that reason about other agents, it is not clear if such agents learn or hold the core psychology principles that drive human reasoning. We present a benchmark consisting of procedurally generated 3D animations, AGENT, structured around four scenarios.
arXiv Detail & Related papers (2021-02-24T14:58:23Z)
Opportunities of a Machine Learning-based Decision Support System for Stroke Rehabilitation Assessment [64.52563354823711]
Rehabilitation assessment is critical to determine an adequate intervention for a patient. Current assessment practices rely mainly on the therapist's experience, and assessment is performed infrequently because of the limited availability of therapists. We developed an intelligent decision support system that can identify salient features of assessment using reinforcement learning.
arXiv Detail & Related papers (2020-02-27T17:04:07Z)

This list is automatically generated from the titles and abstracts of the papers on this site.