Neural network embeddings recover value dimensions from psychometric survey items on par with human data
- URL: http://arxiv.org/abs/2509.24906v1
- Date: Mon, 29 Sep 2025 15:14:54 GMT
- Title: Neural network embeddings recover value dimensions from psychometric survey items on par with human data
- Authors: Max Pellert, Clemens M. Lechner, Indira Sen, Markus Strohmaier
- Abstract summary: "Survey and Questionnaire Item Embeddings Differentials" (SQuID) is a novel methodological approach that enables neural network embeddings to recover latent dimensions from psychometric survey items. We demonstrate that embeddings derived from large language models, when processed with SQuID, can recover the structure of human values obtained from human rater judgments.
- Score: 7.5591367381052175
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This study introduces "Survey and Questionnaire Item Embeddings Differentials" (SQuID), a novel methodological approach that enables neural network embeddings to effectively recover latent dimensions from psychometric survey items. We demonstrate that embeddings derived from large language models, when processed with SQuID, can recover the structure of human values obtained from human rater judgments on the Revised Portrait Value Questionnaire (PVQ-RR). Our experimental validation compares multiple embedding models across a number of evaluation metrics. Unlike previous approaches, SQuID successfully addresses the challenge of obtaining negative correlations between dimensions without requiring domain-specific fine-tuning. Quantitative analysis reveals that our embedding-based approach explains 55% of variance in dimension-dimension similarities compared to human data. Multidimensional scaling configurations from both types of data show fair factor congruence coefficients and largely follow the underlying theory. These results demonstrate that semantic embeddings can effectively replicate psychometric structures previously established through extensive human surveys. The approach offers substantial advantages in cost, scalability and flexibility while maintaining comparable quality to traditional methods. Our findings have significant implications for psychometrics and social science research, providing a complementary methodology that could expand the scope of human behavior and experience represented in measurement tools.
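The abstract names two kinds of evaluation: variance explained in dimension-dimension similarities, and factor congruence between multidimensional scaling configurations. The following is a minimal sketch of how such metrics are commonly computed, not the authors' implementation. It assumes that "variance explained" is the squared Pearson correlation over the off-diagonal entries of two similarity matrices and that congruence is Tucker's coefficient. The function names and the toy matrices are hypothetical and are not taken from the paper.

```python
import numpy as np


def tucker_congruence(x, y):
    """Tucker's congruence coefficient between two coordinate vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sum(x * y) / np.sqrt(np.sum(x ** 2) * np.sum(y ** 2)))


def variance_explained(sim_a, sim_b):
    """Squared Pearson correlation over the off-diagonal (upper-triangle)
    entries of two symmetric dimension-dimension similarity matrices."""
    sim_a = np.asarray(sim_a, dtype=float)
    sim_b = np.asarray(sim_b, dtype=float)
    upper = np.triu_indices_from(sim_a, k=1)  # skip the diagonal of ones
    r = np.corrcoef(sim_a[upper], sim_b[upper])[0, 1]
    return float(r ** 2)


# Toy, made-up similarity matrices for three value dimensions
# (illustrative only; these are NOT results from the paper).
human_sim = np.array([[1.0, 0.5, -0.3],
                      [0.5, 1.0, 0.1],
                      [-0.3, 0.1, 1.0]])
embedding_sim = np.array([[1.0, 0.4, -0.2],
                          [0.4, 1.0, 0.2],
                          [-0.2, 0.2, 1.0]])

print("variance explained:", variance_explained(human_sim, embedding_sim))
print("congruence:", tucker_congruence(human_sim[0], embedding_sim[0]))
```

In practice the congruence coefficient would be applied to matched axes of the two multidimensional scaling configurations after aligning them (for example with a Procrustes rotation), which this sketch leaves out.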
Related papers
- Investigating the Impact of Histopathological Foundation Models on Regressive Prediction of Homologous Recombination Deficiency [52.50039435394964]
We systematically evaluate foundation models for regression-based tasks. We extract patch-level features from whole slide images (WSI) using five state-of-the-art foundation models. Models are trained to predict continuous HRD scores based on these extracted features across breast, endometrial, and lung cancer cohorts.
arXiv Detail & Related papers (2026-01-29T14:06:50Z) - On metric choice in dimension reduction for Fréchet regression [7.161207910629032]
Fréchet regression is becoming a mainstay in modern data analysis for analyzing non-traditional data types.
It is especially useful in the analysis of complex health data such as continuous monitoring and imaging data.
arXiv Detail & Related papers (2024-10-02T17:39:34Z) - Latent Variable Sequence Identification for Cognitive Models with Neural Network Estimators [7.7227297059345466]
We present an approach that extends neural Bayes estimation to learn a direct mapping between experimental data and the targeted latent variable space. Our work underscores that combining recurrent neural networks and simulation-based inference to identify latent variable sequences can enable researchers to access a wider class of cognitive models.
arXiv Detail & Related papers (2024-06-20T21:13:39Z) - Seeing Unseen: Discover Novel Biomedical Concepts via Geometry-Constrained Probabilistic Modeling [53.7117640028211]
We present a geometry-constrained probabilistic modeling treatment to resolve the identified issues.
We incorporate a suite of critical geometric properties to impose proper constraints on the layout of constructed embedding space.
A spectral graph-theoretic method is devised to estimate the number of potential novel classes.
arXiv Detail & Related papers (2024-03-02T00:56:05Z) - Decoding Susceptibility: Modeling Misbelief to Misinformation Through a Computational Approach [61.04606493712002]
Susceptibility to misinformation describes the degree of belief in unverifiable claims that is not observable.
Existing susceptibility studies heavily rely on self-reported beliefs.
We propose a computational approach to model users' latent susceptibility levels.
arXiv Detail & Related papers (2023-11-16T07:22:56Z) - Simulation-based Inference for Cardiovascular Models [43.55219268578912]
We use simulation-based inference to solve the inverse problem of mapping waveforms back to plausible physiological parameters. We perform an in-silico uncertainty analysis of five biomarkers of clinical interest. We study the gap between in-vivo and in-silico with the MIMIC-III waveform database.
arXiv Detail & Related papers (2023-07-26T02:34:57Z) - Pain level and pain-related behaviour classification using GRU-based sparsely-connected RNNs [61.080598804629375]
People with chronic pain unconsciously adapt specific body movements to protect themselves from injury or additional pain.
Because there is no dedicated benchmark database to analyse this correlation, we considered one of the specific circumstances that potentially influence a person's biometrics during daily activities.
We proposed a sparsely-connected recurrent neural networks (s-RNNs) ensemble with the gated recurrent unit (GRU) that incorporates multiple autoencoders.
We conducted several experiments which indicate that the proposed method outperforms the state-of-the-art approaches in classifying both pain level and pain-related behaviour.
arXiv Detail & Related papers (2022-12-20T12:56:28Z) - Causal Inference via Nonlinear Variable Decorrelation for Healthcare Applications [60.26261850082012]
We introduce a novel method with a variable decorrelation regularizer to handle both linear and nonlinear confounding.
We employ association rules as new representations using association rule mining based on the original features to increase model interpretability.
arXiv Detail & Related papers (2022-09-29T17:44:14Z) - Mixed Effects Neural ODE: A Variational Approximation for Analyzing the Dynamics of Panel Data [50.23363975709122]
We propose a probabilistic model called ME-NODE to incorporate (fixed + random) mixed effects for analyzing panel data.
We show that our model can be derived using smooth approximations of SDEs provided by the Wong-Zakai theorem.
We then derive Evidence Based Lower Bounds for ME-NODE, and develop (efficient) training algorithms.
arXiv Detail & Related papers (2022-02-18T22:41:51Z) - Improving Prediction of Cognitive Performance using Deep Neural Networks in Sparse Data [2.867517731896504]
We used data from an observational, cohort study, Midlife in the United States (MIDUS) to model executive function and episodic memory measures.
Deep neural network (DNN) models consistently ranked highest in all of the cognitive performance prediction tasks.
arXiv Detail & Related papers (2021-12-28T22:23:08Z) - Robust High-Dimensional Regression with Coefficient Thresholding and its Application to Imaging Data Analysis [7.640041402805495]
It is of importance to develop statistical techniques to analyze high-dimensional data in the presence of both complex dependence and possible outliers in real-world imaging data.
arXiv Detail & Related papers (2021-09-30T05:29:54Z) - DeepCOVIDNet: An Interpretable Deep Learning Model for Predictive Surveillance of COVID-19 Using Heterogeneous Features and their Interactions [2.30238915794052]
We propose a deep learning model to forecast the range of increase in COVID-19 infected cases in future days.
Using data collected from various sources, we estimate the range of increase in infected cases seven days into the future for all U.S. counties.
arXiv Detail & Related papers (2020-07-31T23:37:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.