PersonaX: Multimodal Datasets with LLM-Inferred Behavior Traits
- URL: http://arxiv.org/abs/2509.11362v1
- Date: Sun, 14 Sep 2025 17:30:03 GMT
- Title: PersonaX: Multimodal Datasets with LLM-Inferred Behavior Traits
- Authors: Loka Li, Wong Yu Kang, Minghao Fu, Guangyi Chen, Zhenhao Chen, Gongxu Luo, Yuewen Sun, Salman Khan, Peter Spirtes, Kun Zhang
- Abstract summary: We present PersonaX, a curated collection of multimodal datasets designed to enable comprehensive analysis of public traits across modalities. PersonaX consists of (1) CelebPersona, featuring 9444 public figures from diverse occupations, and (2) AthlePersona, covering 4181 professional athletes across 7 major sports leagues. Each dataset includes behavioral trait assessments inferred by three high-performing large language models, alongside facial imagery and structured biographical features.
- Score: 30.425825274563536
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Understanding human behavior traits is central to applications in human-computer interaction, computational social science, and personalized AI systems. Such understanding often requires integrating multiple modalities to capture nuanced patterns and relationships. However, existing resources rarely provide datasets that combine behavioral descriptors with complementary modalities such as facial attributes and biographical information. To address this gap, we present PersonaX, a curated collection of multimodal datasets designed to enable comprehensive analysis of public traits across modalities. PersonaX consists of (1) CelebPersona, featuring 9444 public figures from diverse occupations, and (2) AthlePersona, covering 4181 professional athletes across 7 major sports leagues. Each dataset includes behavioral trait assessments inferred by three high-performing large language models, alongside facial imagery and structured biographical features. We analyze PersonaX at two complementary levels. First, we abstract high-level trait scores from text descriptions and apply five statistical independence tests to examine their relationships with other modalities. Second, we introduce a novel causal representation learning (CRL) framework tailored to multimodal and multi-measurement data, providing theoretical identifiability guarantees. Experiments on both synthetic and real-world data demonstrate the effectiveness of our approach. By unifying structured and unstructured analysis, PersonaX establishes a foundation for studying LLM-inferred behavioral traits in conjunction with visual and biographical attributes, advancing multimodal trait analysis and causal reasoning.
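The abstract's first analysis level abstracts trait scores from text and applies five statistical independence tests against other modalities. The specific five tests are not named in the abstract; as a hedged illustration only, the sketch below runs two common independence tests (Spearman rank correlation, and a chi-square test on quartile-binned variables) between a synthetic stand-in for an LLM-inferred trait score and a synthetic biographical covariate. All variable names and data here are invented for demonstration and are not from the PersonaX datasets.

```python
import numpy as np
from scipy.stats import spearmanr, chi2_contingency

rng = np.random.default_rng(0)

# Synthetic stand-ins: an LLM-inferred trait score per person and a
# biographical covariate (e.g., years of professional experience).
n = 500
experience = rng.uniform(0, 20, size=n)
trait_score = 0.05 * experience + rng.normal(0, 1, size=n)  # weak dependence

# Test 1: Spearman rank correlation (continuous vs. continuous).
rho, p_spearman = spearmanr(trait_score, experience)

# Test 2: chi-square test of independence on quartile-binned variables.
score_bins = np.digitize(trait_score, np.quantile(trait_score, [0.25, 0.5, 0.75]))
exp_bins = np.digitize(experience, np.quantile(experience, [0.25, 0.5, 0.75]))
table = np.zeros((4, 4))
for s, e in zip(score_bins, exp_bins):
    table[s, e] += 1
chi2, p_chi2, dof, _ = chi2_contingency(table)

print(f"Spearman rho={rho:.3f}, p={p_spearman:.4f}")
print(f"Chi-square={chi2:.2f}, dof={dof}, p={p_chi2:.4f}")
```

A small p-value in either test would reject the null hypothesis that the trait score is independent of the covariate; with real data, multiple-testing correction across trait-modality pairs would also be needed.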
Related papers
- Enhancing Personality Recognition by Comparing the Predictive Power of Traits, Facets, and Nuances [37.83859643892549]
Personality recognition models aim to infer personality traits from different sources of behavioral data. We trained a transformer-based model including cross-modal (audiovisual) and cross-subject (dyad-aware) attention mechanisms. Results show that nuance-level models consistently outperform facet and trait-level models, reducing mean squared error by up to 74% across interaction scenarios.
arXiv Detail & Related papers (2026-02-05T13:35:04Z) - HumanLLM: Towards Personalized Understanding and Simulation of Human Nature [72.55730315685837]
HumanLLM is a foundation model designed for personalized understanding and simulation of individuals. We first construct the Cognitive Genome, a large-scale corpus curated from real-world user data on platforms like Reddit, Twitter, Blogger, and Amazon. We then formulate diverse learning tasks and perform supervised fine-tuning to empower the model to predict a wide range of individualized human behaviors, thoughts, and experiences.
arXiv Detail & Related papers (2026-01-22T09:27:27Z) - TwinVoice: A Multi-dimensional Benchmark Towards Digital Twins via LLM Persona Simulation [55.55404595177229]
Large Language Models (LLMs) are exhibiting emergent human-like abilities. TwinVoice is a benchmark for assessing persona simulation across diverse real-world contexts.
arXiv Detail & Related papers (2025-10-29T14:00:42Z) - Combo-Gait: Unified Transformer Framework for Multi-Modal Gait Recognition and Attribute Analysis [35.9240903956677]
We propose a framework that combines 2D temporal silhouettes with 3D SMPL features for robust gait analysis. Beyond identification, we introduce a multitask learning strategy that jointly performs gait recognition and human attribute estimation. Our approach outperforms state-of-the-art methods in gait recognition and provides accurate human attribute estimation.
arXiv Detail & Related papers (2025-10-12T02:56:40Z) - PersonaTwin: A Multi-Tier Prompt Conditioning Framework for Generating and Evaluating Personalized Digital Twins [20.77710199900999]
We introduce PersonaTwin, a multi-tier prompt conditioning framework that builds adaptive digital twins. Using a comprehensive dataset of more than 8,500 individuals in the healthcare context, we benchmark PersonaTwin against standard LLM outputs. Experimental results show that our framework achieves strong simulation fidelity.
arXiv Detail & Related papers (2025-07-30T04:57:30Z) - BookWorm: A Dataset for Character Description and Analysis [59.186325346763184]
We define two tasks: character description, which generates a brief factual profile, and character analysis, which offers an in-depth interpretation.
We introduce the BookWorm dataset, pairing books from the Gutenberg Project with human-written descriptions and analyses.
Our findings show that retrieval-based approaches outperform hierarchical ones in both tasks.
arXiv Detail & Related papers (2024-10-14T10:55:58Z) - MMSci: A Dataset for Graduate-Level Multi-Discipline Multimodal Scientific Understanding [59.41495657570397]
We present a comprehensive dataset compiled from Nature Communications articles covering 72 scientific fields. We evaluated 19 proprietary and open-source models on two benchmark tasks, figure captioning and multiple-choice question answering, and conducted human expert annotation. Fine-tuning Qwen2-VL-7B with our task-specific data achieved better performance than GPT-4o and even human experts in multiple-choice evaluations.
arXiv Detail & Related papers (2024-07-06T00:40:53Z) - From Persona to Personalization: A Survey on Role-Playing Language Agents [52.783043059715546]
Recent advancements in large language models (LLMs) have boosted the rise of Role-Playing Language Agents (RPLAs).
RPLAs achieve a remarkable sense of human likeness and vivid role-playing performance.
They have catalyzed numerous AI applications, such as emotional companions, interactive video games, personalized assistants and copilots.
arXiv Detail & Related papers (2024-04-28T15:56:41Z) - Personality-aware Human-centric Multimodal Reasoning: A New Task, Dataset and Baselines [32.82738983843281]
We introduce a new task called Personality-aware Human-centric Multimodal Reasoning (PHMR).
The goal of the task is to forecast the future behavior of a particular individual using multimodal information from past instances, while integrating personality factors.
The experimental results demonstrate that incorporating personality traits enhances human-centric multimodal reasoning performance.
arXiv Detail & Related papers (2023-04-05T09:09:10Z) - Domain-specific Learning of Multi-scale Facial Dynamics for Apparent Personality Traits Prediction [3.19935268158731]
We propose a novel video-based automatic personality traits recognition approach.
It consists of: (1) a domain-specific facial behavior modelling module that extracts personality-related multi-scale short-term human facial behavior features; (2) a long-term behavior modelling module that summarizes all short-term features of a video as a long-term/video-level personality representation; and (3) a multi-task personality traits prediction module that models the underlying relationships among all traits and jointly predicts them based on the video-level personality representation.
arXiv Detail & Related papers (2022-09-09T07:08:55Z) - Vyaktitv: A Multimodal Peer-to-Peer Hindi Conversations based Dataset for Personality Assessment [50.15466026089435]
We present a novel peer-to-peer Hindi conversation dataset, Vyaktitv.
It consists of high-quality audio and video recordings of the participants, with Hinglish textual transcriptions for each conversation.
The dataset also contains a rich set of socio-demographic features, such as income and cultural orientation, among several others, for all the participants.
arXiv Detail & Related papers (2020-08-31T17:44:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.