Twin-2K-500: A dataset for building digital twins of over 2,000 people based on their answers to over 500 questions
- URL: http://arxiv.org/abs/2505.17479v1
- Date: Fri, 23 May 2025 05:05:11 GMT
- Title: Twin-2K-500: A dataset for building digital twins of over 2,000 people based on their answers to over 500 questions
- Authors: Olivier Toubia, George Z. Gui, Tianyi Peng, Daniel J. Merlau, Ang Li, Haozhe Chen,
- Abstract summary: LLM-based digital twin simulation holds great promise for research in AI, social science, and digital experimentation. We survey a representative sample of $N = 2,058$ participants (average 2.42 hours per person) in the US across four waves with 500 questions in total. Initial analyses suggest the data are of high quality and show promise for constructing digital twins that predict human behavior well at the individual and aggregate levels.
- Score: 11.751234495886674
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: LLM-based digital twin simulation, where large language models are used to emulate individual human behavior, holds great promise for research in AI, social science, and digital experimentation. However, progress in this area has been hindered by the scarcity of real, individual-level datasets that are both large and publicly available. This lack of high-quality ground truth limits both the development and validation of digital twin methodologies. To address this gap, we introduce a large-scale, public dataset designed to capture a rich and holistic view of individual human behavior. We survey a representative sample of $N = 2,058$ participants (average 2.42 hours per person) in the US across four waves with 500 questions in total, covering a comprehensive battery of demographic, psychological, economic, personality, and cognitive measures, as well as replications of behavioral economics experiments and a pricing survey. The final wave repeats tasks from earlier waves to establish a test-retest accuracy baseline. Initial analyses suggest the data are of high quality and show promise for constructing digital twins that predict human behavior well at the individual and aggregate levels. By making the full dataset publicly available, we aim to establish a valuable testbed for the development and benchmarking of LLM-based persona simulations. Beyond LLM applications, due to its unique breadth and scale the dataset also enables broad social science research, including studies of cross-construct correlations and heterogeneous treatment effects.
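The abstract's test-retest design suggests a natural way to score a digital twin: compare the twin's predicted answers against a participant's original responses, then normalize by how consistently that participant reproduces their own answers in the final wave. The sketch below illustrates this idea; the response encoding and variable names are illustrative assumptions, not the dataset's actual schema or the authors' evaluation code.

```python
# Hypothetical sketch: scoring a digital twin against a test-retest baseline.
# Data layout (lists of categorical answers per question) is an assumption.

def accuracy(pred, truth):
    """Fraction of questions on which two response vectors agree exactly."""
    assert len(pred) == len(truth)
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)

def normalized_accuracy(twin_answers, wave1_answers, wave4_answers):
    """Twin accuracy divided by the participant's own test-retest consistency.

    A value near 1.0 means the twin predicts the wave-1 answers about as
    well as the participant reproduces them when re-asked in wave 4.
    """
    raw = accuracy(twin_answers, wave1_answers)
    retest = accuracy(wave4_answers, wave1_answers)
    return raw / retest if retest > 0 else float("nan")

# Toy example with responses on a 5-point scale
wave1 = [3, 5, 2, 4, 1, 3, 5, 2]
wave4 = [3, 5, 2, 4, 2, 3, 5, 2]   # one answer drifts on retest
twin  = [3, 5, 2, 3, 1, 3, 4, 2]   # twin misses two questions

print(round(normalized_accuracy(twin, wave1, wave4), 3))
```

Normalizing by test-retest consistency matters because even the participants themselves do not answer identically across waves, so raw agreement understates a twin's relative quality.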
Related papers
- Pay What LLM Wants: Can LLM Simulate Economics Experiment with 522 Real-human Persona? [1.931250555574267]
We evaluate Large Language Models' ability to predict individual economic decision-making using Pay-What-You-Want pricing experiments with 522 real human personas. Results reveal that while LLMs struggle with precise individual-level predictions, they demonstrate reasonable group-level behavioral tendencies.
arXiv Detail & Related papers (2025-08-05T09:37:37Z) - LLM Generated Persona is a Promise with a Catch [18.45442859688198]
Persona-based simulations hold promise for transforming disciplines that rely on population-level feedback. However, traditional methods for collecting realistic persona data are prohibitively expensive and logistically challenging due to privacy constraints.
arXiv Detail & Related papers (2025-03-18T03:11:27Z) - How Far are LLMs from Being Our Digital Twins? A Benchmark for Persona-Based Behavior Chain Simulation [30.713599131902566]
We introduce BehaviorChain, the first benchmark for evaluating digital twins' ability to simulate continuous human behavior. BehaviorChain comprises diverse, high-quality, persona-based behavior chains, totaling 15,846 distinct behaviors across 1,001 unique personas. Comprehensive evaluation results demonstrate that even state-of-the-art models struggle to accurately simulate continuous human behavior.
arXiv Detail & Related papers (2025-02-20T15:29:32Z) - From Individual to Society: A Survey on Social Simulation Driven by Large Language Model-based Agents [47.935533238820334]
Traditional sociological research often relies on human participation, which, though effective, is expensive, difficult to scale, and raises ethical concerns. Recent advancements in large language models (LLMs) highlight their potential to simulate human behavior, enabling the replication of individual responses and facilitating many interdisciplinary studies. We categorize the simulations into three types: (1) Individual Simulation, which mimics specific individuals or demographic groups; (2) Scenario Simulation, where multiple agents collaborate to achieve goals within specific contexts; and (3) Simulation Society, which models interactions within agent societies to reflect the complexity and variety of real-world dynamics.
arXiv Detail & Related papers (2024-12-04T18:56:37Z) - Generative Agent Simulations of 1,000 People [56.82159813294894]
We present a novel agent architecture that simulates the attitudes and behaviors of 1,052 real individuals.
The generative agents replicate participants' responses on the General Social Survey 85% as accurately as participants replicate their own answers.
Our architecture reduces accuracy biases across racial and ideological groups compared to agents given demographic descriptions.
arXiv Detail & Related papers (2024-11-15T11:14:34Z) - Synthesizing Post-Training Data for LLMs through Multi-Agent Simulation [51.20656279478878]
MATRIX is a multi-agent simulator that automatically generates diverse text-based scenarios. We introduce MATRIX-Gen for controllable and highly realistic data synthesis. On the AlpacaEval 2 and Arena-Hard benchmarks, Llama-3-8B-Base, post-trained on datasets synthesized by MATRIX-Gen with just 20K instruction-response pairs, outperforms Meta's Llama-3-8B-Instruct model.
arXiv Detail & Related papers (2024-10-18T08:01:39Z) - Agentic Society: Merging skeleton from real world and texture from Large Language Model [4.740886789811429]
This paper explores a novel framework that leverages census data and large language models to generate virtual populations.
We show that our method produces personas with variability essential for simulating diverse human behaviors in social science experiments.
However, the evaluation results show only weak signs of statistical truthfulness, owing to the limited capability of current LLMs.
arXiv Detail & Related papers (2024-09-02T08:28:19Z) - Data Augmentation in Human-Centric Vision [54.97327269866757]
This survey presents a comprehensive analysis of data augmentation techniques in human-centric vision tasks.
It delves into a wide range of research areas including person ReID, human parsing, human pose estimation, and pedestrian detection.
Our work categorizes data augmentation methods into two main types: data generation and data perturbation.
arXiv Detail & Related papers (2024-03-13T16:05:18Z) - DACO: Towards Application-Driven and Comprehensive Data Analysis via Code Generation [83.30006900263744]
Data analysis is a crucial analytical process to generate in-depth studies and conclusive insights.
We propose to automatically generate high-quality answer annotations leveraging the code-generation capabilities of LLMs.
Our DACO-RL algorithm is evaluated by human annotators to produce more helpful answers than SFT model in 57.72% cases.
arXiv Detail & Related papers (2024-03-04T22:47:58Z) - FiFAR: A Fraud Detection Dataset for Learning to Defer [9.187694794359498]
We introduce the Financial Fraud Alert Review dataset (FiFAR), a synthetic bank account fraud detection dataset.
FiFAR contains the predictions of a team of 50 highly complex and varied synthetic fraud analysts, with varied bias and feature dependence.
We use our dataset to develop a capacity-aware L2D method and rejection learning approach under realistic data availability conditions.
arXiv Detail & Related papers (2023-12-20T17:36:36Z) - SynBody: Synthetic Dataset with Layered Human Models for 3D Human Perception and Modeling [93.60731530276911]
We introduce a new synthetic dataset, SynBody, with three appealing features.
The dataset comprises 1.2M images with corresponding accurate 3D annotations, covering 10,000 human body models, 1,187 actions, and various viewpoints.
arXiv Detail & Related papers (2023-03-30T13:30:12Z) - PeopleSansPeople: A Synthetic Data Generator for Human-Centric Computer Vision [3.5694949627557846]
We release a human-centric synthetic data generator PeopleSansPeople.
It contains simulation-ready 3D human assets, a parameterized lighting and camera system, and generates 2D and 3D bounding box, instance and semantic segmentation, and COCO pose labels.
arXiv Detail & Related papers (2021-12-17T02:33:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.