Human Choice Prediction in Language-based Persuasion Games:
Simulation-based Off-Policy Evaluation
- URL: http://arxiv.org/abs/2305.10361v4
- Date: Wed, 28 Feb 2024 21:36:54 GMT
- Title: Human Choice Prediction in Language-based Persuasion Games:
Simulation-based Off-Policy Evaluation
- Authors: Eilam Shapira, Reut Apel, Moshe Tennenholtz, Roi Reichart
- Abstract summary: This paper addresses a key aspect in the design of such agents: predicting human decisions in off-policy evaluation.
We collected a dataset of 87K decisions from humans playing a repeated decision-making game with artificial agents.
Our approach involves training a model on human interactions with one subset of agents to predict decisions when interacting with another.
- Score: 24.05034588588407
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in Large Language Models (LLMs) have spurred interest in
designing LLM-based agents for tasks that involve interaction with human and
artificial agents. This paper addresses a key aspect in the design of such
agents: predicting human decisions in off-policy evaluation (OPE), focusing on
language-based persuasion games, where the agent's goal is to influence its
partner's decisions through verbal messages. Using a dedicated application, we
collected a dataset of 87K decisions from humans playing a repeated
decision-making game with artificial agents. Our approach involves training a
model on human interactions with one subset of agents to predict decisions when
interacting with another. To enhance off-policy performance, we propose a
simulation technique involving interactions across the entire agent space and
simulated decision makers. Our learning strategy yields significant OPE gains,
e.g., improving prediction accuracy in the 15% most challenging cases by 7.1%.
Our code and the large dataset we collected and generated are submitted as
supplementary material and publicly available in our GitHub repository:
https://github.com/eilamshapira/HumanChoicePrediction
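The cross-agent evaluation protocol described in the abstract (train a decision predictor on interactions with one subset of agents, then evaluate on decisions made against unseen agents) can be sketched as below. This is a minimal illustration only: the agent names, the scalar message feature, and the threshold predictor are hypothetical stand-ins, not the authors' actual pipeline or data.

```python
import random

random.seed(0)

# Hypothetical interaction log: (agent_id, message_feature, human_decision).
# A scalar feature stands in for the paper's verbal messages, and a noisy
# threshold rule stands in for the human decision maker.
AGENTS = ["A", "B", "C", "D"]

def simulated_decision(feature):
    # Noisy stand-in for a human choice: accept when the message
    # feature (plus Gaussian noise) exceeds 0.5.
    return int(feature + random.gauss(0, 0.1) > 0.5)

data = [(agent, f, simulated_decision(f))
        for agent in AGENTS
        for f in (random.random() for _ in range(200))]

# Off-policy split: fit on interactions with agents {A, B},
# evaluate on decisions made against the unseen agents {C, D}.
train = [(f, y) for a, f, y in data if a in ("A", "B")]
test = [(f, y) for a, f, y in data if a in ("C", "D")]

def error(threshold, rows):
    # Fraction of rows where the thresholded prediction disagrees
    # with the recorded human decision.
    return sum((f > threshold) != bool(y) for f, y in rows) / len(rows)

# Minimal predictor: a single threshold chosen to minimize training error.
best_t = min((f for f, _ in train), key=lambda t: error(t, train))
accuracy = 1 - error(best_t, test)
print(f"off-policy accuracy on unseen agents: {accuracy:.2f}")
```

The point of the sketch is the split itself: the test set contains only decisions made against agents the model never saw during training, which is what makes the evaluation off-policy.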
Related papers
- Simulating User Agents for Embodied Conversational-AI [9.402740034754455]
We build a large language model (LLM)-based user agent that can simulate user behavior during interactions with an embodied agent.
We evaluate our user agent's ability to generate human-like behaviors by comparing its simulated dialogues with the TEACh dataset.
arXiv Detail & Related papers (2024-10-31T00:56:08Z)
- Proactive Agent: Shifting LLM Agents from Reactive Responses to Active Assistance [95.03771007780976]
We tackle the challenge of developing proactive agents capable of anticipating and initiating tasks without explicit human instructions.
First, we collect real-world human activities to generate proactive task predictions.
These predictions are labeled by human annotators as either accepted or rejected.
The labeled data is used to train a reward model that simulates human judgment.
arXiv Detail & Related papers (2024-10-16T08:24:09Z)
- IDAT: A Multi-Modal Dataset and Toolkit for Building and Evaluating Interactive Task-Solving Agents [20.460482488872145]
This paper addresses the challenges of developing interactive agents capable of understanding and executing grounded natural language instructions.
We introduce a scalable data collection tool for gathering interactive grounded language instructions within a Minecraft-like environment.
We present a Human-in-the-Loop interactive evaluation platform for qualitative analysis and comparison of agent performance.
arXiv Detail & Related papers (2024-07-12T00:07:43Z)
- ADESSE: Advice Explanations in Complex Repeated Decision-Making Environments [14.105935964906976]
This work considers a problem setup where an intelligent agent provides advice to a human decision-maker.
We develop an approach named ADESSE to generate explanations about the adviser agent to improve human trust and decision-making.
arXiv Detail & Related papers (2024-05-31T08:59:20Z)
- Modeling Boundedly Rational Agents with Latent Inference Budgets [56.24971011281947]
We introduce a latent inference budget model (L-IBM) that models agents' computational constraints explicitly.
L-IBMs make it possible to learn agent models using data from diverse populations of suboptimal actors.
We show that L-IBMs match or outperform Boltzmann models of decision-making under uncertainty.
arXiv Detail & Related papers (2023-12-07T03:55:51Z)
- AgentCF: Collaborative Learning with Autonomous Language Agents for Recommender Systems [112.76941157194544]
We propose AgentCF for simulating user-item interactions in recommender systems through agent-based collaborative filtering.
We consider not only users but also items as agents, and develop a collaborative learning approach that optimizes both kinds of agents together.
Overall, the optimized agents exhibit diverse interaction behaviors within our framework, including user-item, user-user, item-item, and collective interactions.
arXiv Detail & Related papers (2023-10-13T16:37:14Z)
- Development of a Trust-Aware User Simulator for Statistical Proactive Dialog Modeling in Human-AI Teams [4.384546153204966]
The concept of a Human-AI team has gained increasing attention in recent years.
For effective collaboration between humans and AI teammates, proactivity is crucial for close coordination and effective communication.
We present the development of a corpus-based user simulator for training and testing proactive dialog policies.
arXiv Detail & Related papers (2023-04-24T08:42:51Z)
- JECC: Commonsense Reasoning Tasks Derived from Interactive Fictions [75.42526766746515]
We propose a new commonsense reasoning dataset based on human Interactive Fiction (IF) gameplay walkthroughs.
Our dataset focuses on the assessment of functional commonsense knowledge rules rather than factual knowledge.
Experiments show that the introduced dataset is challenging for previous machine reading models as well as for new large language models.
arXiv Detail & Related papers (2022-10-18T19:20:53Z)
- Investigations of Performance and Bias in Human-AI Teamwork in Hiring [30.046502708053097]
In AI-assisted decision-making, effective hybrid (human-AI) teamwork does not depend on AI performance alone.
We investigate how both a model's predictive performance and bias may transfer to humans in a recommendation-aided decision task.
arXiv Detail & Related papers (2022-02-21T17:58:07Z)
- Model-based Reinforcement Learning for Decentralized Multiagent Rendezvous [66.6895109554163]
Underlying the human ability to align goals with other agents is their ability to predict the intentions of others and actively update their own plans.
We propose hierarchical predictive planning (HPP), a model-based reinforcement learning method for decentralized multiagent rendezvous.
arXiv Detail & Related papers (2020-03-15T19:49:20Z)
- SPA: Verbal Interactions between Agents and Avatars in Shared Virtual Environments using Propositional Planning [61.335252950832256]
Sense-Plan-Ask, or SPA, generates plausible verbal interactions between virtual human-like agents and user avatars in shared virtual environments.
We find that our algorithm incurs only a small runtime cost and enables agents to complete their goals more effectively than agents without the ability to leverage natural-language communication.
arXiv Detail & Related papers (2020-02-08T23:15:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.