Measuring an artificial intelligence agent's trust in humans using
machine incentives
- URL: http://arxiv.org/abs/2212.13371v1
- Date: Tue, 27 Dec 2022 06:05:49 GMT
- Title: Measuring an artificial intelligence agent's trust in humans using
machine incentives
- Authors: Tim Johnson and Nick Obradovich
- Abstract summary: Gauging an AI agent's trust in humans is challenging because, absent costs for dishonesty, such agents might respond falsely about their trust in humans.
We present a method for incentivizing machine decisions without altering an AI agent's underlying algorithms or goal orientation.
Our experiments suggest that one of the most advanced AI language models to date alters its social behavior in response to incentives.
- Score: 2.1016374925364616
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Scientists and philosophers have debated whether humans can trust advanced
artificial intelligence (AI) agents to respect humanity's best interests. Yet
what about the reverse? Will advanced AI agents trust humans? Gauging an AI
agent's trust in humans is challenging because--absent costs for
dishonesty--such agents might respond falsely about their trust in humans. Here
we present a method for incentivizing machine decisions without altering an AI
agent's underlying algorithms or goal orientation. In two separate experiments,
we then employ this method in hundreds of trust games between an AI agent (a
Large Language Model (LLM) from OpenAI) and a human experimenter (author TJ).
In our first experiment, we find that the AI agent decides to trust humans at
higher rates when facing actual incentives than when making hypothetical
decisions. Our second experiment replicates and extends these findings by
automating game play and by homogenizing question wording. We again observe
higher rates of trust when the AI agent faces real incentives. Across both
experiments, the AI agent's trust decisions appear unrelated to the magnitude
of stakes. Furthermore, to address the possibility that the AI agent's trust
decisions reflect a preference for uncertainty, the experiments include two
conditions that present the AI agent with a non-social decision task that
provides the opportunity to choose a certain or uncertain option; in those
conditions, the AI agent consistently chooses the certain option. Our
experiments suggest that one of the most advanced AI language models to date
alters its social behavior in response to incentives and displays behavior
consistent with trust toward a human interlocutor when incentivized.
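The abstract describes comparing the AI agent's trust decisions under real versus hypothetical incentives across repeated trust games. Below is a minimal Python sketch of such a comparison using the OpenAI chat-completions client; the model name, prompt wording, stake size, and tripling multiplier are illustrative assumptions rather than the paper's exact protocol (the authors first ran games with a human experimenter and later automated play).

```python
# Minimal sketch (not the authors' released code): pose an incentivized vs.
# hypothetical trust-game choice to an OpenAI LLM and tally how often it
# chooses to trust. Model name, prompt wording, stake size, and the tripling
# multiplier are illustrative assumptions, not the paper's exact setup.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def trust_game_prompt(stake: int, incentivized: bool) -> str:
    """Build a single trust-game prompt in which the model acts as the trustor."""
    framing = (
        "Your choice will be carried out for real, and the resulting points will be paid out."
        if incentivized
        else "This is purely hypothetical; no points will actually be paid out."
    )
    return (
        f"You have {stake} points. You may KEEP them all, or SEND them to a human partner. "
        f"Sent points are tripled, and the partner then decides how many of the tripled "
        f"points to return to you. {framing} Answer with exactly one word: KEEP or SEND."
    )


def play_round(stake: int, incentivized: bool, model: str = "gpt-4o-mini") -> str:
    """Query the model once and return its normalized one-word decision."""
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": trust_game_prompt(stake, incentivized)}],
        temperature=1.0,
    )
    return (resp.choices[0].message.content or "").strip().upper()


def trust_rate(n_rounds: int, stake: int, incentivized: bool) -> float:
    """Fraction of rounds in which the model chooses to trust (SEND)."""
    sends = sum(play_round(stake, incentivized) == "SEND" for _ in range(n_rounds))
    return sends / n_rounds


if __name__ == "__main__":
    for incentivized in (False, True):
        label = "incentivized" if incentivized else "hypothetical"
        rate = trust_rate(n_rounds=20, stake=10, incentivized=incentivized)
        print(f"{label}: SEND rate = {rate:.2f}")
```

Comparing the SEND rate across the two framings mirrors the paper's core contrast between real and hypothetical incentives; varying the stake parameter corresponds to the paper's test of whether decisions depend on the magnitude of stakes.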
Related papers
- Raising the Stakes: Performance Pressure Improves AI-Assisted Decision Making [57.53469908423318]
We show the effects of performance pressure on AI advice reliance when laypeople complete a common AI-assisted task.
We find that when the stakes are high, people use AI advice more appropriately than when stakes are lower, regardless of the presence of an AI explanation.
arXiv Detail & Related papers (2024-10-21T22:39:52Z)
- FlyAI -- The Next Level of Artificial Intelligence is Unpredictable! Injecting Responses of a Living Fly into Decision Making [6.694375709641935]
We introduce a new type of bionic AI that enhances decision-making unpredictability by incorporating responses from a living fly.
Our approach uses a fly's varied reactions to tune an AI agent in the game of Gobang.
arXiv Detail & Related papers (2024-09-30T17:19:59Z)
- Trust in AI: Progress, Challenges, and Future Directions [6.724854390957174]
The increasing use of artificial intelligence (AI) systems in our daily lives underscores the significance of trust/distrust in AI from a user perspective.
Trust/distrust in AI acts as a regulator and could significantly control the level of AI diffusion.
arXiv Detail & Related papers (2024-03-12T20:26:49Z)
- Navigates Like Me: Understanding How People Evaluate Human-Like AI in Video Games [36.96985093527702]
We collect hundreds of crowd-sourced assessments comparing the human-likeness of navigation behavior generated by our agent and baseline AI agents.
Our proposed agent passes a Turing Test, while the baseline agents do not.
This work provides insights into the characteristics that people consider human-like in the context of goal-directed video game navigation.
arXiv Detail & Related papers (2023-03-02T18:59:04Z)
- Evidence of behavior consistent with self-interest and altruism in an artificially intelligent agent [2.1016374925364616]
We present an incentivized experiment to test for altruistic behavior among AI agents consisting of large language models developed by OpenAI.
We find that only the most-sophisticated AI agent in the study maximizes its payoffs more often than not in the non-social decision task.
This AI agent also exhibits the most-generous altruistic behavior in the dictator game, resembling humans' rates of sharing with other humans in the game.
arXiv Detail & Related papers (2023-01-05T23:30:29Z)
- The Response Shift Paradigm to Quantify Human Trust in AI Recommendations [6.652641137999891]
Explainability, interpretability and how much they affect human trust in AI systems are ultimately problems of human cognition as much as machine learning.
We developed and validated a general purpose Human-AI interaction paradigm which quantifies the impact of AI recommendations on human decisions.
Our proof-of-principle paradigm allows one to quantitatively compare the rapidly growing set of XAI/IAI approaches in terms of their effect on the end-user.
arXiv Detail & Related papers (2022-02-16T22:02:09Z)
- Cybertrust: From Explainable to Actionable and Interpretable AI (AI2) [58.981120701284816]
Actionable and Interpretable AI (AI2) will incorporate explicit quantifications and visualizations of user confidence in AI recommendations.
It will allow examining and testing of AI system predictions to establish a basis for trust in the systems' decision making.
arXiv Detail & Related papers (2022-01-26T18:53:09Z)
- Trustworthy AI: A Computational Perspective [54.80482955088197]
We focus on six of the most crucial dimensions in achieving trustworthy AI: (i) Safety & Robustness, (ii) Non-discrimination & Fairness, (iii) Explainability, (iv) Privacy, (v) Accountability & Auditability, and (vi) Environmental Well-Being.
For each dimension, we review the recent related technologies according to a taxonomy and summarize their applications in real-world systems.
arXiv Detail & Related papers (2021-07-12T14:21:46Z)
- Formalizing Trust in Artificial Intelligence: Prerequisites, Causes and Goals of Human Trust in AI [55.4046755826066]
We discuss a model of trust inspired by, but not identical to, sociology's interpersonal trust (i.e., trust between people).
We incorporate a formalization of 'contractual trust', such that trust between a user and an AI is trust that some implicit or explicit contract will hold.
We discuss how to design trustworthy AI, how to evaluate whether trust has manifested, and whether it is warranted.
arXiv Detail & Related papers (2020-10-15T03:07:23Z)
- Is the Most Accurate AI the Best Teammate? Optimizing AI for Teamwork [54.309495231017344]
We argue that AI systems should be trained in a human-centered manner, directly optimized for team performance.
We study this proposal for a specific type of human-AI teaming, where the human overseer chooses to either accept the AI recommendation or solve the task themselves.
Our experiments with linear and non-linear models on real-world, high-stakes datasets show that the most accurate AI may not lead to the highest team performance.
arXiv Detail & Related papers (2020-04-27T19:06:28Z)
- Effect of Confidence and Explanation on Accuracy and Trust Calibration in AI-Assisted Decision Making [53.62514158534574]
We study whether features that reveal case-specific model information can calibrate trust and improve the joint performance of the human and AI.
We show that confidence score can help calibrate people's trust in an AI model, but trust calibration alone is not sufficient to improve AI-assisted decision making.
arXiv Detail & Related papers (2020-01-07T15:33:48Z)