Use of probabilistic phrases in a coordination game: human versus GPT-4
- URL: http://arxiv.org/abs/2310.10544v3
- Date: Sat, 25 Nov 2023 19:12:02 GMT
- Title: Use of probabilistic phrases in a coordination game: human versus GPT-4
- Authors: Laurence T Maloney, Maria F Dal Martello, Vivian Fei and Valerie Ma
- Abstract summary: English speakers use probabilistic phrases such as "likely" to communicate information about the probability or likelihood of events.
We first assessed human ability to estimate the probability and the ambiguity of 23 probabilistic phrases in a coordination game.
We found that the median human participant and GPT-4 assigned probability estimates that were in good agreement.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: English speakers use probabilistic phrases such as "likely" to
communicate information about the probability or likelihood of events.
Communication is successful to the extent that the listener grasps what the
speaker means to convey and, if communication is successful, individuals can
potentially coordinate their actions based on shared knowledge about
uncertainty. We first assessed human ability to estimate the probability and
the ambiguity (imprecision) of twenty-three probabilistic phrases in a
coordination game in two different contexts, investment advice and medical
advice. We then had GPT-4 (OpenAI), a large language model, complete the same
tasks as the human participants. We found that the median human participant
and GPT-4 assigned probability estimates that were in good agreement
(proportions of variance accounted for close to .90). GPT-4's probability
estimates in both the investment and medical contexts agreed with those of the
human participants at least as well as the human participants' estimates
agreed with one another. Estimates of probability for both the human
participants and GPT-4 were little affected by context. In contrast, human and
GPT-4 estimates of ambiguity were not in such good agreement.
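
As a rough illustration of the agreement measure quoted above (proportion of variance accounted for), here is a minimal sketch in Python. The phrase list and the numbers are hypothetical placeholders, not data from the study, and computing the measure as the squared Pearson correlation between median human and GPT-4 estimates is an assumption about how such a figure is typically obtained, not the authors' analysis code.

```python
# Minimal sketch of an agreement measure like the one reported in the abstract:
# the proportion of variance in one set of probability estimates accounted for
# by the other, taken here as the squared Pearson correlation.
# All phrases and estimates below are hypothetical placeholders.

import numpy as np

phrases = ["very likely", "probable", "unlikely"]          # hypothetical phrase subset
human_median = np.array([0.85, 0.70, 0.20])                # hypothetical median human estimates
gpt4_estimate = np.array([0.90, 0.65, 0.15])               # hypothetical GPT-4 estimates

r = np.corrcoef(human_median, gpt4_estimate)[0, 1]         # Pearson correlation
variance_accounted_for = r ** 2                            # values near .90 indicate good agreement
print(f"Proportion of variance accounted for: {variance_accounted_for:.2f}")
```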
Related papers
- Probabilistic Reasoning with LLMs for k-anonymity Estimation (arXiv, 2025-03-12)
  We introduce a novel numerical reasoning task under uncertainty, focusing on estimating the k-anonymity of user-generated documents containing privacy-sensitive information.
  We propose BRANCH, which uses LLMs to factorize a joint probability distribution to estimate the k-value.
  Our experiments show that this method successfully estimates the correct k-value 67% of the time, an 11% increase compared to GPT-4o chain-of-thought reasoning.
- GPT's Judgements Under Uncertainty (arXiv, 2024-09-26)
  We investigate whether biases inherent in human cognition, such as loss aversion, manifest in how GPT-4o judges and makes decisions in probabilistic scenarios.
  By conducting experiments across nine cognitive biases, we demonstrate GPT-4o's inconsistent responses to prompts with similar underlying probability notations.
  Our findings also reveal mixed performance, with the AI demonstrating both human-like errors and statistically sound decisions, even as it goes through identical iterations of the same prompt.
- Characterizing Similarities and Divergences in Conversational Tones in Humans and LLMs by Sampling with People (arXiv, 2024-06-06)
  We propose an iterative method for simultaneously eliciting conversational tones and sentences.
  We show how our approach can be used to create an interpretable representation of relations between conversational tones in humans and GPT-4.
- An Evaluation of Estimative Uncertainty in Large Language Models (arXiv, 2024-05-24)
  Estimative uncertainty has long been an area of study, including by intelligence agencies such as the CIA.
  This study compares estimative uncertainty in commonly used large language models (LLMs) to that of humans, and to each other.
  We show that LLMs like GPT-3.5 and GPT-4 align with human estimates for some, but not all, words of estimative probability (WEPs) presented in English.
- On the Conversational Persuasiveness of Large Language Models: A Randomized Controlled Trial (arXiv, 2024-03-21)
  We analyze the effect of AI-driven persuasion in a controlled, harmless setting.
  We found that participants who debated GPT-4 with access to their personal information had 81.7% higher odds of increased agreement with their opponents compared to participants who debated humans.
- Large Language Models for Psycholinguistic Plausibility Pretesting (arXiv, 2024-02-08)
  We investigate whether Language Models (LMs) can be used to generate plausibility judgements.
  We find that GPT-4 plausibility judgements highly correlate with human judgements across the structures we examine.
  We then test whether this correlation implies that LMs can be used instead of humans for pretesting.
- Automatically measuring speech fluency in people with aphasia: first achievements using read-speech data (arXiv, 2023-08-09)
  This study aims to assess the relevance of a signal processing algorithm, initially developed in the field of language acquisition, for the automatic measurement of speech fluency.
- Probing neural language models for understanding of words of estimative probability (arXiv, 2022-11-07)
  Words of estimative probability (WEP) are expressions of a statement's plausibility.
  We measure the ability of neural language processing models to capture the consensual probability level associated with each WEP.
- Reconciling Individual Probability Forecasts (arXiv, 2022-09-04)
  We show that two parties who agree on the data cannot disagree on how to model individual probabilities.
  We conclude that although individual probabilities are unknowable, they are contestable via a computationally and data efficient process.
- On the probability-quality paradox in language generation (arXiv, 2022-03-31)
  We analyze language generation through an information-theoretic lens.
  We posit that human-like language should contain an amount of information close to the entropy of the distribution over natural strings.
- Partner Matters! An Empirical Study on Fusing Personas for Personalized Response Selection in Retrieval-Based Chatbots (arXiv, 2021-05-19)
  This paper explores the impact of utilizing personas that describe either self or partner speakers on the task of response selection.
  Four persona fusion strategies are designed, which assume personas interact with contexts or responses in different ways.
  Empirical studies on the Persona-Chat dataset show that partner personas can improve the accuracy of response selection.
- Epidemic mitigation by statistical inference from contact tracing data (arXiv, 2020-09-20)
  We develop Bayesian inference methods to estimate the risk that an individual is infected.
  We propose to use probabilistic risk estimation in order to optimize testing and quarantining strategies for the control of an epidemic.
  Our approaches translate into fully distributed algorithms that only require communication between individuals who have recently been in contact.
This list is automatically generated from the titles and abstracts of the papers on this site.