Limits of Large Language Models in Debating Humans
- URL: http://arxiv.org/abs/2402.06049v1
- Date: Tue, 6 Feb 2024 03:24:27 GMT
- Title: Limits of Large Language Models in Debating Humans
- Authors: James Flamino, Mohammed Shahid Modi, Boleslaw K. Szymanski, Brendan
Cross, Colton Mikolajczyk
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) have shown remarkable promise in their ability
to interact proficiently with humans. Subsequently, their potential use as
artificial confederates and surrogates in sociological experiments involving
conversation is an exciting prospect. But how viable is this idea? This paper
endeavors to test the limits of current-day LLMs with a pre-registered study
integrating real people with LLM agents acting as people. The study focuses on
debate-based opinion consensus formation in three environments: humans only,
agents and humans, and agents only. Our goal is to understand how LLM agents
influence humans, and how capable they are in debating like humans. We find
that LLMs can blend in and facilitate human productivity but are less
convincing in debate, with their behavior ultimately deviating from humans'. We
elucidate these primary failings and anticipate that LLMs must evolve further
before being viable debaters.
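
To make the study design concrete, below is a minimal, hypothetical Python sketch of the three debate environments (humans only, agents and humans, agents only). The `llm_reply` stub, the `Participant` class, and the toy opinion-drift update are illustrative assumptions, not the authors' pre-registered protocol or code.

```python
import random

TOPIC = "Example debate topic from the consensus-formation task"

def llm_reply(history: list[str], persona: str) -> str:
    """Placeholder for a chat-LLM call; swap in any real chat API."""
    return f"{persona}: a generated argument replying to the last message"

class Participant:
    """A debater that is either a human (stubbed here) or an LLM agent."""
    def __init__(self, name: str, is_agent: bool):
        self.name = name
        self.is_agent = is_agent
        self.opinion = random.uniform(-1.0, 1.0)  # stance on [-1, 1]

    def speak(self, history: list[str]) -> str:
        if self.is_agent:
            return llm_reply(history, self.name)
        return f"{self.name}: [argument typed by a human participant]"

def run_debate(participants: list[Participant], rounds: int = 3) -> list[str]:
    history = [f"Topic: {TOPIC}"]
    for _ in range(rounds):
        for p in participants:
            history.append(p.speak(history))
            # Toy consensus dynamic: every opinion drifts toward the
            # group mean after each turn (stand-in for real measurement).
            mean = sum(q.opinion for q in participants) / len(participants)
            for q in participants:
                q.opinion += 0.1 * (mean - q.opinion)
    return history

# The study's three environments: humans only, mixed, agents only.
conditions = {
    "humans only": [Participant(f"H{i}", False) for i in range(4)],
    "agents and humans": [Participant("H0", False), Participant("H1", False),
                          Participant("A0", True), Participant("A1", True)],
    "agents only": [Participant(f"A{i}", True) for i in range(4)],
}
for label, group in conditions.items():
    run_debate(group)
    spread = max(p.opinion for p in group) - min(p.opinion for p in group)
    print(f"{label}: opinion spread after debate = {spread:.2f}")
```

In the actual study, the human turns and opinion measurements come from real participants and surveys; the stub and drift rule above only mark where those would plug in.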
Related papers
- Interactive Dialogue Agents via Reinforcement Learning on Hindsight Regenerations [58.65755268815283]
Many real dialogues are interactive, meaning an agent's utterances will influence their conversational partner, elicit information, or change their opinion.
We use this fact to rewrite and augment existing suboptimal data, and use offline reinforcement learning (RL) to train an agent that outperforms both prompting and learning from unaltered human demonstrations.
Our results in a user study with real humans show that our approach greatly outperforms existing state-of-the-art dialogue agents.
arXiv Detail & Related papers (2024-11-07T21:37:51Z)
- A little less conversation, a little more action, please: Investigating the physical common-sense of LLMs in a 3D embodied environment [0.9188951403098383]
Large Language Models (LLMs) are increasingly used as reasoning engines in agentic systems.
We present the first embodied and cognitively meaningful evaluation of physical common-sense reasoning in LLMs.
We employ the Animal-AI environment, a simulated 3D virtual laboratory, to study physical common-sense reasoning in LLMs.
arXiv Detail & Related papers (2024-10-30T17:28:28Z)
- Take Caution in Using LLMs as Human Surrogates: Scylla Ex Machina [7.155982875107922]
Studies suggest large language models (LLMs) can exhibit human-like reasoning, aligning with human behavior in economic experiments, surveys, and political discourse.
This has led many to propose that LLMs can be used as surrogates or simulations for humans in social science research.
We assess the reasoning depth of LLMs using the 11-20 money request game.
arXiv Detail & Related papers (2024-10-25T14:46:07Z)
- Can Language Models Recognize Convincing Arguments? [12.458437450959416]
Large language models (LLMs) have raised concerns about their potential to create and propagate convincing narratives.
We study their performance in detecting convincing arguments to gain insights into their persuasive capabilities.
arXiv Detail & Related papers (2024-03-31T17:38:33Z)
- How should the advent of large language models affect the practice of science? [51.62881233954798]
How should the advent of large language models affect the practice of science?
We have invited four diverse groups of scientists to reflect on this query, sharing their perspectives and engaging in debate.
arXiv Detail & Related papers (2023-12-05T10:45:12Z)
- Large Language Models: The Need for Nuance in Current Debates and a Pragmatic Perspective on Understanding [1.3654846342364308]
Large Language Models (LLMs) are unparalleled in their ability to generate grammatically correct, fluent text.
This position paper critically assesses three points recurring in critiques of LLM capacities.
We outline a pragmatic perspective on the issue of 'real' understanding and intentionality in LLMs.
arXiv Detail & Related papers (2023-10-30T15:51:04Z)
- BotChat: Evaluating LLMs' Capabilities of Having Multi-Turn Dialogues [72.65163468440434]
This report provides a preliminary evaluation of existing large language models for human-style multi-turn chatting.
We prompt large language models (LLMs) to generate a full multi-turn dialogue based on the ChatSEED, utterance by utterance.
We find GPT-4 can generate human-style multi-turn dialogues of impressive quality, significantly outperforming its counterparts.
arXiv Detail & Related papers (2023-10-20T16:53:51Z)
- Character-LLM: A Trainable Agent for Role-Playing [67.35139167985008]
Large language models (LLMs) can serve as agents that simulate human behaviors.
We introduce Character-LLM, which teaches LLMs to act as specific people such as Beethoven, Queen Cleopatra, and Julius Caesar.
arXiv Detail & Related papers (2023-10-16T07:58:56Z)
- Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate [85.3444184685235]
We propose a Multi-Agent Debate (MAD) framework in which multiple agents exchange arguments in a "tit for tat" fashion while a judge manages the debate process and extracts a final solution.
Our framework encourages divergent thinking in LLMs, which is helpful for tasks that require deep contemplation (a minimal sketch of such a loop appears after this list).
arXiv Detail & Related papers (2023-05-30T15:25:45Z)
- Can Large Language Models Transform Computational Social Science? [79.62471267510963]
Large Language Models (LLMs) are capable of performing many language processing tasks zero-shot (without training data).
This work provides a road map for using LLMs as Computational Social Science tools.
arXiv Detail & Related papers (2023-04-12T17:33:28Z)
- Are LLMs the Master of All Trades? : Exploring Domain-Agnostic Reasoning Skills of LLMs [0.0]
This study aims to investigate the performance of large language models (LLMs) on different reasoning tasks.
My findings indicate that LLMs excel at analogical and moral reasoning, yet struggle to perform as proficiently on spatial reasoning tasks.
arXiv Detail & Related papers (2023-03-22T22:53:44Z)
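
Referring back to the Multi-Agent Debate entry above, here is a minimal sketch of a MAD-style loop. The `chat` stub stands in for any chat-completion call, and the round structure and judge role are assumptions reconstructed from the summary, not the paper's implementation.

```python
def chat(role: str, prompt: str) -> str:
    """Placeholder for any chat-completion call."""
    return f"[{role}] response to: {prompt[:40]}..."

def mad_debate(question: str, n_debaters: int = 2, max_rounds: int = 3) -> str:
    transcript = [f"Question: {question}"]
    for rnd in range(max_rounds):
        # Debaters argue "tit for tat", each rebutting what came before.
        for i in range(n_debaters):
            arg = chat(f"debater-{i}",
                       "Rebut the arguments so far:\n" + "\n".join(transcript))
            transcript.append(f"round {rnd}, debater {i}: {arg}")
        # The judge manages the process and can stop once positions converge.
        verdict = chat("judge",
                       "Answer YES if the debate has converged:\n" + "\n".join(transcript))
        if "YES" in verdict:
            break
    # The judge extracts a final solution from the whole debate.
    return chat("judge", "State the final answer:\n" + "\n".join(transcript))

print(mad_debate("An example question requiring deep contemplation"))
```

In the published framework the judge is itself an LLM; here the same stub plays both roles for brevity.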