Deception Abilities Emerged in Large Language Models
- URL: http://arxiv.org/abs/2307.16513v2
- Date: Fri, 2 Feb 2024 12:16:12 GMT
- Title: Deception Abilities Emerged in Large Language Models
- Authors: Thilo Hagendorff
- Abstract summary: Large language models (LLMs) are currently at the forefront of intertwining artificial intelligence (AI) systems with human communication and everyday life.
This study reveals that such strategies emerged in state-of-the-art LLMs, such as GPT-4, but were non-existent in earlier LLMs.
We conduct a series of experiments showing that state-of-the-art LLMs are able to understand and induce false beliefs in other agents.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) are currently at the forefront of intertwining
artificial intelligence (AI) systems with human communication and everyday
life. Thus, aligning them with human values is of great importance. However,
given the steady increase in reasoning abilities, future LLMs are under
suspicion of becoming able to deceive human operators and utilizing this
ability to bypass monitoring efforts. As a prerequisite to this, LLMs need to
possess a conceptual understanding of deception strategies. This study reveals
that such strategies emerged in state-of-the-art LLMs, such as GPT-4, but were
non-existent in earlier LLMs. We conduct a series of experiments showing that
state-of-the-art LLMs are able to understand and induce false beliefs in other
agents, that their performance in complex deception scenarios can be amplified
utilizing chain-of-thought reasoning, and that eliciting Machiavellianism in
LLMs can alter their propensity to deceive. In sum, revealing hitherto unknown
machine behavior in LLMs, our study contributes to the nascent field of machine
psychology.
Related papers
- Perceptions to Beliefs: Exploring Precursory Inferences for Theory of Mind in Large Language Models [51.91448005607405]
We evaluate perception inference and perception-to-belief inference in large language models (LLMs)
We present PercepToM, a novel ToM method leveraging LLMs' strong perception inference capability while supplementing their limited perception-to-belief inference.
arXiv Detail & Related papers (2024-07-08T14:58:29Z) - Should We Fear Large Language Models? A Structural Analysis of the Human
Reasoning System for Elucidating LLM Capabilities and Risks Through the Lens
of Heidegger's Philosophy [0.0]
This study investigates the capabilities and risks of Large Language Models (LLMs)
It uses the innovative parallels between the statistical patterns of word relationships within LLMs and Martin Heidegger's concepts of "ready-to-hand" and "present-at-hand"
Our findings reveal that while LLMs possess the capability for Direct Explicative Reasoning and Pseudo Rational Reasoning, they fall short in authentic rational reasoning and have no creative reasoning capabilities.
arXiv Detail & Related papers (2024-03-05T19:40:53Z) - When Do LLMs Need Retrieval Augmentation? Mitigating LLMs' Overconfidence Helps Retrieval Augmentation [66.01754585188739]
Large Language Models (LLMs) have been found to have difficulty knowing they do not possess certain knowledge.
Retrieval Augmentation (RA) has been extensively studied to mitigate LLMs' hallucinations.
We propose several methods to enhance LLMs' perception of knowledge boundaries and show that they are effective in reducing overconfidence.
arXiv Detail & Related papers (2024-02-18T04:57:19Z) - Rethinking Interpretability in the Era of Large Language Models [76.1947554386879]
Large language models (LLMs) have demonstrated remarkable capabilities across a wide array of tasks.
The capability to explain in natural language allows LLMs to expand the scale and complexity of patterns that can be given to a human.
These new capabilities raise new challenges, such as hallucinated explanations and immense computational costs.
arXiv Detail & Related papers (2024-01-30T17:38:54Z) - Large Language Models: The Need for Nuance in Current Debates and a
Pragmatic Perspective on Understanding [1.3654846342364308]
Large Language Models (LLMs) are unparalleled in their ability to generate grammatically correct, fluent text.
This position paper critically assesses three points recurring in critiques of LLM capacities.
We outline a pragmatic perspective on the issue of real' understanding and intentionality in LLMs.
arXiv Detail & Related papers (2023-10-30T15:51:04Z) - Avalon's Game of Thoughts: Battle Against Deception through Recursive
Contemplation [80.126717170151]
This study utilizes the intricate Avalon game as a testbed to explore LLMs' potential in deceptive environments.
We introduce a novel framework, Recursive Contemplation (ReCon), to enhance LLMs' ability to identify and counteract deceptive information.
arXiv Detail & Related papers (2023-10-02T16:27:36Z) - Generative AI vs. AGI: The Cognitive Strengths and Weaknesses of Modern
LLMs [0.0]
It is argued that incremental improvement of such LLMs is not a viable approach to working toward human-level AGI.
Social and ethical matters regarding LLMs are very briefly touched from this perspective.
arXiv Detail & Related papers (2023-09-19T07:12:55Z) - Siren's Song in the AI Ocean: A Survey on Hallucination in Large
Language Models [116.01843550398183]
Large language models (LLMs) have demonstrated remarkable capabilities across a range of downstream tasks.
LLMs occasionally generate content that diverges from the user input, contradicts previously generated context, or misaligns with established world knowledge.
arXiv Detail & Related papers (2023-09-03T16:56:48Z) - AI Transparency in the Age of LLMs: A Human-Centered Research Roadmap [46.98582021477066]
The rise of powerful large language models (LLMs) brings about tremendous opportunities for innovation but also looming risks for individuals and society at large.
We have reached a pivotal moment for ensuring that LLMs and LLM-infused applications are developed and deployed responsibly.
It is paramount to pursue new approaches to provide transparency for LLMs.
arXiv Detail & Related papers (2023-06-02T22:51:26Z) - Thinking Fast and Slow in Large Language Models [0.08057006406834465]
Large language models (LLMs) are currently at the forefront of intertwining AI systems with human communication and everyday life.
In this study, we show that LLMs like GPT-3 exhibit behavior that resembles human-like intuition - and the cognitive errors that come with it.
arXiv Detail & Related papers (2022-12-10T05:07:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.