Related papers: Tricking LLM-Based NPCs into Spilling Secrets

Tricking LLM-Based NPCs into Spilling Secrets

URL: http://arxiv.org/abs/2508.19288v1
Date: Mon, 25 Aug 2025 05:25:28 GMT
Title: Tricking LLM-Based NPCs into Spilling Secrets
Authors: Kyohei Shiomi, Zhuotao Lian, Toru Nakanishi, Teruaki Kitasuka,
Abstract summary: Large Language Models (LLMs) are increasingly used to generate dynamic dialogue for game NPCs.<n>In this study, we examine whether adversarial prompt injection can cause LLM-based NPCs to reveal hidden background secrets.
Score: 0.6999740786886536
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large Language Models (LLMs) are increasingly used to generate dynamic dialogue for game NPCs. However, their integration raises new security concerns. In this study, we examine whether adversarial prompt injection can cause LLM-based NPCs to reveal hidden background secrets that are meant to remain undisclosed.

Related papers

Under the Influence: Quantifying Persuasion and Vigilance in Large Language Models [13.754658024896612]
We study the abilities of Large Language Models to persuade and be rationally vigilant towards other LLM agents.<n>We find that puzzle-solving performance, persuasive capability, and vigilance are dissociable capacities in LLMs.<n>Our work presents the first investigation of the relationship between persuasion, vigilance, and task performance in LLMs.
arXiv Detail & Related papers (2026-02-24T04:09:21Z)
Should You Use Your Large Language Model to Explore or Exploit? [55.562545113247666]
We evaluate the ability of large language models to help a decision-making agent facing an exploration-exploitation tradeoff.<n>We find that while the current LLMs often struggle to exploit, in-context mitigations may be used to substantially improve performance for small-scale tasks.
arXiv Detail & Related papers (2025-01-31T23:42:53Z)
NewsInterview: a Dataset and a Playground to Evaluate LLMs' Ground Gap via Informational Interviews [65.35458530702442]
We focus on journalistic interviews, a domain rich in grounding communication and abundant in data. We curate a dataset of 40,000 two-person informational interviews from NPR and CNN. LLMs are significantly less likely than human interviewers to use acknowledgements and to pivot to higher-level questions.
arXiv Detail & Related papers (2024-11-21T01:37:38Z)
Understanding the Interplay between Parametric and Contextual Knowledge for Large Language Models [85.13298925375692]
Large language models (LLMs) encode vast amounts of knowledge during pre-training. LLMs can be enhanced by incorporating contextual knowledge (CK) Can LLMs effectively integrate their internal PK with external CK to solve complex problems?
arXiv Detail & Related papers (2024-10-10T23:09:08Z)
Collaborative Quest Completion with LLM-driven Non-Player Characters in Minecraft [14.877848057734463]
We design a minigame within Minecraft where a player works with two GPT4-driven NPCs to complete a quest. On analyzing the game logs and recordings, we find that several patterns of collaborative behavior emerge from the NPCs and the human players. We believe that this preliminary study and analysis will inform future game developers on how to better exploit these rapidly improving generative AI models for collaborative roles in games.
arXiv Detail & Related papers (2024-07-03T19:11:21Z)
A Survey on Large Language Model (LLM) Security and Privacy: The Good, the Bad, and the Ugly [21.536079040559517]
Large Language Models (LLMs) have revolutionized natural language understanding and generation. This paper explores the intersection of LLMs with security and privacy.
arXiv Detail & Related papers (2023-12-04T16:25:18Z)
Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory [82.7042006247124]
We show that even the most capable AI models reveal private information in contexts that humans would not, 39% and 57% of the time, respectively. Our work underscores the immediate need to explore novel inference-time privacy-preserving approaches, based on reasoning and theory of mind.
arXiv Detail & Related papers (2023-10-27T04:15:30Z)
Privacy in Large Language Models: Attacks, Defenses and Future Directions [84.73301039987128]
We analyze the current privacy attacks targeting large language models (LLMs) and categorize them according to the adversary's assumed capabilities. We present a detailed overview of prominent defense strategies that have been developed to counter these privacy attacks.
arXiv Detail & Related papers (2023-10-16T13:23:54Z)
Multi-step Jailbreaking Privacy Attacks on ChatGPT [47.10284364632862]
We study the privacy threats from OpenAI's ChatGPT and the New Bing enhanced by ChatGPT. We conduct extensive experiments to support our claims and discuss LLMs' privacy implications.
arXiv Detail & Related papers (2023-04-11T13:05:04Z)
Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection [64.67495502772866]
Large Language Models (LLMs) are increasingly being integrated into various applications. We show how attackers can override original instructions and employed controls using Prompt Injection attacks. We derive a comprehensive taxonomy from a computer security perspective to systematically investigate impacts and vulnerabilities.
arXiv Detail & Related papers (2023-02-23T17:14:38Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.