The Imitation Game: Using Large Language Models as Chatbots to Combat Chat-Based Cybercrimes
- URL: http://arxiv.org/abs/2512.21371v1
- Date: Wed, 24 Dec 2025 05:34:05 GMT
- Title: The Imitation Game: Using Large Language Models as Chatbots to Combat Chat-Based Cybercrimes
- Authors: Yifan Yao, Baojuan Wang, Jinhao Duan, Kaidi Xu, ChuanKai Guo, Zhibo Eric Sun, Yue Zhang
- Abstract summary: Chat-based cybercrime has emerged as a pervasive threat. Traditional defense mechanisms struggle to identify these conversational threats. We present LURE, the first system to deploy Large Language Models as active agents.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Chat-based cybercrime has emerged as a pervasive threat, with attackers leveraging real-time messaging platforms to conduct scams that rely on trust-building, deception, and psychological manipulation. Traditional defense mechanisms, which operate on static rules or shallow content filters, struggle to identify these conversational threats, especially when attackers use multimedia obfuscation and context-aware dialogue. In this work, we ask a provocative question inspired by the classic Imitation Game: Can machines convincingly pose as human victims to turn deception against cybercriminals? We present LURE (LLM-based User Response Engagement), the first system to deploy Large Language Models (LLMs) as active agents, not as passive classifiers, embedded within adversarial chat environments. LURE combines automated discovery, adversarial interaction, and OCR-based analysis of image-embedded payment data. Applied to the setting of illicit video chat scams on Telegram, our system engaged 53 actors across 98 groups. In over 56 percent of interactions, the LLM maintained multi-round conversations without being noticed as a bot, effectively "winning" the imitation game. Our findings reveal key behavioral patterns in scam operations, such as payment flows, upselling strategies, and platform migration tactics.
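The abstract's "OCR-based analysis of image-embedded payment data" suggests a two-stage pipeline: run OCR over images posted in scam chats, then extract candidate payment identifiers from the recovered text. The sketch below shows only the second stage under stated assumptions; the function name `extract_payment_handles` and the regular expressions are illustrative, not LURE's actual implementation, and the OCR step is assumed to have already produced plain text.

```python
import re

# Hypothetical patterns for payment identifiers that might surface in
# OCR'd screenshots from scam chats; illustrative only, not exhaustive.
PATTERNS = {
    "btc_address": re.compile(r"\b(?:bc1|[13])[a-zA-HJ-NP-Z0-9]{25,39}\b"),
    "eth_address": re.compile(r"\b0x[a-fA-F0-9]{40}\b"),
    "cashtag": re.compile(r"\$[A-Za-z][A-Za-z0-9]{2,19}\b"),
}

def extract_payment_handles(ocr_text: str) -> dict:
    """Return all pattern matches found in text recovered by OCR."""
    return {name: pat.findall(ocr_text) for name, pat in PATTERNS.items()}

if __name__ == "__main__":
    sample = "Pay 0x52908400098527886E0F7030069857D2E4169EE7 or $ScamPay"
    print(extract_payment_handles(sample))
```

A real system would also normalize common OCR confusions (e.g., `O` vs `0`) before matching, since address strings are exactly where such errors concentrate.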
Related papers
- SENTINEL: A Multi-Modal Early Detection Framework for Emerging Cyber Threats using Telegram
We present SENTINEL, a framework that leverages social media signals for early detection of cyber attacks. We use data from 16 public channels on Telegram related to cybersecurity and open source intelligence (OSINT) that span 365k messages. We highlight that social media discussions involve active dialogue around cyber threats and leverage SENTINEL to align the signals to real-world threats with an F1 of 0.89.
arXiv Detail & Related papers (2025-12-24T18:33:34Z) - Synthetic Voices, Real Threats: Evaluating Large Text-to-Speech Models in Generating Harmful Audio
This work explores a content-centric threat: exploiting TTS systems to produce speech containing harmful content. We present HARMGEN, a suite of five attacks organized into two families that address these challenges.
arXiv Detail & Related papers (2025-11-14T03:00:04Z) - ScamAgents: How AI Agents Can Simulate Human-Level Scam Calls
ScamAgent is an autonomous multi-turn agent built on top of Large Language Models (LLMs). We show that ScamAgent maintains dialogue memory, adapts dynamically to simulated user responses, and employs deceptive persuasion strategies across conversational turns. Our findings highlight an urgent need for multi-turn safety auditing, agent-level control frameworks, and new methods to detect and disrupt conversational deception powered by generative AI.
arXiv Detail & Related papers (2025-08-08T17:01:41Z) - Personalized Attacks of Social Engineering in Multi-turn Conversations: LLM Agents for Simulation and Detection
Social engineering (SE) attacks on social media platforms pose a significant risk. We propose an LLM-agentic framework, SE-VSim, to simulate SE attack mechanisms by generating multi-turn conversations. We present a proof of concept, SE-OmniGuard, to offer personalized protection to users by leveraging prior knowledge of the victim's personality.
arXiv Detail & Related papers (2025-03-18T19:14:44Z) - Bot Wars Evolved: Orchestrating Competing LLMs in a Counterstrike Against Phone Scams
"Bot Wars" is a framework that deploys Large Language Model (LLM) scam-baiters to counter phone scams through simulated adversarial dialogues. We evaluate our approach using a dataset of 3,200 scam dialogues validated against 179 hours of human scam-baiting interactions.
arXiv Detail & Related papers (2025-03-10T08:21:36Z) - REALTALK: A 21-Day Real-World Dataset for Long-Term Conversation
We introduce REALTALK, a 21-day corpus of authentic messaging app dialogues. We compare emotional intelligence (EI) attributes and persona consistency to understand the challenges posed by real-world dialogues. Our findings reveal that models struggle to simulate a user solely from dialogue history, while fine-tuning on specific user chats improves persona emulation.
arXiv Detail & Related papers (2025-02-18T20:29:01Z) - Reasoning-Augmented Conversation for Multi-Turn Jailbreak Attacks on Large Language Models
Reasoning-Augmented Conversation (RACE) is a novel multi-turn jailbreak framework. It reformulates harmful queries into benign reasoning tasks. We show that RACE achieves state-of-the-art attack effectiveness in complex conversational scenarios.
arXiv Detail & Related papers (2025-02-16T09:27:44Z) - RICoTA: Red-teaming of In-the-wild Conversation with Test Attempts
RICoTA is a Korean red-teaming dataset consisting of 609 prompts that challenge large language models (LLMs). We utilize user-chatbot conversations that were self-posted on a Korean Reddit-like community. Our dataset will be made publicly available via GitHub.
arXiv Detail & Related papers (2025-01-29T15:32:27Z) - Exploring and Mitigating Adversarial Manipulation of Voting-Based Leaderboards
Arena, the most popular benchmark of this type, ranks models by asking users to select the better response between two randomly selected models. We show that an attacker can alter the leaderboard (to promote their favorite model or demote competitors) at the cost of roughly a thousand votes. Our attack consists of two steps: first, we show how an attacker can determine which model was used to generate a given reply with more than 95% accuracy; then, the attacker can use this information to consistently vote against a target model.
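The attack's first step, attributing a reply to the model that generated it, can be illustrated with a toy likelihood comparison. The sketch below uses naive-Bayes-style scoring over made-up per-model word-frequency profiles; the model names, profiles, and `attribute` function are all hypothetical, and the paper's actual classifier is far more accurate than this caricature.

```python
import math

# Hypothetical per-model word-frequency profiles (illustrative only);
# a real attack would estimate these from many sampled replies per model.
PROFILES = {
    "model_a": {"certainly": 0.05, "delve": 0.04, "overall": 0.01},
    "model_b": {"certainly": 0.01, "delve": 0.001, "overall": 0.05},
}

def attribute(reply: str, profiles=PROFILES, floor=1e-6) -> str:
    """Return the model whose profile assigns the reply the highest
    log-likelihood; unseen words get a small floor probability."""
    words = reply.lower().split()
    def loglik(profile):
        return sum(math.log(profile.get(w, floor)) for w in words)
    return max(profiles, key=lambda m: loglik(profiles[m]))
```

Once replies are attributed this way, the second step is mechanical: vote for the opponent whenever the target model is detected in a pairwise comparison.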
arXiv Detail & Related papers (2025-01-13T17:12:38Z) - Human-Readable Adversarial Prompts: An Investigation into LLM Vulnerabilities Using Situational Context
We show that situation-driven adversarial full-prompts that leverage situational context are effective but much harder to detect. We developed attacks that use movie scripts as situational contextual frameworks. We enhanced the AdvPrompter framework with p-nucleus sampling to generate diverse, human-readable adversarial texts.
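The p-nucleus (top-p) sampling mentioned above keeps the smallest set of highest-probability tokens whose cumulative probability reaches p, renormalizes, and samples only from that set, trading off diversity against fluency. A minimal pure-Python sketch of the standard technique (not the AdvPrompter codebase's implementation):

```python
import random

def top_p_filter(probs, p):
    """Return indices of the smallest set of highest-probability entries
    whose cumulative probability is >= p (the 'nucleus')."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    nucleus, total = [], 0.0
    for i in order:
        nucleus.append(i)
        total += probs[i]
        if total >= p:
            break
    return nucleus

def sample_top_p(probs, p, rng=random):
    """Sample an index after restricting to the nucleus and renormalizing."""
    nucleus = top_p_filter(probs, p)
    mass = sum(probs[i] for i in nucleus)
    return rng.choices(nucleus, weights=[probs[i] / mass for i in nucleus])[0]

if __name__ == "__main__":
    # With p = 0.8, only the two most likely tokens survive the filter.
    print(top_p_filter([0.5, 0.3, 0.15, 0.05], 0.8))
```

Lower p concentrates samples on a few high-probability continuations; higher p admits rarer tokens, which is what makes the generated adversarial texts more diverse.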
arXiv Detail & Related papers (2024-12-20T21:43:52Z) - LLM Roleplay: Simulating Human-Chatbot Interaction
We propose a goal-oriented, persona-based method to automatically generate diverse multi-turn dialogues simulating human-chatbot interaction.
Our method can simulate human-chatbot dialogues with a high indistinguishability rate.
arXiv Detail & Related papers (2024-07-04T14:49:46Z) - Countering Malicious Content Moderation Evasion in Online Social Networks: Simulation and Detection of Word Camouflage
Twisting and camouflaging keywords are among the most used techniques to evade platform content moderation systems.
This article contributes significantly to countering malicious information by developing multilingual tools to simulate and detect new methods of evasion of content.
arXiv Detail & Related papers (2022-12-27T16:08:49Z)
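Keyword camouflage of the kind the last paper simulates can be as simple as leetspeak substitution, and detection starts with undoing it before a keyword filter runs. The sketch below is a toy version under that assumption; the substitution table and both functions are illustrative, not the paper's multilingual tools.

```python
# Toy keyword "camouflage" via leetspeak substitution, plus a matching
# normalizer to undo it before running a content filter. Illustrative only.
SUBS = {"a": "4", "e": "3", "i": "1", "o": "0", "s": "5"}
REVERSE = {v: k for k, v in SUBS.items()}

def camouflage(word: str) -> str:
    """Replace letters with visually similar digits to evade filters."""
    return "".join(SUBS.get(c, c) for c in word.lower())

def normalize(text: str) -> str:
    """Map the digits back to letters so keyword matching works again."""
    return "".join(REVERSE.get(c, c) for c in text.lower())
```

Real evasion also uses homoglyphs, spacing, and punctuation insertion, which is why the paper's detection side needs more than a fixed substitution table.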