Related papers: Design and Implementation of a Secure RAG-Enhanced AI Chatbot for Smart Tourism Customer Service: Defending Against Prompt Injection Attacks -- A Case Study of Hsinchu, Taiwan

Design and Implementation of a Secure RAG-Enhanced AI Chatbot for Smart Tourism Customer Service: Defending Against Prompt Injection Attacks -- A Case Study of Hsinchu, Taiwan

URL: http://arxiv.org/abs/2509.21367v1
Date: Mon, 22 Sep 2025 11:40:29 GMT
Title: Design and Implementation of a Secure RAG-Enhanced AI Chatbot for Smart Tourism Customer Service: Defending Against Prompt Injection Attacks -- A Case Study of Hsinchu, Taiwan
Authors: Yu-Kai Shih, You-Kai Kang,
Abstract summary: This paper presents the design and implementation of a secure retrieval-augmented generation (RAG) chatbots for Hsinchu smart tourism services.<n>The system integrates RAG with API function calls, multi-layered linguistic analysis, and guardrails against injections.<n> Evaluations with 674 adversarial prompts and 223 benign queries show over 95% accuracy on benign tasks and substantial detection of injection attacks.
Score: 0.0
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: As smart tourism evolves, AI-powered chatbots have become indispensable for delivering personalized, real-time assistance to travelers while promoting sustainability and efficiency. However, these systems are increasingly vulnerable to prompt injection attacks, where adversaries manipulate inputs to elicit unintended behaviors such as leaking sensitive information or generating harmful content. This paper presents a case study on the design and implementation of a secure retrieval-augmented generation (RAG) chatbot for Hsinchu smart tourism services. The system integrates RAG with API function calls, multi-layered linguistic analysis, and guardrails against injections, achieving high contextual awareness and security. Key features include a tiered response strategy, RAG-driven knowledge grounding, and intent decomposition across lexical, semantic, and pragmatic levels. Defense mechanisms include system norms, gatekeepers for intent judgment, and reverse RAG text to prioritize verified data. We also benchmark a GPT-5 variant (released 2025-08-07) to assess inherent robustness. Evaluations with 674 adversarial prompts and 223 benign queries show over 95% accuracy on benign tasks and substantial detection of injection attacks. GPT-5 blocked about 85% of attacks, showing progress yet highlighting the need for layered defenses. Findings emphasize contributions to sustainable tourism, multilingual accessibility, and ethical AI deployment. This work offers a practical framework for deploying secure chatbots in smart tourism and contributes to resilient, trustworthy AI applications.

Related papers

ReasAlign: Reasoning Enhanced Safety Alignment against Prompt Injection Attack [52.17935054046577]
We present ReasAlign, a model-level solution to improve safety alignment against indirect prompt injection attacks.<n>ReasAlign incorporates structured reasoning steps to analyze user queries, detect conflicting instructions, and preserve the continuity of the user's intended tasks.
arXiv Detail & Related papers (2026-01-15T08:23:38Z)
It's a TRAP! Task-Redirecting Agent Persuasion Benchmark for Web Agents [52.81924177620322]
Web-based agents powered by large language models are increasingly used for tasks such as email management or professional networking.<n>Their reliance on dynamic web content makes them vulnerable to prompt injection attacks: adversarial instructions hidden in interface elements that persuade the agent to divert from its original task.<n>We introduce the Task-Redirecting Agent Persuasion Benchmark (TRAP), an evaluation for studying how persuasion techniques misguide autonomous web agents on realistic tasks.
arXiv Detail & Related papers (2025-12-29T01:09:10Z)
Exploring the Security Threats of Retriever Backdoors in Retrieval-Augmented Code Generation [17.62321354201344]
Retrieval-Augmented Code Generation (RACG) is increasingly adopted to enhance Large Language Models for software development.<n>This paper conducts the first systematic exploration of a critical and stealthy threat: backdoor attacks targeting the retriever component.
arXiv Detail & Related papers (2025-12-25T13:53:46Z)
Securing AI Agents Against Prompt Injection Attacks [0.0]
We present a benchmark for evaluating prompt injection risks in RAG-enabled AI agents.<n>Our framework reduces successful attack rates from 73.2% to 8.7% while maintaining 94.3% of baseline task performance.
arXiv Detail & Related papers (2025-11-19T10:00:54Z)
CHAI: Command Hijacking against embodied AI [41.9483699766073]
CHAI (Command Hijacking against embodied AI) is a new class of prompt-based attacks that exploit the multimodal language interpretation abilities of Large Visual-Language Models (LVLMs)<n> CHAI embeds deceptive natural language instructions, such as misleading signs, in visual input, systematically searches the token space, builds a dictionary of prompts, and guides an attacker model to generate Visual Attack Prompts.<n>We evaluate CHAI on four LVLM agents; drone emergency landing, autonomous driving, and aerial object tracking, and on a real robotic vehicle.
arXiv Detail & Related papers (2025-09-30T19:02:57Z)
Passive Hack-Back Strategies for Cyber Attribution: Covert Vectors in Denied Environment [0.2538209532048867]
This paper examines the strategic value of passive hack-back techniques that enable covert attribution and intelligence collection without initiating direct offensive actions.<n>Key vectors include tracking beacons, honeytokens, environment-specific payloads, and supply-chain-based traps embedded within exfiltrated or leaked assets.<n>The paper also explores the role of Artificial Intelligence (AI) in enhancing passive hack-back operations.
arXiv Detail & Related papers (2025-08-17T16:43:23Z)
AutoAdv: Automated Adversarial Prompting for Multi-Turn Jailbreaking of Large Language Models [0.0]
Large Language Models (LLMs) continue to exhibit vulnerabilities to jailbreaking attacks.<n>We present AutoAdv, a novel framework that automates adversarial prompt generation.<n>We show that our attacks achieve jailbreak success rates of up to 86% for harmful content generation.
arXiv Detail & Related papers (2025-04-18T08:38:56Z)
Temporal Context Awareness: A Defense Framework Against Multi-turn Manipulation Attacks on Large Language Models [0.0]
Large Language Models (LLMs) are increasingly vulnerable to sophisticated multi-turn manipulation attacks.<n>This paper introduces the Temporal Context Awareness framework, a novel defense mechanism designed to address this challenge.<n>Preliminary evaluations on simulated adversarial scenarios demonstrate the framework's potential to identify subtle manipulation patterns.
arXiv Detail & Related papers (2025-03-18T22:30:17Z)
Turning Logic Against Itself : Probing Model Defenses Through Contrastive Questions [50.40122190627256]
We introduce POATE, a novel jailbreak technique that harnesses contrastive reasoning to provoke unethical responses.<n>PoATE crafts semantically opposing intents and integrates them with adversarial templates, steering models toward harmful outputs with remarkable subtlety.<n>To counter this, we propose Intent-Aware CoT and Reverse Thinking CoT, which decompose queries to detect malicious intent and reason in reverse to evaluate and reject harmful responses.
arXiv Detail & Related papers (2025-01-03T15:40:03Z)
Poison Attacks and Adversarial Prompts Against an Informed University Virtual Assistant [3.0874677990361246]
Large language models (LLMs) are particularly vulnerable to adversarial attacks.<n>The rapid development pace of AI-based systems is being driven by the potential of Generative AI (GenAI) to assist humans in decision making.<n>A threat actor can use security gaps, poor safeguards, and limited data governance to carry out attacks that grant unauthorized access to the system and its data.
arXiv Detail & Related papers (2024-11-03T05:34:38Z)
Artificial Intelligence as the New Hacker: Developing Agents for Offensive Security [0.0]
This paper explores the integration of Artificial Intelligence (AI) into offensive cybersecurity. It develops an autonomous AI agent, ReaperAI, designed to simulate and execute cyberattacks. ReaperAI demonstrates the potential to identify, exploit, and analyze security vulnerabilities autonomously.
arXiv Detail & Related papers (2024-05-09T18:15:12Z)
On the Security Risks of Knowledge Graph Reasoning [71.64027889145261]
We systematize the security threats to KGR according to the adversary's objectives, knowledge, and attack vectors. We present ROAR, a new class of attacks that instantiate a variety of such threats. We explore potential countermeasures against ROAR, including filtering of potentially poisoning knowledge and training with adversarially augmented queries.
arXiv Detail & Related papers (2023-05-03T18:47:42Z)
Can AI-Generated Text be Reliably Detected? [50.95804851595018]
Large Language Models (LLMs) perform impressively well in various applications.<n>The potential for misuse of these models in activities such as plagiarism, generating fake news, and spamming has raised concern about their responsible use.<n>We stress-test the robustness of these AI text detectors in the presence of an attacker.
arXiv Detail & Related papers (2023-03-17T17:53:19Z)
Illusory Attacks: Information-Theoretic Detectability Matters in Adversarial Attacks [76.35478518372692]
We introduce epsilon-illusory, a novel form of adversarial attack on sequential decision-makers. Compared to existing attacks, we empirically find epsilon-illusory to be significantly harder to detect with automated methods. Our findings suggest the need for better anomaly detectors, as well as effective hardware- and system-level defenses.
arXiv Detail & Related papers (2022-07-20T19:49:09Z)
Adversarial vs behavioural-based defensive AI with joint, continual and active learning: automated evaluation of robustness to deception, poisoning and concept drift [62.997667081978825]
Recent advancements in Artificial Intelligence (AI) have brought new capabilities to behavioural analysis (UEBA) for cyber-security. In this paper, we present a solution to effectively mitigate this attack by improving the detection process and efficiently leveraging human expertise.
arXiv Detail & Related papers (2020-01-13T13:54:36Z)

This list is automatically generated from the titles and abstracts of the papers in this site.