The Rising Threat to Emerging AI-Powered Search Engines
- URL: http://arxiv.org/abs/2502.04951v1
- Date: Fri, 07 Feb 2025 14:15:46 GMT
- Title: The Rising Threat to Emerging AI-Powered Search Engines
- Authors: Zeren Luo, Zifan Peng, Yule Liu, Zhen Sun, Mingchen Li, Jingyi Zheng, Xinlei He,
- Abstract summary: We conduct the first safety risk quantification on seven production AIPSEs.
Our findings reveal that AIPSEs frequently generate harmful content that contains malicious URLs.
We develop an agent-based defense with a GPT-4o-based content refinement tool and an XGBoost-based URL detector.
- Score: 20.796363884152466
- License:
- Abstract: Recent advancements in Large Language Models (LLMs) have significantly enhanced the capabilities of AI-Powered Search Engines (AIPSEs), offering precise and efficient responses by integrating external databases with pre-existing knowledge. However, we observe that these AIPSEs raise risks such as quoting malicious content or citing malicious websites, leading to harmful or unverified information dissemination. In this study, we conduct the first safety risk quantification on seven production AIPSEs by systematically defining the threat model, risk level, and evaluating responses to various query types. With data collected from PhishTank, ThreatBook, and LevelBlue, our findings reveal that AIPSEs frequently generate harmful content that contains malicious URLs even with benign queries (e.g., with benign keywords). We also observe that directly query URL will increase the risk level while query with natural language will mitigate such risk. We further perform two case studies on online document spoofing and phishing to show the ease of deceiving AIPSEs in the real-world setting. To mitigate these risks, we develop an agent-based defense with a GPT-4o-based content refinement tool and an XGBoost-based URL detector. Our evaluation shows that our defense can effectively reduce the risk but with the cost of reducing available information. Our research highlights the urgent need for robust safety measures in AIPSEs.
Related papers
- Risks and NLP Design: A Case Study on Procedural Document QA [52.557503571760215]
We argue that clearer assessments of risks and harms to users will be possible when we specialize the analysis to more concrete applications and their plausible users.
We conduct a risk-oriented error analysis that could then inform the design of a future system to be deployed with lower risk of harm and better performance.
arXiv Detail & Related papers (2024-08-16T17:23:43Z) - EARBench: Towards Evaluating Physical Risk Awareness for Task Planning of Foundation Model-based Embodied AI Agents [53.717918131568936]
Embodied artificial intelligence (EAI) integrates advanced AI models into physical entities for real-world interaction.
Foundation models as the "brain" of EAI agents for high-level task planning have shown promising results.
However, the deployment of these agents in physical environments presents significant safety challenges.
This study introduces EARBench, a novel framework for automated physical risk assessment in EAI scenarios.
arXiv Detail & Related papers (2024-08-08T13:19:37Z) - "Glue pizza and eat rocks" -- Exploiting Vulnerabilities in Retrieval-Augmented Generative Models [74.05368440735468]
Retrieval-Augmented Generative (RAG) models enhance Large Language Models (LLMs)
In this paper, we demonstrate a security threat where adversaries can exploit the openness of these knowledge bases.
arXiv Detail & Related papers (2024-06-26T05:36:23Z) - Threat Modelling and Risk Analysis for Large Language Model (LLM)-Powered Applications [0.0]
Large Language Models (LLMs) have revolutionized various applications by providing advanced natural language processing capabilities.
This paper explores the threat modeling and risk analysis specifically tailored for LLM-powered applications.
arXiv Detail & Related papers (2024-06-16T16:43:58Z) - Unveiling the Misuse Potential of Base Large Language Models via In-Context Learning [61.2224355547598]
Open-sourcing of large language models (LLMs) accelerates application development, innovation, and scientific progress.
Our investigation exposes a critical oversight in this belief.
By deploying carefully designed demonstrations, our research demonstrates that base LLMs could effectively interpret and execute malicious instructions.
arXiv Detail & Related papers (2024-04-16T13:22:54Z) - GUARD-D-LLM: An LLM-Based Risk Assessment Engine for the Downstream uses of LLMs [0.0]
This paper explores risks emanating from downstream uses of large language models (LLMs)
We introduce a novel LLM-based risk assessment engine (GUARD-D-LLM) designed to pinpoint and rank threats relevant to specific use cases derived from text-based user inputs.
Integrating thirty intelligent agents, this innovative approach identifies bespoke risks, gauges their severity, offers targeted suggestions for mitigation, and facilitates risk-aware development.
arXiv Detail & Related papers (2024-04-02T05:25:17Z) - Risk and Response in Large Language Models: Evaluating Key Threat Categories [6.436286493151731]
This paper explores the pressing issue of risk assessment in Large Language Models (LLMs)
By utilizing the Anthropic Red-team dataset, we analyze major risk categories, including Information Hazards, Malicious Uses, and Discrimination/Hateful content.
Our findings indicate that LLMs tend to consider Information Hazards less harmful, a finding confirmed by a specially developed regression model.
arXiv Detail & Related papers (2024-03-22T06:46:40Z) - Mapping LLM Security Landscapes: A Comprehensive Stakeholder Risk Assessment Proposal [0.0]
We propose a risk assessment process using tools like the risk rating methodology which is used for traditional systems.
We conduct scenario analysis to identify potential threat agents and map the dependent system components against vulnerability factors.
We also map threats against three key stakeholder groups.
arXiv Detail & Related papers (2024-03-20T05:17:22Z) - Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models [79.0183835295533]
We introduce the first benchmark for indirect prompt injection attacks, named BIPIA, to assess the risk of such vulnerabilities.
Our analysis identifies two key factors contributing to their success: LLMs' inability to distinguish between informational context and actionable instructions, and their lack of awareness in avoiding the execution of instructions within external content.
We propose two novel defense mechanisms-boundary awareness and explicit reminder-to address these vulnerabilities in both black-box and white-box settings.
arXiv Detail & Related papers (2023-12-21T01:08:39Z) - A Security Risk Taxonomy for Prompt-Based Interaction With Large Language Models [5.077431021127288]
This paper addresses a gap in current research by focusing on security risks posed by large language models (LLMs)
Our work proposes a taxonomy of security risks along the user-model communication pipeline and categorizes the attacks by target and attack type alongside the commonly used confidentiality, integrity, and availability (CIA) triad.
arXiv Detail & Related papers (2023-11-19T20:22:05Z) - On the Security Risks of Knowledge Graph Reasoning [71.64027889145261]
We systematize the security threats to KGR according to the adversary's objectives, knowledge, and attack vectors.
We present ROAR, a new class of attacks that instantiate a variety of such threats.
We explore potential countermeasures against ROAR, including filtering of potentially poisoning knowledge and training with adversarially augmented queries.
arXiv Detail & Related papers (2023-05-03T18:47:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.