Related papers: Understanding Security Risks of AI Agents' Dependency Updates

Understanding Security Risks of AI Agents' Dependency Updates

URL: http://arxiv.org/abs/2601.00205v1
Date: Thu, 01 Jan 2026 04:44:18 GMT
Title: Understanding Security Risks of AI Agents' Dependency Updates
Authors: Tanmay Singla, Berk Çakar, Paschal C. Amusuo, James C. Davis,
Abstract summary: Dependency changes can substantially alter a project's security posture.<n>We study 117,062 dependency changes from agent- and human-authored pull requests across seven ecosystems.<n>Agent-driven dependency work yields a net vulnerability increase of 98, whereas human-authored work yields a net reduction of 1,316.
Score: 8.978334182646465
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Package dependencies are a critical control point in modern software supply chains. Dependency changes can substantially alter a project's security posture. As AI coding agents increasingly modify software via pull requests, it is unclear whether their dependency decisions introduce distinct security risks. We study 117,062 dependency changes from agent- and human-authored pull requests across seven ecosystems. Agents select known-vulnerable versions more often than humans (2.46% vs. 1.64%), and their vulnerable selections are more disruptive to remediate, with 36.8% requiring major-version upgrades compared to 12.9% for humans, despite patched alternatives existing in most cases. At the aggregate level, agent-driven dependency work yields a net vulnerability increase of 98, whereas human-authored work yields a net reduction of 1,316. These findings motivate pull-request-time vulnerability screening and registry-aware guardrails to make agent-driven dependency updates safer.

Related papers

OMNI-LEAK: Orchestrator Multi-Agent Network Induced Data Leakage [59.3826294523924]
We investigate the security vulnerabilities of a popular multi-agent pattern known as the orchestrator setup.<n>We report the susceptibility of frontier models to different categories of attacks, finding that both reasoning and non-reasoning models are vulnerable.
arXiv Detail & Related papers (2026-02-13T21:32:32Z)
Security in the Age of AI Teammates: An Empirical Study of Agentic Pull Requests on GitHub [4.409447722044799]
This study aims to characterize how autonomous coding agents contribute to software security in practice.<n>We conduct a large-scale empirical analysis of agent-authored PRs using the AIDev dataset.<n>We then analyze prevalence, acceptance outcomes, and review latency across autonomous agents, programming ecosystems, and types of code changes.
arXiv Detail & Related papers (2026-01-01T21:14:11Z)
Breaking Agent Backbones: Evaluating the Security of Backbone LLMs in AI Agents [36.2255033141489]
AI agents powered by large language models (LLMs) are being deployed at scale, yet we lack a systematic understanding of how the choice of backbone LLM affects agent security.<n>We introduce threat snapshots: a framework that isolates specific states in an agent's execution flow where vulnerabilities manifest.<n>We apply this framework to construct the $operatornameb3$ benchmark, a security benchmark based on 194331 unique crowdsourced adversarial attacks.
arXiv Detail & Related papers (2025-10-26T10:36:42Z)
AdvEvo-MARL: Shaping Internalized Safety through Adversarial Co-Evolution in Multi-Agent Reinforcement Learning [78.5751183537704]
AdvEvo-MARL is a co-evolutionary multi-agent reinforcement learning framework that internalizes safety into task agents.<n>Rather than relying on external guards, AdvEvo-MARL jointly optimize attackers and defenders.
arXiv Detail & Related papers (2025-10-02T02:06:30Z)
Security Challenges in AI Agent Deployment: Insights from a Large Scale Public Competition [101.86739402748995]
We run the largest public red-teaming competition to date, targeting 22 frontier AI agents across 44 realistic deployment scenarios.<n>We build the Agent Red Teaming benchmark and evaluate it across 19 state-of-the-art models.<n>Our findings highlight critical and persistent vulnerabilities in today's AI agents.
arXiv Detail & Related papers (2025-07-28T05:13:04Z)
WebGuard: Building a Generalizable Guardrail for Web Agents [59.31116061613742]
WebGuard is the first dataset designed to support the assessment of web agent action risks.<n>It contains 4,939 human-annotated actions from 193 websites across 22 diverse domains.
arXiv Detail & Related papers (2025-07-18T18:06:27Z)
SafeMobile: Chain-level Jailbreak Detection and Automated Evaluation for Multimodal Mobile Agents [58.21223208538351]
This work explores the security issues surrounding mobile multimodal agents.<n>It attempts to construct a risk discrimination mechanism by incorporating behavioral sequence information.<n>It also designs an automated assisted assessment scheme based on a large language model.
arXiv Detail & Related papers (2025-07-01T15:10:00Z)
SafeAgent: Safeguarding LLM Agents via an Automated Risk Simulator [77.86600052899156]
Large Language Model (LLM)-based agents are increasingly deployed in real-world applications.<n>We propose AutoSafe, the first framework that systematically enhances agent safety through fully automated synthetic data generation.<n>We show that AutoSafe boosts safety scores by 45% on average and achieves a 28.91% improvement on real-world tasks.
arXiv Detail & Related papers (2025-05-23T10:56:06Z)
Information Retrieval Induced Safety Degradation in AI Agents [52.15553901577888]
This study investigates how expanding retrieval access affects model reliability, bias propagation, and harmful content generation.<n>Retrieval-enabled agents built on aligned LLMs often behave more unsafely than uncensored models without retrieval.<n>These findings underscore the need for robust mitigation strategies to ensure fairness and reliability in retrieval-enabled and increasingly autonomous AI systems.
arXiv Detail & Related papers (2025-05-20T11:21:40Z)
AgentVigil: Generic Black-Box Red-teaming for Indirect Prompt Injection against LLM Agents [54.29555239363013]
We propose a generic black-box fuzzing framework, AgentVigil, to automatically discover and exploit indirect prompt injection vulnerabilities.<n>We evaluate AgentVigil on two public benchmarks, AgentDojo and VWA-adv, where it achieves 71% and 70% success rates against agents based on o3-mini and GPT-4o.<n>We apply our attacks in real-world environments, successfully misleading agents to navigate to arbitrary URLs, including malicious sites.
arXiv Detail & Related papers (2025-05-09T07:40:17Z)
The Ripple Effect of Vulnerabilities in Maven Central: Prevalence, Propagation, and Mitigation Challenges [8.955037553566774]
We analyze the prevalence and impact of vulnerabilities within the Maven Central ecosystem using Common Vulnerabilities and Exposures data.<n>In our subsample of around 4 million releases, we found that while only about 1% of releases have direct vulnerabilities.<n>We also observed that the time taken to patch vulnerabilities, including those of high or critical severity, often spans several years.
arXiv Detail & Related papers (2025-04-05T13:45:27Z)
Pinning Is Futile: You Need More Than Local Dependency Versioning to Defend against Supply Chain Attacks [23.756533975349985]
Recent high-profile incidents in open-source software have raised practitioner attention on software supply chain attacks.<n>Security practitioners advocate pinning dependency to specific versions rather than floating in version ranges.<n>We quantify, through counterfactual analysis and simulations, the security and maintenance impact of version constraints in the npm ecosystem.
arXiv Detail & Related papers (2025-02-10T16:50:48Z)
Dependency Practices for Vulnerability Mitigation [4.710141711181836]
We analyze more than 450 vulnerabilities in the npm ecosystem to understand why dependent packages remain vulnerable. We identify over 200,000 npm packages that are infected through their dependencies. We use 9 features to build a prediction model that identifies packages that quickly adopt the vulnerability fix and prevent further propagation of vulnerabilities.
arXiv Detail & Related papers (2023-10-11T19:48:46Z)

This list is automatically generated from the titles and abstracts of the papers in this site.