An Empirical Study of Vulnerable Package Dependencies in LLM Repositories
- URL: http://arxiv.org/abs/2508.21417v1
- Date: Fri, 29 Aug 2025 08:38:58 GMT
- Title: An Empirical Study of Vulnerable Package Dependencies in LLM Repositories
- Authors: Shuhan Liu, Xing Hu, Xin Xia, David Lo, Xiaohu Yang
- Abstract summary: Large language models (LLMs) rely on external code dependencies from package management systems. Vulnerabilities in dependencies can expose LLMs to security risks. Half of the vulnerabilities in the LLM ecosystem remain undisclosed for more than 56.2 months, significantly longer than those in the Python ecosystem.
- Score: 14.817045028745563
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) have developed rapidly in recent years, revolutionizing various fields. Despite their widespread success, LLMs heavily rely on external code dependencies from package management systems, creating a complex and interconnected LLM dependency supply chain. Vulnerabilities in dependencies can expose LLMs to security risks. While existing research predominantly focuses on model-level security threats, vulnerabilities within the LLM dependency supply chain have been overlooked. To fill this gap, we conducted an empirical analysis of 52 open-source LLMs, examining their third-party dependencies and associated vulnerabilities. We then explored activities within the LLM repositories to understand how maintainers manage third-party vulnerabilities in practice. Finally, we compared third-party dependency vulnerabilities in the LLM ecosystem to those in the Python ecosystem. Our results show that half of the vulnerabilities in the LLM ecosystem remain undisclosed for more than 56.2 months, significantly longer than those in the Python ecosystem. Additionally, 75.8% of LLMs include vulnerable dependencies in their configuration files. This study advances the understanding of LLM supply chain risks, provides insights for practitioners, and highlights potential directions for improving the security of the LLM supply chain.
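The study's core methodology, extracting third-party dependencies from repository configuration files and checking them against a vulnerability database, can be illustrated with a minimal sketch. This is not the authors' tooling; it assumes pinned `requirements.txt`-style entries and builds query payloads in the format accepted by the public OSV.dev `/v1/query` endpoint. The package names and versions below are illustrative, not drawn from the paper's dataset.

```python
import json
import re

# Match dependencies pinned with '==', e.g. "torch==2.0.1".
PIN_RE = re.compile(r"^\s*([A-Za-z0-9._-]+)\s*==\s*([A-Za-z0-9._-]+)")

def parse_pinned_requirements(text):
    """Return (package, version) pairs for lines pinned with '=='."""
    pins = []
    for line in text.splitlines():
        line = line.split("#", 1)[0]  # strip inline comments
        m = PIN_RE.match(line)
        if m:
            pins.append((m.group(1), m.group(2)))
    return pins

def build_osv_query(package, version):
    """Build a query payload for the OSV.dev /v1/query endpoint (PyPI ecosystem)."""
    return {
        "package": {"name": package, "ecosystem": "PyPI"},
        "version": version,
    }

# Illustrative configuration-file content (not from the paper's dataset).
requirements = """\
torch==2.0.1
transformers==4.30.0
numpy==1.24.0
"""

queries = [build_osv_query(name, ver)
           for name, ver in parse_pinned_requirements(requirements)]
print(json.dumps(queries[0]))
```

Each payload can then be POSTed to `https://api.osv.dev/v1/query`; a non-empty `vulns` list in the response indicates the pinned version has known advisories, the condition the paper reports for 75.8% of the LLM repositories it examined.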
Related papers
- Understanding the Supply Chain and Risks of Large Language Model Applications [25.571274158366563]
We introduce the first comprehensive dataset for analyzing and benchmarking Large Language Models (LLMs) supply chain security. We collect 3,859 real-world LLM applications and perform interdependency analysis, identifying 109,211 models, 2,474 datasets, and 9,862 libraries. Our findings reveal deeply nested dependencies in LLM applications and significant vulnerabilities across the supply chain, underscoring the need for comprehensive security analysis.
arXiv Detail & Related papers (2025-07-24T05:30:54Z)
- Look Before You Leap: Enhancing Attention and Vigilance Regarding Harmful Content with GuidelineLLM [53.79753074854936]
Large language models (LLMs) are increasingly vulnerable to emerging jailbreak attacks. This vulnerability poses significant risks to real-world applications. We propose a novel defensive paradigm called GuidelineLLM.
arXiv Detail & Related papers (2024-12-10T12:42:33Z)
- Large Language Model Supply Chain: Open Problems From the Security Perspective [25.320736806895976]
Large Language Model (LLM) is changing the software development paradigm and has gained huge attention from both academia and industry.
We take the first step to discuss the potential security risks in each component as well as the integration between components of LLM SC.
arXiv Detail & Related papers (2024-11-03T15:20:21Z)
- Lifting the Veil on Composition, Risks, and Mitigations of the Large Language Model Supply Chain [6.478930807409979]
Large language models (LLMs) have sparked significant impact with regard to both intelligence and productivity. We develop a structured taxonomy encompassing risk types, risky actions, and corresponding mitigations across different stakeholders.
arXiv Detail & Related papers (2024-10-28T17:02:12Z)
- Exploring Automatic Cryptographic API Misuse Detection in the Era of LLMs [60.32717556756674]
This paper introduces a systematic evaluation framework to assess Large Language Models in detecting cryptographic misuses.
Our in-depth analysis of 11,940 LLM-generated reports highlights that the inherent instabilities in LLMs can lead to over half of the reports being false positives.
The optimized approach achieves a remarkable detection rate of nearly 90%, surpassing traditional methods and uncovering previously unknown misuses in established benchmarks.
arXiv Detail & Related papers (2024-07-23T15:31:26Z)
- Prompt Leakage effect and defense strategies for multi-turn LLM interactions [95.33778028192593]
Leakage of system prompts may compromise intellectual property and act as adversarial reconnaissance for an attacker.
We design a unique threat model which leverages the LLM sycophancy effect and elevates the average attack success rate (ASR) from 17.7% to 86.2% in a multi-turn setting.
We measure the mitigation effect of 7 black-box defense strategies, along with finetuning an open-source model to defend against leakage attempts.
arXiv Detail & Related papers (2024-04-24T23:39:58Z)
- Uncovering Safety Risks of Large Language Models through Concept Activation Vector [13.804245297233454]
We introduce a Safety Concept Activation Vector (SCAV) framework to guide attacks on large language models (LLMs). We then develop an SCAV-guided attack method that can generate both attack prompts and embedding-level attacks. Our attack method significantly improves the attack success rate and response quality while requiring less training data.
arXiv Detail & Related papers (2024-04-18T09:46:25Z)
- A New Era in LLM Security: Exploring Security Concerns in Real-World LLM-based Systems [47.18371401090435]
We analyze the security of Large Language Model (LLM) systems, instead of focusing on the individual LLMs.
We propose a multi-layer and multi-step approach and apply it to the state-of-the-art OpenAI GPT-4.
We found that although OpenAI GPT-4 incorporates numerous safety constraints to improve its safety features, these safety constraints are still vulnerable to attackers.
arXiv Detail & Related papers (2024-02-28T19:00:12Z)
- Benchmarking LLMs via Uncertainty Quantification [91.72588235407379]
The proliferation of open-source Large Language Models (LLMs) has highlighted the urgent need for comprehensive evaluation methods.
We introduce a new benchmarking approach for LLMs that integrates uncertainty quantification.
Our findings reveal that: I) LLMs with higher accuracy may exhibit lower certainty; II) Larger-scale LLMs may display greater uncertainty compared to their smaller counterparts; and III) Instruction-finetuning tends to increase the uncertainty of LLMs.
arXiv Detail & Related papers (2024-01-23T14:29:17Z)
- Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems [29.828997665535336]
Large language models (LLMs) have strong capabilities in solving diverse natural language processing tasks.
However, the safety and security issues of LLM systems have become the major obstacle to their widespread application.
This paper proposes a comprehensive taxonomy, which systematically analyzes potential risks associated with each module of an LLM system.
arXiv Detail & Related papers (2024-01-11T09:29:56Z)
- Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs [59.596335292426105]
This paper collects the first open-source dataset to evaluate safeguards in large language models.
We train several BERT-like classifiers to achieve results comparable with GPT-4 on automatic safety evaluation.
arXiv Detail & Related papers (2023-08-25T14:02:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.