Related papers: LProtector: An LLM-driven Vulnerability Detection System

Related papers

RealSec-bench: A Benchmark for Evaluating Secure Code Generation in Real-World Repositories [58.32028251925354]
Large Language Models (LLMs) have demonstrated remarkable capabilities in code generation, but their proficiency in producing secure code remains a critical, under-explored area.<n>We introduce RealSec-bench, a new benchmark for secure code generation meticulously constructed from real-world, high-risk Java repositories.
arXiv Detail & Related papers (2026-01-30T08:29:01Z)
Ensembling Large Language Models for Code Vulnerability Detection: An Empirical Evaluation [69.8237598448941]
This study investigates the potential of ensemble learning to enhance the performance of Large Language Models (LLMs) in source code vulnerability detection.<n>We propose Dynamic Gated Stacking (DGS), a Stacking variant tailored for vulnerability detection.
arXiv Detail & Related papers (2025-09-16T03:48:22Z)
Secure Tug-of-War (SecTOW): Iterative Defense-Attack Training with Reinforcement Learning for Multimodal Model Security [63.41350337821108]
We propose Secure Tug-of-War (SecTOW) to enhance the security of multimodal large language models (MLLMs)<n>SecTOW consists of two modules: a defender and an auxiliary attacker, both trained iteratively using reinforcement learning (GRPO)<n>We show that SecTOW significantly improves security while preserving general performance.
arXiv Detail & Related papers (2025-07-29T17:39:48Z)
Guiding AI to Fix Its Own Flaws: An Empirical Study on LLM-Driven Secure Code Generation [16.29310628754089]
Large Language Models (LLMs) have become powerful tools for automated code generation.<n>LLMs often overlook critical security practices, which can result in the generation of insecure code.<n>This paper examines their inherent tendencies to produce insecure code, their capability to generate secure code when guided by self-generated vulnerability hints, and their effectiveness in repairing vulnerabilities when provided with different levels of feedback.
arXiv Detail & Related papers (2025-06-28T23:24:33Z)
Training Language Models to Generate Quality Code with Program Analysis Feedback [66.0854002147103]
Code generation with large language models (LLMs) is increasingly adopted in production but fails to ensure code quality.<n>We propose REAL, a reinforcement learning framework that incentivizes LLMs to generate production-quality code.
arXiv Detail & Related papers (2025-05-28T17:57:47Z)
VulBinLLM: LLM-powered Vulnerability Detection for Stripped Binaries [4.1417640577742425]
Vul-BinLLM is a framework for binary vulnerability detection using Large Language Models.<n>Vul-BinLLM mirrors traditional binary analysis with fine-grained optimizations in decompilation and vulnerability reasoning with an extended context.<n>Our evaluations show that Vul-BinLLM is highly effective in detecting vulnerabilities on the compiled Juliet dataset.
arXiv Detail & Related papers (2025-05-28T06:17:56Z)
Output Constraints as Attack Surface: Exploiting Structured Generation to Bypass LLM Safety Mechanisms [0.9091225937132784]
We reveal a critical control-plane attack surface to traditional data-plane vulnerabilities. We introduce Constrained Decoding Attack, a novel jailbreak class that weaponizes structured output constraints to bypass safety mechanisms. Our findings identify a critical security blind spot in current LLM architectures and urge a paradigm shift in LLM safety to address control-plane vulnerabilities.
arXiv Detail & Related papers (2025-03-31T15:08:06Z)
Global Challenge for Safe and Secure LLMs Track 1 [57.08717321907755]
The Global Challenge for Safe and Secure Large Language Models (LLMs) is a pioneering initiative organized by AI Singapore (AISG) and the CyberSG R&D Programme Office (CRPO) This paper introduces the Global Challenge for Safe and Secure Large Language Models (LLMs), a pioneering initiative organized by AI Singapore (AISG) and the CyberSG R&D Programme Office (CRPO) to foster the development of advanced defense mechanisms against automated jailbreaking attacks.
arXiv Detail & Related papers (2024-11-21T08:20:31Z)
SeCodePLT: A Unified Platform for Evaluating the Security of Code GenAI [58.29510889419971]
Existing benchmarks for evaluating the security risks and capabilities of code-generating large language models (LLMs) face several key limitations.<n>We introduce a general and scalable benchmark construction framework that begins with manually validated, high-quality seed examples and expands them via targeted mutations.<n>Applying this framework to Python, C/C++, and Java, we build SeCodePLT, a dataset of more than 5.9k samples spanning 44 CWE-based risk categories and three security capabilities.
arXiv Detail & Related papers (2024-10-14T21:17:22Z)
HexaCoder: Secure Code Generation via Oracle-Guided Synthetic Training Data [60.75578581719921]
Large language models (LLMs) have shown great potential for automatic code generation. Recent studies highlight that many LLM-generated code contains serious security vulnerabilities. We introduce HexaCoder, a novel approach to enhance the ability of LLMs to generate secure codes.
arXiv Detail & Related papers (2024-09-10T12:01:43Z)
Vul-RAG: Enhancing LLM-based Vulnerability Detection via Knowledge-level RAG [14.11780946647832]
Vul-RAG is a novel vulnerability detection technique based on knowledge-level retrieval-augmented generation (RAG) Vul-RAG constructs a vulnerability knowledge base by extracting multi-dimension knowledge via LLMs from existing CVE instances. Vul-RAG retrieves the relevant vulnerability knowledge from the constructed knowledge base based on functional semantics. Our evaluation of Vul-RAG on our constructed benchmark PairVul shows that Vul-RAG substantially outperforms all baselines by 12.96%/110% relative improvement in accuracy/pairwise-accuracy.
arXiv Detail & Related papers (2024-06-17T02:25:45Z)
M2CVD: Enhancing Vulnerability Semantic through Multi-Model Collaboration for Code Vulnerability Detection [52.4455893010468]
Large Language Models (LLMs) have strong capabilities in code comprehension, but fine-tuning costs and semantic alignment issues limit their project-specific optimization. Code models such CodeBERT are easy to fine-tune, but it is often difficult to learn vulnerability semantics from complex code languages. This paper introduces the Multi-Model Collaborative Vulnerability Detection approach (M2CVD) to improve the detection accuracy of code models.
arXiv Detail & Related papers (2024-06-10T00:05:49Z)
Security Vulnerability Detection with Multitask Self-Instructed Fine-Tuning of Large Language Models [8.167614500821223]
We introduce MSIVD, multitask self-instructed fine-tuning for vulnerability detection, inspired by chain-of-thought prompting and LLM self-instruction. Our experiments demonstrate that MSIVD achieves superior performance, outperforming the highest LLM-based vulnerability detector baseline (LineVul) with a F1 score of 0.92 on the BigVul dataset, and 0.48 on the PreciseBugs dataset.
arXiv Detail & Related papers (2024-06-09T19:18:05Z)
Generalization-Enhanced Code Vulnerability Detection via Multi-Task Instruction Fine-Tuning [16.54022485688803]
VulLLM is a novel framework that integrates multi-task learning with Large Language Models (LLMs) to effectively mine deep-seated vulnerability features. The experiments conducted on six large datasets demonstrate that VulLLM surpasses seven state-of-the-art models in terms of effectiveness, generalization, and robustness.
arXiv Detail & Related papers (2024-06-06T03:29:05Z)
CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion [117.178835165855]
This paper introduces CodeAttack, a framework that transforms natural language inputs into code inputs. Our studies reveal a new and universal safety vulnerability of these models against code input. We find that a larger distribution gap between CodeAttack and natural language leads to weaker safety generalization.
arXiv Detail & Related papers (2024-03-12T17:55:38Z)
The Vulnerability Is in the Details: Locating Fine-grained Information of Vulnerable Code Identified by Graph-based Detectors [33.395068754566935]
VULEXPLAINER is a tool for locating vulnerability-critical code lines from coarse-level vulnerable code snippets. It can flag the vulnerability-triggering code statements with an accuracy of around 90% against eight common C/C++ vulnerabilities.
arXiv Detail & Related papers (2024-01-05T10:15:04Z)
How Far Have We Gone in Vulnerability Detection Using Large Language Models [15.09461331135668]
We introduce a comprehensive vulnerability benchmark VulBench. This benchmark aggregates high-quality data from a wide range of CTF challenges and real-world applications. We find that several LLMs outperform traditional deep learning approaches in vulnerability detection.
arXiv Detail & Related papers (2023-11-21T08:20:39Z)
Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs [59.596335292426105]
This paper collects the first open-source dataset to evaluate safeguards in large language models. We train several BERT-like classifiers to achieve results comparable with GPT-4 on automatic safety evaluation.
arXiv Detail & Related papers (2023-08-25T14:02:12Z)
Red Teaming Language Model Detectors with Language Models [114.36392560711022]
Large language models (LLMs) present significant safety and ethical risks if exploited by malicious users. Recent works have proposed algorithms to detect LLM-generated text and protect LLMs. We study two types of attack strategies: 1) replacing certain words in an LLM's output with their synonyms given the context; 2) automatically searching for an instructional prompt to alter the writing style of the generation.
arXiv Detail & Related papers (2023-05-31T10:08:37Z)

This list is automatically generated from the titles and abstracts of the papers in this site.