Fugu-MT Paper Translation (Abstract): VulnLLM-R: Specialized Reasoning LLM with Agent Scaffold for Vulnerability Detection

Paper summary: VulnLLM-R: Specialized Reasoning LLM with Agent Scaffold for Vulnerability Detection

arxiv url: http://arxiv.org/abs/2512.07533v1
Date: Mon, 08 Dec 2025 13:06:23 GMT
Status: Translation complete
Last updated in system: 2025-12-09 22:03:54.898623
Title: VulnLLM-R: Specialized Reasoning LLM with Agent Scaffold for Vulnerability Detection
Title (reference translation): VulnLLM-R: A Specialized Reasoning LLM with an Agent Scaffold for Vulnerability Detection
Authors: Yuzhou Nie, Hongwei Li, Chengquan Guo, Ruizhe Jiang, Zhun Wang, Bo Li, Dawn Song, Wenbo Guo
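Abstract summary: VulnLLM-R is the first specialized reasoning LLM for vulnerability detection. We train a reasoning model with seven billion parameters. We show that VulnLLM-R is more effective and efficient than SOTA static analysis tools.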
Reference score (independently computed attention score): 45.69684471143409
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We propose VulnLLM-R, the first specialized reasoning LLM for vulnerability detection. Our key insight is that LLMs can reason about program states and analyze the potential vulnerabilities, rather than relying on simple pattern matching. This can improve the model's generalizability and prevent it from learning shortcuts. However, SOTA reasoning LLMs are typically ultra-large, closed-source, or have limited performance in vulnerability detection. To address this, we propose a novel training recipe with specialized data selection, reasoning data generation, reasoning data filtering and correction, and testing-phase optimization. Using our proposed methodology, we train a reasoning model with seven billion parameters. Through extensive experiments on SOTA datasets across Python, C/C++, and Java, we show that VulnLLM-R is more effective and efficient than SOTA static analysis tools and both open-source and commercial large reasoning models. We further conduct a detailed ablation study to validate the key designs in our training recipe. Finally, we construct an agent scaffold around our model and show that it outperforms CodeQL and AFL++ in real-world projects. Our agent further discovers a set of zero-day vulnerabilities in actively maintained repositories. This work represents a pioneering effort to enable real-world, project-level vulnerability detection using AI agents powered by specialized reasoning models. The code is available at https://github.com/ucsb-mlsec/VulnLLM-R.
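Abstract (reference translation): This paper proposes VulnLLM-R, the first specialized reasoning LLM for vulnerability detection. Our key insight is that LLMs can reason about program states and analyze potential vulnerabilities instead of relying on simple pattern matching. This improves the model's generalizability and prevents it from learning shortcuts. However, SOTA reasoning LLMs are typically ultra-large, closed-source, or limited in vulnerability detection performance. We therefore propose a new training recipe comprising specialized data selection, reasoning data generation, reasoning data filtering and correction, and testing-phase optimization. Using the proposed method, we train a reasoning model with seven billion parameters. Through extensive experiments on SOTA datasets spanning Python, C/C++, and Java, we show that VulnLLM-R is more effective and efficient than SOTA static analysis tools and both open-source and commercial large reasoning models. We also conduct a detailed ablation study to validate the key designs in the training recipe. Finally, we build an agent scaffold around the model and show that it outperforms CodeQL and AFL++ on real-world projects. Our agent additionally discovers a set of zero-day vulnerabilities in actively maintained repositories. This work is a pioneering effort to enable real-world, project-level vulnerability detection using AI agents powered by specialized reasoning models. The code is available at https://github.com/ucsb-mlsec/VulnLLM-R.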

Related papers