Fugu-MT Paper Translation (Abstract): Post-Training Local LLM Agents for Linux Privilege Escalation with Verifiable Rewards

Paper overview: Post-Training Local LLM Agents for Linux Privilege Escalation with Verifiable Rewards

arxiv url: http://arxiv.org/abs/2603.17673v1
Date: Wed, 18 Mar 2026 12:52:54 GMT
Status: Translation complete
Last updated in system: 2026-03-19 18:32:57.696398
Title: Post-Training Local LLM Agents for Linux Privilege Escalation with Verifiable Rewards
Authors: Philipp Normann, Andreas Happe, Jürgen Cito, Daniel Arp,
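Abstract summary: LLM agents are increasingly relevant to research domains such as vulnerability discovery. However, the strongest systems remain closed and cloud-only, making them resource-intensive, difficult to reproduce, and unsuitable for work involving proprietary code or sensitive data. This paper proposes a two-stage post-training pipeline for developing small, local models that can perform security tasks under strict resource budgets.
Reference score (independently computed attention score): 2.631069233394708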
License: http://creativecommons.org/licenses/by/4.0/
Abstract: LLM agents are increasingly relevant to research domains such as vulnerability discovery. Yet, the strongest systems remain closed and cloud-only, making them resource-intensive, difficult to reproduce, and unsuitable for work involving proprietary code or sensitive data. Consequently, there is an urgent need for small, local models that can perform security tasks under strict resource budgets, but methods for developing them remain underexplored. In this paper, we address this gap by proposing a two-stage post-training pipeline. We focus on the problem of Linux privilege escalation, where success is automatically verifiable and the task requires multi-step interactive reasoning. Using an experimental setup that prevents data leakage, we post-train a 4B model in two stages: supervised fine-tuning on traces from procedurally generated privilege-escalation environments, followed by reinforcement learning with verifiable rewards. On a held-out benchmark of 12 Linux privilege-escalation scenarios, supervised fine-tuning alone more than doubles the baseline success rate at 20 rounds, and reinforcement learning further lifts our resulting model, PrivEsc-LLM, to 95.8%, nearly matching Claude Opus 4.6 at 97.5%. At the same time, the expected inference cost per successful escalation is reduced by over 100x.
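Illustrative note: the abstract reports an "expected inference cost per successful escalation" but this page does not define it. A minimal sketch, assuming the common model in which each attempt costs a fixed amount and succeeds independently with probability p, so that the expected cost per success is cost / p. The function name and the unit cost are hypothetical; only the two success rates (95.8% and 97.5%) come from the abstract.

```python
def expected_cost_per_success(cost_per_attempt: float, success_rate: float) -> float:
    """Expected spend until the first success, assuming independent attempts.

    With success probability p per attempt, the expected number of attempts
    is 1 / p (geometric distribution), so the expected cost is cost / p.
    This is an assumed model, not necessarily the paper's exact definition.
    """
    if not 0.0 < success_rate <= 1.0:
        raise ValueError("success_rate must be in (0, 1]")
    if cost_per_attempt < 0.0:
        raise ValueError("cost_per_attempt must be non-negative")
    return cost_per_attempt / success_rate


# Success rates are taken from the abstract; the per-attempt cost of 1.0 is a
# hypothetical unit used only to isolate the effect of the success-rate gap.
local_model = expected_cost_per_success(cost_per_attempt=1.0, success_rate=0.958)
cloud_model = expected_cost_per_success(cost_per_attempt=1.0, success_rate=0.975)

# Relative penalty the lower success rate adds at equal per-attempt cost.
success_rate_penalty = local_model / cloud_model
print(f"penalty from success-rate gap alone: {success_rate_penalty:.4f}x")
```

Under this assumed model, the success-rate gap alone changes the cost per success by less than 2%, so a reduction of over 100x would have to come almost entirely from a lower per-attempt inference cost.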


Related papers