Fugu-MT Paper Translation (Abstract): REFN: A Reinforcement-Learning-From-Network Framework against 1-day/n-day Exploitations

Paper abstract: REFN: A Reinforcement-Learning-From-Network Framework against 1-day/n-day Exploitations

arxiv url: http://arxiv.org/abs/2508.10701v1
Date: Thu, 14 Aug 2025 14:45:45 GMT
Status: Translation complete
Last updated in system: 2025-08-15 22:24:48.361381
Title: REFN: A Reinforcement-Learning-From-Network Framework against 1-day/n-day Exploitations
Title (reference translation): REFN: A reinforcement-learning-from-network framework against 1-day/n-day exploitations
Authors: Tianlong Yu, Lihong Liu, Ziyi Zhou, Fudu Xing, Kailong Wang, Yang Yang,
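Abstract summary: This paper introduces REFN, a novel framework that trains Large Language Models (LLMs) to autonomously generate network filters that prevent 1-day or n-day exploitations. REFN ensures scalability by uniquely employing Reinforcement Learning (RL) driven by online network rewards instead of traditional human feedback. REFN demonstrates effectiveness (21.1 percent higher accuracy than alternatives), efficiency (Mean Time To Patch of 3.65 hours), and scalability (easily scales to 10K devices).
Reference score (attention score computed by this site): 4.675306665285266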
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The exploitation of 1-day or n-day vulnerabilities poses severe threats to networked devices due to massive deployment scales and delayed patching (average Mean Time To Patch exceeds 60 days). Existing defenses, including host-based patching and network-based filtering, are inadequate due to limited scalability across diverse devices, compatibility issues, especially with embedded or legacy systems, and an error-prone deployment process (manual patch validation). To address these issues, we introduce REFN (Reinforcement Learning From Network), a novel framework that trains Large Language Models (LLMs) to autonomously generate network filters to prevent 1-day or n-day exploitations. REFN ensures scalability by uniquely employing Reinforcement Learning (RL) driven by online network rewards instead of traditional Human Feedback (RLHF). REFN guarantees compatibility via unified deployment on edge security gateways (Amazon Eero). REFN provides robustness via online validation using real network traffic. Crucially, REFN addresses three core challenges in training LLMs for exploit prevention: 1) expanding current LLMs' limited vulnerability-fixing expertise via Agentic-RAG-based Knowledge Distillation, 2) bridging current LLMs' language-to-network gaps through an RL-From-VNF Pipeline that translates language context (vulnerability description) into network enforcement, and 3) addressing LLM hallucination and non-determinism via Online Agentic Validation that penalizes erroneous outputs. Evaluated across 22 families of 1-day or n-day exploits, REFN demonstrates effectiveness (21.1 percent higher accuracy than alternatives), efficiency (Mean Time To Patch of 3.65 hours), and scalability (easily scales to 10K devices). REFN serves as an initial step toward training LLMs to rapidly prevent massive-scale 1-day or n-day exploitations.
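Abstract (reference translation): The exploitation of 1-day or n-day vulnerabilities poses a serious threat to networked devices because of large deployment scales and delayed patching (the average Mean Time To Patch exceeds 60 days). Existing defenses, including host-based patching and network-based filtering, fall short because of limited scalability across diverse devices, compatibility problems, particularly on embedded and legacy systems, and an error-prone deployment process (manual patch validation). To address these problems, we introduce REFN (Reinforcement Learning From Network), a new framework that trains Large Language Models (LLMs) to autonomously generate network filters that prevent 1-day or n-day exploitations. REFN ensures scalability by uniquely employing Reinforcement Learning (RL) driven by online network rewards rather than traditional Human Feedback (RLHF). REFN guarantees compatibility through unified deployment on edge security gateways (Amazon Eero). REFN provides robustness through online validation using real network traffic. Importantly, REFN addresses three core challenges in training LLMs for exploit prevention: 1) expanding the limited vulnerability-fixing expertise of current LLMs through Agentic-RAG-based Knowledge Distillation; 2) bridging the language-to-network gap of current LLMs through an RL-From-VNF Pipeline that translates language context (vulnerability descriptions) into network enforcement; 3) addressing LLM hallucination and non-determinism through Online Agentic Validation that penalizes erroneous outputs. Evaluated on 22 families of 1-day or n-day exploits, REFN demonstrates effectiveness (21.1 percent higher accuracy than alternatives), efficiency (Mean Time To Patch of 3.65 hours), and scalability (easily scales to 10K devices). REFN serves as a first step toward training LLMs to rapidly prevent large-scale 1-day or n-day exploitations.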


Related papers