Fugu-MT Paper Translation (Abstract): GRID: Graph Representation of Intelligence Data for Security Text Knowledge Graph Construction

Paper abstract: GRID: Graph Representation of Intelligence Data for Security Text Knowledge Graph Construction

arxiv url: http://arxiv.org/abs/2605.16714v1
Date: Fri, 15 May 2026 23:54:01 GMT
Status: Translation complete
Last updated in system: 2026-05-19 17:57:46.920474
Title: GRID: Graph Representation of Intelligence Data for Security Text Knowledge Graph Construction
Authors: Liangyi Huang, Zichen Liu, Fei Shao, Shang Ma, Mengshi Zhang, Zihao Chen, Yanfang Ye, Xusheng Xiao
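Abstract summary: We present GRID (Graph Representation of Intelligence Data), an end-to-end framework for security text knowledge graph construction. GRID first builds security-domain supervision from CTI articles by creating traceable article-graph alignments. Task-bank rewards can be built offline and reused in later training runs.
Reference score (independently computed attention score): 27.237633809729743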
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Security knowledge graphs can provide computable external memory for security agents, but constructing them from long-form cyber threat intelligence (CTI) remains difficult: LLMs often lack grounded security-domain knowledge, and end-to-end document-to-graph training is hard to supervise with cheap, stable rewards. We present GRID (Graph Representation of Intelligence Data), an end-to-end framework for security text knowledge graph construction. GRID first builds security-domain supervision from CTI articles by creating traceable article-graph alignments through graph extraction and knowledge-graph-conditioned text revision. It then turns document-to-graph learning into a scripted task bank combining four-option multi-select questions with triple-level regex matching targets, yielding more stable task-specific rewards than repeatedly scoring full graph outputs with an LLM judge. Using this supervision pipeline, we train two Qwen3-4B-Instruct-2507-based 4B extractors: a primary Task-bank Reward model and a secondary End2End Reward model with LLM-as-judge precision/recall rewards. On 249 CTI articles from GRID, CASIE, CTINexus, MalKG, and SecureNLP, the Task-bank Reward model with the ontology-guided GRID extraction pipeline reaches 84.62% source-averaged precision, 64.91% source-averaged recall, and 68.53% Avg F1, achieving the best source-averaged recall and near-top Avg F1 with lower token usage and deployment cost. The End2End Reward model reaches 76.91% precision, 53.85% recall, and 58.06% Avg F1. Further analyses show that task-bank rewards can be built once offline and reused across later post-training runs, outperforming online End2End LLM-as-judge reward and weaker alternatives such as Choice-only Reward and End2End SFT without RL.
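The scripted task-bank reward described in the abstract (four-option multi-select questions combined with triple-level regex matching targets) can be illustrated with a minimal sketch. The abstract does not specify task formats or scoring rules, so everything here is an assumption made for illustration: the `ChoiceTask` and `TripleTask` classes, the `head | relation | tail` output format, exact-match scoring for multi-select answers, and fraction-of-patterns-matched scoring for triples. The entity names are placeholders, not data from the paper.

```python
import re
from dataclasses import dataclass


@dataclass
class ChoiceTask:
    """Hypothetical four-option multi-select question (options A-D)."""
    correct: frozenset  # set of correct option letters, e.g. {"A", "C"}


@dataclass
class TripleTask:
    """Hypothetical triple-level target: one regex per expected triple."""
    patterns: list  # list of regex strings


def choice_reward(task: ChoiceTask, answer: str) -> float:
    """Assumed scoring: 1.0 only if the selected letters match exactly."""
    # Pick up standalone option letters such as "A, C" or "a and c".
    selected = frozenset(re.findall(r"\b[A-D]\b", answer.upper()))
    return 1.0 if selected == task.correct else 0.0


def triple_reward(task: TripleTask, output: str) -> float:
    """Assumed scoring: fraction of target patterns found in any output line."""
    if not task.patterns:
        return 0.0
    lines = output.splitlines()
    hits = sum(
        any(re.search(pattern, line, flags=re.IGNORECASE) for line in lines)
        for pattern in task.patterns
    )
    return hits / len(task.patterns)


# Placeholder tasks; the names below are invented for this example.
choice_task = ChoiceTask(correct=frozenset({"A", "C"}))
triple_task = TripleTask(patterns=[
    r"ActorX\s*\|\s*uses\s*\|\s*MalwareY",
    r"MalwareY\s*\|\s*targets\s*\|\s*SystemZ",
])

model_output = "ActorX | uses | MalwareY\nMalwareY | contacts | HostQ"

print(choice_reward(choice_task, "A, C"))        # exact match on the option set
print(triple_reward(triple_task, model_output))  # one of two patterns matched
```

Both functions are deterministic and need no LLM call at scoring time, which is consistent with the abstract's claim that task-bank rewards can be built once offline and reused, though the paper's actual reward design may differ.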

Related papers