Fugu-MT Paper Translation (Summary): VulRTex: A Reasoning-Guided Approach to Identify Vulnerabilities from Rich-Text Issue Report

Paper summary: VulRTex: A Reasoning-Guided Approach to Identify Vulnerabilities from Rich-Text Issue Report

arxiv url: http://arxiv.org/abs/2509.03875v1
Date: Thu, 04 Sep 2025 04:26:45 GMT
Status: Translation complete
Last updated in system: 2025-09-05 20:21:10.048974
Title: VulRTex: A Reasoning-Guided Approach to Identify Vulnerabilities from Rich-Text Issue Report
Authors: Ziyou Jiang, Mingyang Li, Guowei Yang, Lin Shi, Qing Wang
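Abstract summary: VulRTex is a reasoning-guided approach for identifying vulnerability-related IRs using their rich-text information. VulRTex achieves the highest performance in identifying vulnerability-related IRs and predicting CWE-IDs. VulRTex has been applied to identify 30 emerging vulnerabilities across 10 representative OSS projects in 2024.
Reference score (independently computed attention score): 10.432632781646666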
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Software vulnerabilities exist in open-source software (OSS), and the developers who discover these vulnerabilities may submit issue reports (IRs) to describe their details. Security practitioners need to spend a lot of time manually identifying vulnerability-related IRs from the community, and attackers may exploit this time gap to harm the system. Researchers have previously proposed automatic approaches to help identify these vulnerability-related IRs, but these works focus on textual descriptions and lack a comprehensive analysis of the IRs' rich-text information. In this paper, we propose VulRTex, a reasoning-guided approach to identify vulnerability-related IRs with their rich-text information. In particular, VulRTex first utilizes the reasoning ability of a Large Language Model (LLM) to prepare the Vulnerability Reasoning Database from historical IRs. Then, it retrieves the relevant cases from the prepared reasoning database to generate reasoning guidance, which guides the LLM to identify vulnerabilities by reasoning over the target IRs' rich-text information. To evaluate the performance of VulRTex, we conduct experiments on 973,572 IRs, and the results show that VulRTex achieves the highest performance in identifying vulnerability-related IRs and predicting CWE-IDs when the dataset is imbalanced, outperforming the best baseline by +11.0% F1, +20.2% AUPRC, and +10.5% Macro-F1, with 2x lower time cost than baseline reasoning approaches. Furthermore, VulRTex has been applied to identify 30 emerging vulnerabilities across 10 representative OSS projects in GitHub IRs from 2024, and 11 of them have been successfully assigned CVE-IDs, which illustrates VulRTex's practicality.
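The retrieve-then-guide step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `ReasoningCase` schema, the bag-of-words cosine similarity, the prompt layout, and the sample records are all assumptions made for illustration, and the LLM call itself is left out.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class ReasoningCase:
    """One historical IR with a stored reasoning note (hypothetical schema)."""
    ir_text: str
    reasoning: str
    cwe_id: str


def tokenize(text: str) -> Counter:
    # Lowercased bag of words; a stand-in for whatever representation the paper uses.
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two token-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(database, target_ir, k=2):
    # Rank stored cases by similarity to the target IR and keep the top k.
    query = tokenize(target_ir)
    ranked = sorted(database, key=lambda c: cosine(query, tokenize(c.ir_text)), reverse=True)
    return ranked[:k]


def build_guidance(cases, target_ir):
    # Assemble retrieved reasoning notes into guidance text for an LLM prompt.
    lines = ["Relevant historical cases:"]
    for i, case in enumerate(cases, 1):
        lines.append(f"{i}. [{case.cwe_id}] {case.reasoning}")
    lines.append("Target issue report:")
    lines.append(target_ir)
    return "\n".join(lines)


# Invented sample records, for illustration only.
DATABASE = [
    ReasoningCase("Buffer overflow when parsing long header in http request",
                  "Unchecked length leads to an out-of-bounds write.", "CWE-787"),
    ReasoningCase("SQL query built from user input in login form",
                  "Unsanitized input reaches the query string.", "CWE-89"),
    ReasoningCase("Typo in README installation section",
                  "Documentation only; no security impact.", "none"),
]

TARGET = "Crash with overflow when parsing a very long header value"
top_cases = retrieve(DATABASE, TARGET, k=2)
prompt = build_guidance(top_cases, TARGET)
print(prompt)
```

In a real system the resulting guidance text would be passed to an LLM together with the IR's rich-text content (code snippets, logs, and so on); the similarity function here is only a placeholder for a stronger retriever.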
