Fugu-MT Paper Translation (Abstract): StriderSPD: Structure-Guided Joint Representation Learning for Binary Security Patch Detection

Paper overview: StriderSPD: Structure-Guided Joint Representation Learning for Binary Security Patch Detection

arxiv url: http://arxiv.org/abs/2601.05772v1
Date: Fri, 09 Jan 2026 12:55:29 GMT
Status: Translation complete
Last updated in system: 2026-01-12 17:41:49.970726
Title: StriderSPD: Structure-Guided Joint Representation Learning for Binary Security Patch Detection
Authors: Qingyuan Li, Chenchen Yu, Chuanyi Li, Xin-Cheng Wen, Cheryl Lee, Cuiyun Gao, Bin Luo
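Abstract summary: Security Patch Detection (SPD) protects software assets. Most SPD studies target open-source software (OSS), yet a large portion of real-world software is closed-source. We propose StriderSPD, a framework for binary code that integrates a graph branch into a large language model.
Reference score (independently computed attention measure): 22.120085662911194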
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Vulnerabilities severely threaten software systems, making the timely application of security patches crucial for mitigating attacks. However, software vendors often silently patch vulnerabilities with limited disclosure, where Security Patch Detection (SPD) comes to protect software assets. Recently, most SPD studies have targeted Open-Source Software (OSS), yet a large portion of real-world software is closed-source, where patches are distributed as binaries without accessible source code. The limited binary SPD approaches often lift binaries to abstraction levels, i.e., assembly code or pseudo-code. However, assembly code is register-based instructions conveying limited semantics, while pseudo-code lacks parser-compatible grammar to extract structure, both hindering accurate vulnerability-fix representation learning. In addition, previous studies often obtain training and testing data from the same project for evaluation, which fails to reflect closed-source conditions. To alleviate the above challenges, we propose StriderSPD, a Structure-guided joint representation SPD framework of binary code that integrates a graph branch into a large language model (LLM), leveraging structural information to guide the LLM in identifying security patches. Our novel design of the adapters in the graph branch effectively aligns the representations between assembly code and pseudo-code at the LLM's token level. We further present a two-stage training strategy to address the optimization imbalance caused by the large parameter disparity between StriderSPD's two branches, which enables proper branch fitting. To enable more realistic evaluation, we construct a binary SPD benchmark that is disjoint from prior datasets in both projects and domains and extensively evaluate StriderSPD on this benchmark.
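The abstract's point about evaluation (training and testing on the same project does not reflect closed-source conditions) can be illustrated with a project-disjoint split. The following is a minimal sketch, not the authors' benchmark construction: the function name `project_disjoint_split`, the record fields `project`, `patch_id`, and `is_security_patch`, and the toy data are all hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def project_disjoint_split(samples, test_fraction=0.2, seed=0):
    """Split samples so that no project contributes to both train and test.

    `samples` is a list of dicts with a "project" key (hypothetical schema).
    """
    # Group samples by the project they come from.
    by_project = defaultdict(list)
    for sample in samples:
        by_project[sample["project"]].append(sample)

    # Shuffle project names deterministically, then hold out whole projects.
    projects = sorted(by_project)
    random.Random(seed).shuffle(projects)
    n_test = max(1, round(len(projects) * test_fraction))
    test_projects = set(projects[:n_test])

    train = [s for p in projects if p not in test_projects for s in by_project[p]]
    test = [s for p in projects if p in test_projects for s in by_project[p]]
    return train, test


# Toy data (made up): 20 patches spread over 5 projects.
samples = [
    {"project": f"proj{i % 5}", "patch_id": i, "is_security_patch": i % 2 == 0}
    for i in range(20)
]
train, test = project_disjoint_split(samples)
```

Holding out whole projects, rather than individual patches, is what keeps the test set unseen at the project level. The sketch assumes at least two projects; with only one, every sample would land in the test set.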
