Fugu-MT Paper Translation (Summary): Hierarchical Embedding Fusion for Retrieval-Augmented Code Generation

Paper summary: Hierarchical Embedding Fusion for Retrieval-Augmented Code Generation

arxiv url: http://arxiv.org/abs/2603.06593v1
Date: Wed, 04 Feb 2026 14:56:11 GMT
Status: Translation complete
Last updated in system: 2026-03-15 16:38:22.402607
Title: Hierarchical Embedding Fusion for Retrieval-Augmented Code Generation
Authors: Nikita Sorokin, Ivan Sedykh, Valentin Malykh
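Abstract summary: The paper proposes Hierarchical Embedding Fusion (HEF), a two-stage approach to repository representation for code completion. HEF achieves exact-match accuracy comparable to snippet-based retrieval baselines. Compared to graph-based and iterative retrieval systems, HEF reduces median end-to-end latency by a factor of 13 to 26.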
Reference score (independently computed attention metric): 6.4453302264198165
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Retrieval-augmented code generation often conditions the decoder on large retrieved code snippets. This ties online inference cost to repository size and introduces noise from long contexts. We present Hierarchical Embedding Fusion (HEF), a two-stage approach to repository representation for code completion. First, an offline cache compresses repository chunks into a reusable hierarchy of dense vectors using a small fuser model. Second, an online interface maps a small number of retrieved vectors into learned pseudo-tokens that are consumed by the code generator. This replaces thousands of retrieved tokens with a fixed pseudo-token budget while preserving access to repository-level information. On RepoBench and RepoEval, HEF with a 1.8B-parameter pipeline achieves exact-match accuracy comparable to snippet-based retrieval baselines, while operating at sub-second median latency on a single A100 GPU. Compared to graph-based and iterative retrieval systems in our experimental setup, HEF reduces median end-to-end latency by 13 to 26 times. We also introduce a utility-weighted likelihood signal for filtering training contexts and report ablation studies on pseudo-token budget, embedding models, and robustness to harmful retrieval. Overall, these results indicate that hierarchical dense caching is an effective mechanism for low-latency, repository-aware code completion.
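The two stages described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: mean pooling stands in for the learned fuser model, a fixed matrix stands in for the learned pseudo-token mapping, and every function name and parameter (`build_hierarchy`, `fanout`, `retrieve`, `to_pseudo_tokens`) is hypothetical. It only shows the overall shape of the idea, an offline hierarchy of dense vectors plus an online step that turns `k` retrieved vectors into a fixed budget of `k` pseudo-tokens.

```python
import math
from typing import List, Sequence

Vector = List[float]


def mean_pool(vectors: Sequence[Vector]) -> Vector:
    """Average vectors element-wise (ASSUMPTION: stand-in for the learned fuser)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def build_hierarchy(chunk_vecs: Sequence[Vector], fanout: int = 2) -> List[List[Vector]]:
    """Offline stage: fuse groups of `fanout` vectors level by level up to one root."""
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    levels = [list(chunk_vecs)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append(
            [mean_pool(prev[i:i + fanout]) for i in range(0, len(prev), fanout)]
        )
    return levels


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity, defined as 0.0 when either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: Vector, levels: List[List[Vector]], k: int) -> List[Vector]:
    """Online stage, step 1: pick the k cached vectors most similar to the query."""
    candidates = [vec for level in levels for vec in level]
    return sorted(candidates, key=lambda vec: cosine(query, vec), reverse=True)[:k]


def to_pseudo_tokens(vectors: Sequence[Vector], projection: Sequence[Vector]) -> List[Vector]:
    """Online stage, step 2: project each retrieved vector into the generator's
    embedding space. `projection` is a list of rows (ASSUMPTION: learned in
    practice, fixed here), so the output has one pseudo-token per input vector.
    """
    return [[sum(w * x for w, x in zip(row, vec)) for row in projection] for vec in vectors]
```

In this sketch the generator would receive exactly `k` pseudo-token vectors regardless of how many chunks the repository contains, which mirrors the fixed pseudo-token budget the abstract describes.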
