Fugu-MT Paper Translation (Summary): Uncovering Code Insights: Leveraging GitHub Artifacts for Deeper Code Understanding

Paper summary: Uncovering Code Insights: Leveraging GitHub Artifacts for Deeper Code Understanding

arxiv url: http://arxiv.org/abs/2511.03549v1
Date: Wed, 05 Nov 2025 15:31:42 GMT
Status: Translation complete
Last updated in system: 2025-11-06 18:19:32.464129
Title: Uncovering Code Insights: Leveraging GitHub Artifacts for Deeper Code Understanding
Title (reference translation): Code Insights: Leveraging GitHub Artifacts for Deeper Code Understanding
Authors: Ziv Nevo, Orna Raz, Karen Yorav
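Abstract summary: Large language models (LLMs) have shown promise in generating code explanations. We propose a novel approach that leverages natural language artifacts from GitHub. Our system consists of three components: one that extracts and structures GitHub context, another that uses this context to generate high-level explanations of the code's purpose, and a third that validates the explanation.
Reference score (independently computed attention score): 0.1358202049520503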
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Understanding the purpose of source code is a critical task in software maintenance, onboarding, and modernization. While large language models (LLMs) have shown promise in generating code explanations, they often lack grounding in the broader software engineering context. We propose a novel approach that leverages natural language artifacts from GitHub -- such as pull request descriptions, issue descriptions and discussions, and commit messages -- to enhance LLM-based code understanding. Our system consists of three components: one that extracts and structures relevant GitHub context, another that uses this context to generate high-level explanations of the code's purpose, and a third that validates the explanation. We implemented this as a standalone tool, as well as a server within the Model Context Protocol (MCP), enabling integration with other AI-assisted development tools. Our main use case is that of enhancing a standard LLM-based code explanation with code insights that our system generates. To evaluate the quality of the explanations, we conducted a small-scale user study with developers of several open projects as well as developers of proprietary projects. Our user study indicates that when insights are generated, they are often helpful and non-trivial, and are free from hallucinations.
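Abstract (reference translation): Understanding the purpose of source code is a critical task in software maintenance, onboarding, and modernization. Large language models (LLMs) have shown promise in generating code explanations, but they often lack grounding in the broader software engineering context. We propose a novel approach that leverages natural language artifacts from GitHub (pull request descriptions, issue descriptions, discussions, commit messages, and the like) to enhance LLM-based code understanding. Our system consists of three components: one that extracts and structures GitHub context, another that uses this context to generate high-level explanations of the code's purpose, and a third that validates the explanation. We implemented this as a standalone tool and as a server within the Model Context Protocol (MCP), enabling integration with other AI-assisted development tools. Our main use case is enhancing a standard LLM-based code explanation with the code insights our system generates. To evaluate the quality of the explanations, we conducted a small-scale user study with developers of several open projects as well as developers of proprietary projects. The user study suggests that, when insights are generated, they are often helpful, non-trivial, and free of hallucinations.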
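
The three-component structure described in the abstract (extract context, generate an explanation, validate it) can be pictured with a minimal Python sketch. This is not the authors' implementation: every name here (`GitHubContext`, `extract_context`, `generate_explanation`, `validate_explanation`) is hypothetical, the LLM is assumed to be any caller-supplied function from prompt string to response string, and the validation step is a naive word-overlap check standing in for whatever validation the paper actually uses.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class GitHubContext:
    """Hypothetical container for natural language artifacts tied to some code."""
    pull_requests: List[str] = field(default_factory=list)
    issues: List[str] = field(default_factory=list)
    commits: List[str] = field(default_factory=list)

    def all_items(self) -> List[str]:
        return self.pull_requests + self.issues + self.commits

    def as_text(self) -> str:
        """Render the non-empty artifact groups as labeled bullet lists."""
        sections = [
            ("Pull requests", self.pull_requests),
            ("Issues", self.issues),
            ("Commit messages", self.commits),
        ]
        return "\n\n".join(
            label + ":\n" + "\n".join("- " + item for item in items)
            for label, items in sections
            if items
        )


def extract_context(raw: Dict[str, List[str]]) -> GitHubContext:
    """Component 1 (sketch): structure raw artifacts, dropping blank entries.

    Assumes the artifacts were already fetched into a plain dict; fetching
    them from GitHub is out of scope for this sketch.
    """
    def clean(key: str) -> List[str]:
        return [text.strip() for text in raw.get(key, []) if text.strip()]

    return GitHubContext(clean("pull_requests"), clean("issues"), clean("commits"))


def generate_explanation(code: str, ctx: GitHubContext,
                         llm: Callable[[str], str]) -> str:
    """Component 2 (sketch): ask a caller-supplied LLM for a purpose explanation."""
    prompt = (
        "Explain the purpose of this code using the project context.\n\n"
        "Context:\n" + ctx.as_text() + "\n\nCode:\n" + code
    )
    return llm(prompt).strip()


def _words(text: str) -> Set[str]:
    # Lowercase alphabetic words of four or more letters.
    return set(re.findall(r"[a-z]{4,}", text.lower()))


def validate_explanation(explanation: str, ctx: GitHubContext,
                         min_overlap: int = 2) -> bool:
    """Component 3 (naive stand-in): accept only explanations that share at
    least `min_overlap` words with the extracted context."""
    context_words = _words(" ".join(ctx.all_items()))
    return len(_words(explanation) & context_words) >= min_overlap
```

Passing the LLM in as a plain callable keeps the sketch independent of any particular model API, and an explanation that fails `validate_explanation` would simply be withheld, which matches the abstract's framing that insights are only sometimes generated.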

Related papers