Fugu-MT Paper Translation (Abstract): Towards a Neural Debugger for Python

Paper abstract: Towards a Neural Debugger for Python

arxiv url: http://arxiv.org/abs/2603.09951v1
Date: Tue, 10 Mar 2026 17:47:05 GMT
Status: Translation complete
Last updated in system: 2026-03-11 15:25:24.513219
Title: Towards a Neural Debugger for Python
Authors: Maximilian Beck, Jonas Gehring, Jannik Kossen, Gabriel Synnaeve
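Abstract summary: Training large language models on Python execution traces grounds them in code execution. This enables line-by-line execution prediction of whole Python programs. Developers rarely execute programs step by step. Instead, they use debuggers to stop execution at specific breakpoints and step through the relevant portions while inspecting or modifying program variables. Existing neural interpreter approaches lack such interactive control.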
Reference score (independently computed attention score): 25.996925295693444
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Training large language models (LLMs) on Python execution traces grounds them in code execution and enables the line-by-line execution prediction of whole Python programs, effectively turning them into neural interpreters (FAIR CodeGen Team et al., 2025). However, developers rarely execute programs step by step; instead, they use debuggers to stop execution at certain breakpoints and step through relevant portions only while inspecting or modifying program variables. Existing neural interpreter approaches lack such interactive control. To address this limitation, we introduce neural debuggers: language models that emulate traditional debuggers, supporting operations such as stepping into, over, or out of functions, as well as setting breakpoints at specific source lines. We show that neural debuggers -- obtained via fine-tuning large LLMs or pre-training smaller models from scratch -- can reliably model both forward execution (predicting future states and outputs) and inverse execution (inferring prior states or inputs) conditioned on debugger actions. Evaluated on CruxEval, our models achieve strong performance on both output and input prediction tasks, demonstrating robust conditional execution modeling. Our work takes first steps towards future agentic coding systems in which neural debuggers serve as a world model for simulated debugging environments, providing execution feedback or enabling agents to interact with real debugging tools. This capability lays the foundation for more powerful code generation, program understanding, and automated debugging.
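For readers unfamiliar with what a line-by-line execution trace or a "step over" action looks like in a traditional Python debugger, the following is a minimal sketch built on the standard `sys.settrace` hook. It is an illustration only, not the paper's trace format, training data pipeline, or model: the helper `trace_lines`, its `step_into` flag, and the toy functions `square` and `demo` are all hypothetical names introduced here.

```python
import sys


def trace_lines(func, *args, step_into=True):
    """Run func(*args), recording (function name, line offset, locals) per executed line.

    With step_into=False, lines inside called functions are skipped,
    loosely mimicking a debugger's "step over".
    """
    events = []
    target_code = func.__code__

    def tracer(frame, event, arg):
        if event == "call":
            # Returning None for a callee frame turns off line tracing inside it.
            if not step_into and frame.f_code is not target_code:
                return None
            return tracer
        if event == "line":
            offset = frame.f_lineno - frame.f_code.co_firstlineno
            events.append((frame.f_code.co_name, offset, dict(frame.f_locals)))
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)  # always restore the earlier trace hook
    return result, events


# Hypothetical toy program used only to exercise the tracer.
def square(x):
    y = x * x
    return y


def demo(n):
    total = 0
    for i in range(n):
        total += square(i)
    return total


result_into, events_into = trace_lines(demo, 3, step_into=True)
result_over, events_over = trace_lines(demo, 3, step_into=False)

# Each record is one observed state: where execution is and what the locals hold.
for name, offset, local_vars in events_over:
    print(name, offset, local_vars)
```

In this sketch, "step into" records the lines executed inside `square` as well, while "step over" records only the lines of `demo`. A neural debugger in the abstract's sense would predict such states from the source code and the requested action rather than observe them by running the program.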
