Fugu-MT paper translation (abstract): Feedback Loops and Code Perturbations in LLM-based Software Engineering: A Case Study on a C-to-Rust Translation System

Paper summary: Feedback Loops and Code Perturbations in LLM-based Software Engineering: A Case Study on a C-to-Rust Translation System

arxiv url: http://arxiv.org/abs/2512.02567v1
Date: Tue, 02 Dec 2025 09:38:20 GMT
Status: translation complete
Last updated in system: 2025-12-03 21:04:45.809442
Title: Feedback Loops and Code Perturbations in LLM-based Software Engineering: A Case Study on a C-to-Rust Translation System
Authors: Martin Weiss, Jesko Hecking-Harbusch, Jochen Quante, Matthias Woehrle
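Abstract summary: The paper studies the effect of three variables on an automated C-to-Rust translation system. The results show that, without feedback loops, LLM selection has a large effect on translation success. The authors also find that the diversity introduced by code perturbations can improve system performance.
Reference score (attention measure computed independently by this site): 1.2566563622834341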
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The advent of strong generative AI has a considerable impact on various software engineering tasks such as code repair, test generation, or language translation. While tools like GitHub Copilot are already in widespread use in interactive settings, automated approaches require a higher level of reliability before being usable in industrial practice. In this paper, we focus on three aspects that directly influence the quality of the results: a) the effect of automated feedback loops, b) the choice of Large Language Model (LLM), and c) the influence of behavior-preserving code changes. We study the effect of these three variables on an automated C-to-Rust translation system. Code translation from C to Rust is an attractive use case in industry due to Rust's safety guarantees. The translation system is based on a generate-and-check pattern, in which Rust code generated by the LLM is automatically checked for compilability and behavioral equivalence with the original C code. For negative checking results, the LLM is re-prompted in a feedback loop to repair its output. These checks also allow us to evaluate and compare the respective success rates of the translation system when varying the three variables. Our results show that without feedback loops LLM selection has a large effect on translation success. However, when the translation system uses feedback loops the differences across models diminish. We observe this for the average performance of the system as well as its robustness under code perturbations. Finally, we also identify that diversity provided by code perturbations can even result in improved system performance.
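The generate-and-check pattern with a repair feedback loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `translate_with_feedback`, `CheckResult`, `make_stub_llm`, `stub_check`, and the parameter `max_repairs` are all hypothetical, the LLM is replaced by a stub that gets the translation wrong on its first call, and the checker is a string match standing in for a real compile-and-compare step.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CheckResult:
    ok: bool
    feedback: str = ""  # diagnostics to hand back to the model on failure


def translate_with_feedback(
    c_source: str,
    generate: Callable[[str], str],
    check: Callable[[str, str], CheckResult],
    max_repairs: int = 3,
) -> Optional[str]:
    """Return Rust code that passes `check`, or None if the repair budget runs out."""
    prompt = f"Translate this C code to Rust:\n{c_source}"
    # One initial attempt plus up to `max_repairs` repair attempts.
    for _ in range(max_repairs + 1):
        rust_candidate = generate(prompt)
        result = check(c_source, rust_candidate)
        if result.ok:
            return rust_candidate
        # Negative check result: re-prompt with the failed attempt and diagnostics.
        prompt = (
            f"Translate this C code to Rust:\n{c_source}\n"
            f"Your previous attempt was:\n{rust_candidate}\n"
            f"It failed with:\n{result.feedback}\nPlease repair it."
        )
    return None


def make_stub_llm() -> Callable[[str], str]:
    """Hypothetical stand-in for an LLM: wrong on the first call, correct afterwards."""
    calls = {"n": 0}

    def generate(prompt: str) -> str:
        calls["n"] += 1
        if calls["n"] == 1:
            return "fn add(a: i32, b: i32) -> i32 { a - b }"
        return "fn add(a: i32, b: i32) -> i32 { a + b }"

    return generate


def stub_check(c_source: str, rust_code: str) -> CheckResult:
    """Hypothetical checker. A real one would compile the Rust code and
    compare its behavior against the original C code on test inputs."""
    if "a + b" in rust_code:
        return CheckResult(ok=True)
    return CheckResult(ok=False, feedback="add(2, 3): expected 5, got -1")
```

In this sketch, calling `translate_with_feedback` with `max_repairs=0` corresponds to a single-shot configuration without a feedback loop, while a positive `max_repairs` lets the stub model recover from its first wrong answer. The same success/failure signal from `check` is what would be counted to compare success rates across models or across perturbed variants of the C input.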

List of related papers