Fugu-MT Paper Translation (Abstract): Simple Baselines are Competitive with Code Evolution

Paper summary: Simple Baselines are Competitive with Code Evolution

arxiv url: http://arxiv.org/abs/2602.16805v1
Date: Wed, 18 Feb 2026 19:07:22 GMT
Status: Translation complete
Last updated in system: 2026-02-20 15:21:28.302432
Title: Simple Baselines are Competitive with Code Evolution
Title (reference translation): Simple baselines are competitive with code evolution
Authors: Yonatan Gideoni, Sebastian Risi, Yarin Gal
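Abstract summary: Many proposed code evolution pipelines show impressive performance but are often not compared to simpler baselines. The paper tests two simple baselines over three domains: finding better mathematical bounds, designing agentic scaffolds, and machine learning competitions.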
Reference score (independently computed attention score): 30.40712969455345
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Code evolution is a family of techniques that rely on large language models to search through possible computer programs by evolving or mutating existing code. Many proposed code evolution pipelines show impressive performance but are often not compared to simpler baselines. We test how well two simple baselines do over three domains: finding better mathematical bounds, designing agentic scaffolds, and machine learning competitions. We find that simple baselines match or exceed much more sophisticated methods in all three. By analyzing these results we find various shortcomings in how code evolution is both developed and used. For the mathematical bounds, a problem's search space and domain knowledge in the prompt are chiefly what dictate a search's performance ceiling and efficiency, with the code evolution pipeline being secondary. Thus, the primary challenge in finding improved bounds is designing good search spaces, which is done by domain experts, and not the search itself. When designing agentic scaffolds we find that high variance in the scaffolds coupled with small datasets leads to suboptimal scaffolds being selected, resulting in hand-designed majority vote scaffolds performing best. We propose better evaluation methods that reduce evaluation stochasticity while keeping the code evolution economically feasible. We finish with a discussion of avenues and best practices to enable more rigorous code evolution in future work.
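Abstract (reference translation): Code evolution is a family of techniques that use large language models to search the space of possible computer programs by evolving or mutating existing code. Many proposed code evolution pipelines report impressive performance, but they are often not compared against simpler baselines. We test how two simple baselines perform in three domains: finding better mathematical bounds, designing agentic scaffolds, and machine learning competitions. In all three, the simple baselines match or exceed far more sophisticated methods. Analyzing these results reveals various shortcomings in how code evolution is both developed and used. For the mathematical bounds, a problem's search space and the domain knowledge in the prompt are what chiefly determine a search's performance ceiling and efficiency, and the code evolution pipeline is secondary. The main challenge in finding improved bounds is therefore not the search itself but designing good search spaces, which is done by domain experts. When designing agentic scaffolds, we find that high variance in the scaffolds combined with small datasets leads to suboptimal scaffolds being selected, so that hand-designed majority vote scaffolds perform best. We propose better evaluation methods that reduce evaluation stochasticity while keeping code evolution economically feasible. We close with a discussion of avenues and best practices for enabling more rigorous code evolution in future work.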
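
The abstract's claim that high variance combined with small datasets leads to suboptimal scaffolds being selected can be illustrated with a small simulation. This is a minimal sketch and not the paper's method or data: the pool of 20 candidate scaffolds, their true accuracies, the dataset sizes, and all function names are hypothetical, and each scaffold is modeled simply as answering every question correctly with a fixed probability.

```python
import random


def evaluate(true_acc, n_items, rng):
    """Measured accuracy on n_items questions, each correct with prob. true_acc."""
    return sum(rng.random() < true_acc for _ in range(n_items)) / n_items


def select_best(true_accs, n_items, rng):
    """Pick the candidate with the highest *measured* accuracy."""
    scores = [evaluate(acc, n_items, rng) for acc in true_accs]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


def regret_and_bias(n_items, n_trials=300, seed=0):
    """Average regret and score inflation when selecting on a noisy evaluation.

    regret: true accuracy of the best candidate minus that of the selected one.
    bias:   measured score of the selected candidate minus its true accuracy.
    """
    rng = random.Random(seed)
    # Hypothetical pool: 20 scaffolds with true accuracies 0.500, 0.505, ..., 0.595.
    true_accs = [0.50 + 0.005 * i for i in range(20)]
    best_true = max(true_accs)
    regret = 0.0
    bias = 0.0
    for _ in range(n_trials):
        idx, measured = select_best(true_accs, n_items, rng)
        regret += best_true - true_accs[idx]
        bias += measured - true_accs[idx]
    return regret / n_trials, bias / n_trials


if __name__ == "__main__":
    for n in (30, 1000):
        r, b = regret_and_bias(n)
        print(f"dataset size {n}: mean regret {r:.3f}, mean score inflation {b:.3f}")
```

Under these assumptions, selecting on the smaller evaluation set picks a worse scaffold on average and overstates its score more, which is the kind of selection effect the abstract describes. Evaluating on more items, or re-scoring the selected scaffold on held-out data, reduces both effects in this toy setting.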

Related papers