Fugu-MT Paper Translation (Abstract): Do programming languages still matter to your AI coding agent teammate? Evidence at scale from chess engines

Paper overview: Do programming languages still matter to your AI coding agent teammate? Evidence at scale from chess engines

arxiv url: http://arxiv.org/abs/2606.13763v1
Date: Thu, 11 Jun 2026 17:34:00 GMT
Status: Translation complete
Last updated in system: 2026-06-15 16:00:42.552109
Title: Do programming languages still matter to your AI coding agent teammate? Evidence at scale from chess engines
Authors: Mathieu Acher, Jean-Marc Jézéquel
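Abstract summary: Two coding agents were asked to build chess engines without chess knowledge or implementation guidance. The paper studies both of its research questions through a polyglot case study built around chess engines, combining per-engine feature analysis, independent Elo assessment, and session trajectories with qualitative analysis of code and transcripts.
Reference score (independently calculated attention score): 3.272730424658043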
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Frontier coding agents now promise end-to-end authorship of complete software systems. Two empirical questions follow: can AI coding-agent teammates program in any target language, including ones with no comparable prior open-source artefact? If so, does language choice still shape the artefact, and along which dimensions? We study both through a polyglot case study built around chess engines: non-trivial multi-component systems that admit a hierarchy of language-agnostic oracles, from exact move-generation correctness to a strength scale (Elo), observable from Rust to Brainfuck. We prompted two frontier agents (Claude Code and Codex) at the capability level, without chess knowledge or implementation guidance, under a documented intervention and stopping policy. The agents produced 34 chess engines spanning 17 primary programming languages, from mainstream to specialised, domain-specific, legacy, and esoteric targets. We combine per-engine feature analysis, independent Elo assessment, and session trajectories with qualitative analysis of code and transcripts. Frontier coding agents are genuinely polyglot: every language we tried produced at least one feature-rich working engine, several with no prior open-source counterpart of comparable scope (e.g., LaTeX), and the code is synthesised from scratch rather than copied. Yet language choice still matters: strong playing strength is only reachable in mainstream compiled languages, cost and engineering effort grow sharply as the language becomes more exotic, and feature choices shift across language families. Agents validate their own work unprompted, but their strength self-estimates are biased and a few engines cheated by calling a chess library. Programming language is no longer about whether AI teammates can build a working system, but about performance, cost, what gets built, and how much human supervision validation still needs.
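For readers unfamiliar with the Elo strength scale mentioned in the abstract, the following is a minimal sketch of the standard logistic Elo model. It is not the authors' evaluation pipeline; the function names and the ratings used in the usage note are illustrative assumptions.

```python
import math


def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under the logistic Elo model.

    A score of 1.0 is a win, 0.5 a draw, and 0.0 a loss.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def elo_difference(score: float) -> float:
    """Rating difference (A minus B) implied by A's average match score.

    The score must lie strictly between 0 and 1; a perfect or zero score
    implies an unbounded difference under this model.
    """
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)
```

Under this model, two equally rated engines each have an expected score of 0.5, and `elo_difference(expected_score(1700, 1500))` recovers the hypothetical 200-point gap up to floating-point error.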
