Fugu-MT Paper Translation (Summary): QiMeng-NeuComBack: Self-Evolving Translation from IR to Assembly Code

Paper summary: QiMeng-NeuComBack: Self-Evolving Translation from IR to Assembly Code

arxiv url: http://arxiv.org/abs/2511.01183v1
Date: Mon, 03 Nov 2025 03:20:26 GMT
Status: Translation complete
Last updated in system: 2025-11-05 16:37:27.097684
Title: QiMeng-NeuComBack: Self-Evolving Translation from IR to Assembly Code
Title (reference translation): QiMeng-NeuComBack: Self-Evolving Translation from IR to Assembly Code
Authors: Hainan Fang, Yuanbo Wen, Jun Bi, Yihan Wang, Tonghui He, Yanlin Tang, Di Huang, Jiaming Guo, Rui Zhang, Qi Guo, Yunji Chen
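Abstract summary: Large Language Models (LLMs) offer a compelling new paradigm, Neural Compilation. This paper introduces NeuComBack, a novel benchmark dataset designed for IR-to-assembly compilation. It also proposes a self-evolving prompt optimization method that evolves the LLMs' internal prompt strategies.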
Reference score (independently computed attention measure): 52.66657751895655
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Compilers, while essential, are notoriously complex systems that demand prohibitively expensive human expertise to develop and maintain. The recent advancements in Large Language Models (LLMs) offer a compelling new paradigm: Neural Compilation, which could potentially simplify compiler development for new architectures and facilitate the discovery of innovative optimization techniques. However, several critical obstacles impede its practical adoption. Firstly, a significant lack of dedicated benchmarks and robust evaluation methodologies hinders objective assessment and tracking of progress in the field. Secondly, systematically enhancing the reliability and performance of LLM-generated assembly remains a critical challenge. Addressing these challenges, this paper introduces NeuComBack, a novel benchmark dataset specifically designed for IR-to-assembly compilation. Leveraging this dataset, we first define a foundational Neural Compilation workflow and conduct a comprehensive evaluation of the capabilities of recent frontier LLMs on Neural Compilation, establishing new performance baselines. We further propose a self-evolving prompt optimization method that enables LLMs to iteratively evolve their internal prompt strategies by extracting insights from prior self-debugging traces, thereby enhancing their neural compilation capabilities. Experiments demonstrate that our method significantly improves both the functional correctness and the performance of LLM-generated assembly code. Compared to baseline prompts, the functional correctness rates improved from 44% to 64% on x86_64 and from 36% to 58% on aarch64, respectively. More significantly, among the 16 correctly generated x86_64 programs using our method, 14 (87.5%) surpassed clang-O3 performance.
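Abstract (reference translation): Compilers are essential, but they are highly complex systems that demand very expensive human expertise to develop and maintain. Neural Compilation, a paradigm enabled by recent advances in Large Language Models (LLMs), could simplify compiler development for new architectures and ease the discovery of innovative optimization techniques. However, several critical obstacles impede its practical adoption. First, the lack of dedicated benchmarks and robust evaluation methodologies hinders objective assessment and tracking of progress in the field. Second, systematically improving the reliability and performance of LLM-generated assembly remains a critical challenge. This paper introduces NeuComBack, a novel benchmark dataset designed for IR-to-assembly compilation. Using this dataset, we first define a foundational Neural Compilation workflow and comprehensively evaluate the capabilities of recent frontier LLMs on Neural Compilation, establishing new performance baselines. We further propose a self-evolving prompt optimization method that lets LLMs iteratively evolve their internal prompt strategies by extracting insights from prior self-debugging traces. Experiments show that the method significantly improves both the functional correctness and the performance of LLM-generated assembly code. Compared with baseline prompts, functional correctness rates improved from 44% to 64% on x86_64 and from 36% to 58% on aarch64. Moreover, of the 16 correctly generated x86_64 programs using our method, 14 (87.5%) surpassed clang-O3 performance.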


Related papers