Fugu-MT Paper Translation (Abstract): Imandra CodeLogician: Neuro-Symbolic Reasoning for Precise Analysis of Software Logic

Paper overview: Imandra CodeLogician: Neuro-Symbolic Reasoning for Precise Analysis of Software Logic

arxiv url: http://arxiv.org/abs/2601.11840v1
Date: Sat, 17 Jan 2026 00:16:41 GMT
Status: Translation complete
System update date: 2026-01-21 22:47:22.339633
Title: Imandra CodeLogician: Neuro-Symbolic Reasoning for Precise Analysis of Software Logic
Title (reference translation): Imandra CodeLogician: Neuro-symbolic reasoning for precise analysis of software logic
Authors: Hongyu Lin, Samer Abdallah, Makar Valentinov, Paul Brennan, Elijah Kagan, Christoph M. Wintersteiger, Denis Ignatovich, Grant Passmore
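Abstract summary: Large language models (LLMs) show strong performance on code understanding tasks, but they lack the ability to perform precise, exhaustive mathematical reasoning about program behavior. This paper presents CodeLogician, a neurosymbolic agent for precise analysis of software logic, integrated with ImandraX.
Reference score (independently computed attention score): 23.59512682324697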
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large Language Models (LLMs) have shown strong performance on code understanding tasks, yet they fundamentally lack the ability to perform precise, exhaustive mathematical reasoning about program behavior. Existing benchmarks either focus on mathematical proof automation, largely disconnected from real-world software, or on engineering tasks that do not require semantic rigor. We present CodeLogician, a neurosymbolic agent for precise analysis of software logic, integrated with ImandraX, an industrial automated reasoning engine deployed in financial markets and safety-critical systems. Unlike prior approaches that use formal methods primarily to validate LLM outputs, CodeLogician uses LLMs to construct explicit formal models of software systems, enabling automated reasoning to answer rich semantic questions beyond binary verification outcomes. To rigorously evaluate mathematical reasoning about software logic, we introduce code-logic-bench, a benchmark targeting the middle ground between theorem proving and software engineering benchmarks. It measures reasoning correctness about program state spaces, control flow, coverage constraints, and edge cases, with ground truth defined via formal modeling and region decomposition. Comparing LLM-only reasoning against LLMs augmented with CodeLogician, formal augmentation yields substantial improvements, closing a 41-47 percentage point gap in reasoning accuracy. These results demonstrate that neurosymbolic integration is essential for scaling program analysis toward rigorous, autonomous software understanding.
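Abstract (reference translation): Large language models (LLMs) have shown strong performance on code understanding tasks, but they fundamentally lack the ability to perform precise, exhaustive mathematical reasoning about program behavior. Existing benchmarks focus either on mathematical proof automation, which is largely disconnected from real-world software, or on engineering tasks that do not require semantic rigor. We present CodeLogician, a neurosymbolic agent for precise analysis of software logic, integrated with ImandraX, an industrial automated reasoning engine deployed in financial markets and safety-critical systems. Unlike prior approaches that use formal methods mainly to validate LLM outputs, CodeLogician uses LLMs to build explicit formal models of software systems, so that automated reasoning can answer rich semantic questions beyond binary verification outcomes. To rigorously evaluate mathematical reasoning about software logic, we introduce code-logic-bench, a benchmark targeting the middle ground between theorem-proving and software-engineering benchmarks. It measures reasoning correctness about program state spaces, control flow, coverage constraints, and edge cases, with ground truth defined via formal modeling and region decomposition. Comparing LLM-only reasoning with LLMs augmented with CodeLogician, formal augmentation yields substantial improvements, closing a gap of 41-47 percentage points in reasoning accuracy. These results demonstrate that neurosymbolic integration is essential for scaling program analysis toward rigorous, autonomous software understanding.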


Related papers