Fugu-MT Paper Translation (Abstract): Advancing Multimodal Agent Reasoning with Long-Term Neuro-Symbolic Memory

Paper summary: Advancing Multimodal Agent Reasoning with Long-Term Neuro-Symbolic Memory

arxiv url: http://arxiv.org/abs/2603.15280v1
Date: Mon, 16 Mar 2026 13:43:22 GMT
Status: Translation complete
Last updated in system: 2026-03-17 18:28:58.393336
Title: Advancing Multimodal Agent Reasoning with Long-Term Neuro-Symbolic Memory
Title (reference translation): Multimodal Agent Reasoning Using Long-Term Memory
Authors: Rongjie Jiang, Jianwei Wang, Gengda Zhao, Chengyang Luo, Kai Wang, Wenjie Zhang
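Abstract summary: NS-Mem is a long-term memory framework aimed at advancing multimodal agent reasoning. In experiments on real-world multimodal reasoning benchmarks, Neural-Symbolic Memory improves overall reasoning accuracy by 4.35% on average.
Reference score (independently computed attention score): 7.934989469945716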
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent advances in large language models have driven the emergence of intelligent agents operating in open-world, multimodal environments. To support long-term reasoning, such agents are typically equipped with external memory systems. However, most existing multimodal agent memories rely primarily on neural representations and vector-based retrieval, which are well-suited for inductive, intuitive reasoning but fundamentally limited in supporting the analytical, deductive reasoning critical for real-world decision making. To address this limitation, we propose NS-Mem, a long-term neuro-symbolic memory framework designed to advance multimodal agent reasoning by integrating neural memory with explicit symbolic structures and rules. Specifically, NS-Mem is organized around three core components of a memory system: (1) a three-layer memory architecture that consists of an episodic layer, a semantic layer, and a logic rule layer, (2) a memory construction and maintenance mechanism implemented by SK-Gen that automatically consolidates structured knowledge from accumulated multimodal experiences and incrementally updates both neural representations and symbolic rules, and (3) a hybrid memory retrieval mechanism that combines similarity-based search with deterministic symbolic query functions to support structured reasoning. Experiments on real-world multimodal reasoning benchmarks demonstrate that Neural-Symbolic Memory achieves an average 4.35% improvement in overall reasoning accuracy over pure neural memory systems, with gains of up to 12.5% on constrained reasoning queries, validating the effectiveness of NS-Mem.
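To make the three-layer architecture and hybrid retrieval described in the abstract more concrete, here is a minimal Python sketch. The abstract gives no implementation details, so everything in it is an assumption made for illustration: the names `Episode`, `Fact`, `Rule`, and `ToyMemory`, the two-dimensional toy embeddings, and the one-step rule format are hypothetical and are not NS-Mem's or SK-Gen's actual API.

```python
from dataclasses import dataclass, field
from math import sqrt


@dataclass
class Episode:
    """Episodic-layer entry (hypothetical): raw observation plus a toy embedding."""
    text: str
    embedding: list


@dataclass
class Fact:
    """Semantic-layer entry (hypothetical): a (subject, relation, object) triple."""
    subject: str
    relation: str
    obj: str


@dataclass
class Rule:
    """Logic-rule-layer entry (hypothetical): (x, premise, y) implies (x, conclusion, y)."""
    premise: str
    conclusion: str


def cosine(a, b):
    """Cosine similarity between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class ToyMemory:
    """Illustrative three-layer store; not the paper's implementation."""
    episodes: list = field(default_factory=list)
    facts: list = field(default_factory=list)
    rules: list = field(default_factory=list)

    def similar_episodes(self, query_embedding, k=1):
        """Neural-style retrieval: rank episodes by cosine similarity to the query."""
        ranked = sorted(
            self.episodes,
            key=lambda e: cosine(e.embedding, query_embedding),
            reverse=True,
        )
        return ranked[:k]

    def symbolic_query(self, subject, relation):
        """Deterministic retrieval: exact fact match plus one step of rule application."""
        # A relation is satisfied directly, or by any premise that implies it.
        relations = {relation} | {
            r.premise for r in self.rules if r.conclusion == relation
        }
        return sorted(
            {f.obj for f in self.facts
             if f.subject == subject and f.relation in relations}
        )


memory = ToyMemory()
memory.episodes.append(Episode("Alice put the keys in the drawer", [1.0, 0.0]))
memory.episodes.append(Episode("Bob watered the plants", [0.0, 1.0]))
memory.facts.append(Fact("keys", "inside", "drawer"))
memory.rules.append(Rule("inside", "located_at"))

nearest = memory.similar_episodes([0.9, 0.1], k=1)
location = memory.symbolic_query("keys", "located_at")
```

In this toy setup, the similarity search returns the episode whose embedding is closest to the query, while the symbolic query answers `located_at` deterministically by applying the stored rule to the `inside` fact. A real system would presumably use learned multimodal embeddings and a richer rule language.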

Related papers