Fugu-MT Paper Translation (Summary): Springdrift: An Auditable Persistent Runtime for LLM Agents with Case-Based Memory, Normative Safety, and Ambient Self-Perception

Paper summary: Springdrift: An Auditable Persistent Runtime for LLM Agents with Case-Based Memory, Normative Safety, and Ambient Self-Perception

arxiv url: http://arxiv.org/abs/2604.04660v1
Date: Mon, 06 Apr 2026 13:14:37 GMT
Status: Translation complete
Last updated in system: 2026-04-07 15:49:19.198069
Title: Springdrift: An Auditable Persistent Runtime for LLM Agents with Case-Based Memory, Normative Safety, and Ambient Self-Perception
Title (reference translation): Springdrift: An auditable persistent runtime for LLM agents with case-based memory, normative safety, and ambient self-perception
Authors: Seamus Brady
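Abstract summary: We present Springdrift, a persistent runtime for long-lived LLM agents. We introduce the term Artificial Retainer for this category. This is a technical report on a systems design and deployment case study, not a benchmark-driven evaluation.
Reference score (independently computed attention score): 0.20305676256390928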
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We present Springdrift, a persistent runtime for long-lived LLM agents. The system integrates an auditable execution substrate (append-only memory, supervised processes, git-backed recovery), a case-based reasoning memory layer with hybrid retrieval (evaluated against a dense cosine baseline), a deterministic normative calculus for safety gating with auditable axiom trails, and continuous ambient self-perception via a structured self-state representation (the sensorium) injected each cycle without tool calls. These properties support behaviours difficult to achieve in session-bounded systems: cross-session task continuity, cross-channel context maintenance, end-to-end forensic reconstruction of decisions, and self-diagnostic behaviour. We report on a single-instance deployment over 23 days (19 operating days), during which the agent diagnosed its own infrastructure bugs, classified failure modes, identified an architectural vulnerability, and maintained context across email and web channels -- without explicit instruction. We introduce the term Artificial Retainer for this category: a non-human system with persistent memory, defined authority, domain-specific autonomy, and forensic accountability in an ongoing relationship with a specific principal -- distinguished from software assistants and autonomous agents, drawing on professional retainer relationships and the bounded autonomy of trained working animals. This is a technical report on a systems design and deployment case study, not a benchmark-driven evaluation. Evidence is from a single instance with a single operator, presented as illustration of what these architectural properties can support in practice. Implemented in approximately Gleam on Erlang/OTP. Code, artefacts, and redacted operational logs will be available at https://github.com/seamus-brady/springdrift upon publication.
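Abstract (reference translation): We present Springdrift, a persistent runtime for long-lived LLM agents. The system integrates an auditable execution substrate (append-only memory, supervised processes, git-backed recovery), a case-based reasoning memory layer with hybrid retrieval (evaluated against a dense cosine baseline), a deterministic normative calculus for safety gating with auditable axiom trails, and continuous ambient self-perception via a structured self-state representation (the sensorium) injected each cycle without tool calls. These properties support behaviours that are difficult to achieve in session-bounded systems: cross-session task continuity, cross-channel context maintenance, end-to-end forensic reconstruction of decisions, and self-diagnostic behaviour. We report on a single-instance deployment over 23 days (19 operating days) in which the agent diagnosed its own infrastructure bugs, classified failure modes, identified an architectural vulnerability, and maintained context across email and web channels without explicit instruction. We introduce the term Artificial Retainer for this category: a non-human system with persistent memory, defined authority, domain-specific autonomy, and forensic accountability in an ongoing relationship with a specific principal. This is a technical report on a systems design and deployment case study, not a benchmark-driven evaluation. The evidence comes from a single instance with a single operator and is presented as an illustration of what these architectural properties can support in practice. Implemented in Gleam on Erlang/OTP. Code, artefacts, and redacted operational logs will be available at https://github.com/seamus-brady/springdrift upon publication.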
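
The abstract mentions a case-based memory layer with hybrid retrieval that is evaluated against a dense cosine baseline, but it does not say how the hybrid score is built. The sketch below is only an illustration of that comparison, assuming "hybrid" means a weighted blend of dense cosine similarity and keyword overlap (Jaccard). The names `Case`, `retrieve`, and `alpha` are hypothetical and are not taken from the Springdrift code.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Case:
    """A stored case (hypothetical shape): text plus a precomputed embedding."""
    case_id: str
    text: str
    embedding: Sequence[float]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Dense similarity: cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def jaccard(a: str, b: str) -> float:
    """Sparse similarity: overlap of lowercase whitespace-separated tokens."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def retrieve(query_text: str, query_embedding: Sequence[float],
             cases: Sequence[Case], alpha: float = 0.5, k: int = 3) -> List[str]:
    """Return the ids of the top-k cases.

    alpha = 1.0 reduces to the dense cosine baseline;
    alpha < 1.0 mixes in keyword overlap (the assumed "hybrid" score).
    """
    scored = [
        (alpha * cosine(query_embedding, c.embedding)
         + (1.0 - alpha) * jaccard(query_text, c.text), c)
        for c in cases
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c.case_id for _, c in scored[:k]]
```

Calling `retrieve` twice on the same query, once with `alpha=1.0` and once with a smaller value, gives a direct side-by-side view of how keyword overlap changes the ranking relative to the dense baseline.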

Related papers

Dynamic analysis enhances issue resolution [53.50448142467294]
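DAIRA (Dynamic Analysis-enhanced Issue Resolution Agent) is an automated repair framework that integrates dynamic analysis into the agent's reasoning cycle. Guided by a test-trace-driven methodology, DAIRA uses lightweight monitors to extract key runtime data. With Gemini 3 Flash Preview, DAIRA sets a new state of the art (SOTA), achieving a 79.4% resolution rate on the SWE-bench Verified dataset.
Reference translation of paper (metadata) (2026-03-23T14:48:54Z)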
Reasoning Provenance for Autonomous AI Agents: Structured Behavioral Analytics Beyond State Checkpoints and Execution Traces [0.0]
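The Agent Execution Record (AER) is a structured reasoning primitive that captures intent, observation, and inference as first-class queryable fields at every step. The paper shows how AER enables population-level behavioral analytics: reasoning pattern mining, confidence calibration, cross-agent comparison, and counterfactual regression testing via mock replay.
Reference translation of paper (metadata) (2026-03-23T08:27:54Z)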
A Trace-Based Assurance Framework for Agentic AI Orchestration: Contracts, Testing, and Governance [0.22940141855172028]
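This paper proposes an assurance framework for agentic AI systems built on Large Language Models (LLMs). Execution is implemented as Message-Action Traces (MAT) with explicit step and trace contracts. The framework includes stress testing formulated as a budgeted counterexample search over bounded perturbations.
Reference translation of paper (metadata) (2026-03-18T10:23:48Z)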
ReqToCode: Embedding Requirements Traceability as a Structural Property of the Codebase [0.0]
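This paper introduces ReqToCode, a method that prevents trace degradation by embedding traceable system elements directly in the system. It describes the approach, architectural principles, and traceable lifecycle, and illustrates them with a general example covering requirements definition, artifact generation, code integration, and build-time validation.
Reference translation of paper (metadata) (2026-03-14T16:00:09Z)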
Automated Self-Testing as a Quality Gate: Evidence-Driven Release Management for LLM Applications [51.56484100374058]
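We propose an automated self-testing framework that introduces quality gates with evidence-based release decisions. We evaluate the framework through a longitudinal case study of an internally deployed multi-agent conversational AI system.
Reference translation of paper (metadata) (2026-03-13T20:44:15Z)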
Understanding by Reconstruction: Reversing the Software Development Process for LLM Pretraining [66.89012795621349]
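Large language models (LLMs) often struggle with the deep, long-horizon reasoning that complex software engineering requires. This paper proposes a new paradigm, understanding by reconstruction, together with a framework that synthesizes latent agent trajectories using multi-agent simulation.
Reference translation of paper (metadata) (2026-03-11T09:23:20Z)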
AutoAgent: Evolving Cognition and Elastic Memory Orchestration for Adaptive Agents [26.380991138110925]
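AutoAgent is a self-evolving multi-agent framework built on evolving cognition, on-the-fly contextual decision-making, and elastic memory orchestration. Each agent maintains structured prompt-level cognition over tools, its own capabilities, peer expertise, and task knowledge. AutoAgent consistently improves task success, tool-use efficiency, and collaborative robustness over static and memory-augmented baselines.
Reference translation of paper (metadata) (2026-03-10T14:23:49Z)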
The Why Behind the Action: Unveiling Internal Drivers via Agentic Attribution [63.61358761489141]
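Large Language Model (LLM)-based agents are widely used in real-world applications such as customer service, web navigation, and software engineering. This paper proposes a new framework for general agentic attribution that identifies the internal factors driving an agent's behavior regardless of task outcome. The framework is validated across diverse agentic scenarios, including standard tool use and subtle reliability risks such as memory-induced bias.
Reference translation of paper (metadata) (2026-01-21T15:22:21Z)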
A Benchmark for Procedural Memory Retrieval in Language Agents [0.023227405857540805]
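Current AI agents excel in familiar settings but fail sharply when they face new tasks with unseen Proc. This paper presents the first benchmark that isolates procedural memory retrieval from task execution. Embedding-based methods perform strongly in familiar contexts but degrade markedly in novel ones.
Reference translation of paper (metadata) (2025-11-21T08:08:53Z)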

The related papers list is generated automatically from the titles and abstracts of papers on this site.

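This page shows information on the specified paper.
The operator of this site does not guarantee the quality of the site (including all information and translations) and accepts no responsibility for any consequences arising from its use.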