Fugu-MT Paper Translation (Abstract): OpenJarvis: Personal AI, On Personal Devices

Paper Summary: OpenJarvis: Personal AI, On Personal Devices

arxiv url: http://arxiv.org/abs/2605.17172v1
Date: Sat, 16 May 2026 22:00:10 GMT
Status: Translation complete
Last updated in system: 2026-05-19 17:57:47.719796
Title: OpenJarvis: Personal AI, On Personal Devices
Title (reference translation): OpenJarvis: Personal AI on Personal Devices
Authors: Jon Saad-Falcon, Avanika Narayan, Robby Manihani, Tanvir Bhathal, Herumb Shandilya, Hakki Orhun Akengin, Gabriel Bo, Andrew Park, Matthew Hart, Caia Costello, Chuan Li, Christopher Ré, Azalia Mirhoseini
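Abstract summary: OpenJarvis is an architecture that represents a personal AI system as a typed spec over five primitives. Each primitive is an independently editable field, so the stack can be optimized end to end and measured against accuracy, cost, and latency.
Reference score (attention score computed independently by the site): 35.387857183518484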
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Personal AI stacks, like OpenClaw and Hermes Agent, are becoming central to daily work, yet they route nearly every query (often over sensitive local data) to cloud-hosted frontier models. Replacing frontier models with local models inside existing stacks does not work: swapping Claude Opus 4.6 for Qwen3.5-9B drops accuracy by 25-39 pp across personal AI tasks like PinchBench and GAIA. Existing stacks bundle agentic prompts, tool descriptions, memory configuration, and runtime settings around a specific cloud model. Only the prompts can be tuned, and state-of-the-art prompt optimizers close just 5 pp of the local-cloud gap on their own. This motivates a decomposed personal AI stack: one that exposes individual primitives which can be optimized individually or jointly to close the local-cloud gap. We present OpenJarvis, an architecture that represents a personal AI system as a typed spec over five primitives: Intelligence, Engine, Agents, Tools & Memory, and Learning. Each primitive is an independently editable field, making the stack end-to-end optimizable and measurable against accuracy, cost, and latency. Towards closing the local-cloud gap without surrendering local-model properties, OpenJarvis introduces LLM-guided spec search, a local-cloud collaboration in which frontier cloud models propose edits across the spec at search time, only non-regressing edits are accepted, and the resulting spec runs entirely on-device at inference time. With LLM-guided spec search, on-device specs match or exceed cloud accuracy on 4 of 8 benchmarks and land within 3.2 pp of the best cloud baseline on average. They also reduce marginal API cost by ~800x and end-to-end latency by 4x.
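Abstract (reference translation): Personal AI stacks such as OpenClaw and Hermes Agent are becoming central to daily work, yet they route nearly every query (often over sensitive local data) to cloud-hosted frontier models. Simply replacing the frontier model with a local model inside an existing stack does not work: swapping Claude Opus 4.6 for Qwen3.5-9B lowers accuracy by 25-39 pp on personal AI tasks such as PinchBench and GAIA. Existing stacks bundle agentic prompts, tool descriptions, memory configuration, and runtime settings around a specific cloud model. Only the prompts can be tuned, and state-of-the-art prompt optimizers close just 5 pp of the local-cloud gap on their own. This motivates a decomposed personal AI stack that exposes individual primitives, which can be optimized individually or jointly to close the local-cloud gap. OpenJarvis is an architecture that represents a personal AI system as a typed spec over five primitives: Intelligence, Engine, Agents, Tools & Memory, and Learning. Each primitive is an independently editable field, so the stack can be optimized end to end and measured against accuracy, cost, and latency. To close the local-cloud gap without giving up the properties of local models, OpenJarvis introduces LLM-guided spec search, a local-cloud collaboration in which frontier cloud models propose edits across the spec at search time, only non-regressing edits are accepted, and the resulting spec runs entirely on-device at inference time. With LLM-guided spec search, on-device specs match or exceed cloud accuracy on 4 of 8 benchmarks and land within 3.2 pp of the best cloud baseline on average. They also reduce marginal API cost by about 800x and end-to-end latency by 4x.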
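The abstract describes the spec as a typed record with one field per primitive, and the search as a loop that keeps only non-regressing edits. The following is a minimal sketch of that idea, not the authors' implementation. The `Spec` field values, `toy_propose`, `toy_evaluate`, and the single scalar score are all hypothetical stand-ins: in the paper the proposer is a frontier cloud model and the evaluation covers accuracy, cost, and latency.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Tuple

@dataclass(frozen=True)
class Spec:
    """Hypothetical typed spec: one field per primitive named in the abstract."""
    intelligence: str   # which local model to run
    engine: str         # inference runtime settings
    agents: str         # agent prompt / control loop
    tools_memory: str   # tool descriptions and memory configuration
    learning: str       # learning / adaptation setting

Edit = Tuple[str, str]  # (field name, proposed new value)

def spec_search(
    spec: Spec,
    propose: Callable[[Spec], List[Edit]],
    evaluate: Callable[[Spec], int],
    rounds: int = 1,
) -> Tuple[Spec, int]:
    """Greedy search that accepts an edit only if the score does not regress."""
    best_score = evaluate(spec)
    for _ in range(rounds):
        for field_name, value in propose(spec):
            candidate = replace(spec, **{field_name: value})
            score = evaluate(candidate)
            if score >= best_score:  # non-regressing edits only
                spec, best_score = candidate, score
    return spec, best_score

def toy_propose(spec: Spec) -> List[Edit]:
    # Stand-in for a cloud model proposing edits across the spec (assumption).
    return [
        ("agents", "short-react-prompt"),
        ("engine", "slow-runtime"),
        ("tools_memory", "compact-tool-docs"),
    ]

def toy_evaluate(spec: Spec) -> int:
    # Made-up integer score for illustration; not a real benchmark.
    score = 50
    if spec.agents == "short-react-prompt":
        score += 10
    if spec.engine == "slow-runtime":
        score -= 20
    if spec.tools_memory == "compact-tool-docs":
        score += 5
    return score

start = Spec(
    intelligence="local-9b-model",
    engine="default-runtime",
    agents="default-prompt",
    tools_memory="default-tool-docs",
    learning="none",
)
final_spec, final_score = spec_search(start, toy_propose, toy_evaluate)
print(final_spec, final_score)
```

In this toy run the `engine` edit is rejected because it lowers the score, while the `agents` and `tools_memory` edits are kept. Because each primitive is a separate field, an edit touches one field at a time and can be evaluated in isolation.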
