Fugu-MT Paper Translation (Abstract): BioProVLA-Agent: An Affordable, Protocol-Driven, Vision-Enhanced VLA-Enabled Embodied Multi-Agent System with Closed-Loop-Capable Reasoning for Biological Laboratory Manipulation

Paper overview: BioProVLA-Agent: An Affordable, Protocol-Driven, Vision-Enhanced VLA-Enabled Embodied Multi-Agent System with Closed-Loop-Capable Reasoning for Biological Laboratory Manipulation

arxiv url: http://arxiv.org/abs/2605.07306v1
Date: Fri, 08 May 2026 06:15:40 GMT
Status: Translation complete
Last updated in system: 2026-05-11 19:43:38.849263
Title: BioProVLA-Agent: An Affordable, Protocol-Driven, Vision-Enhanced VLA-Enabled Embodied Multi-Agent System with Closed-Loop-Capable Reasoning for Biological Laboratory Manipulation
Title (reference translation): BioProVLA-Agent: An Enhanced VLA-Embodied Multi-Agent System with Closed-Loop-Capable Reasoning for Biological Laboratory Manipulation
Authors: Zhaohui Du, Zhe Wang, Hongmei Fei, Xiwen Cao, Ting Xiao, Qi Wang, Huanbo Jin, Jiaming Gu, Quan Lu, Zhe Liu
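Abstract summary: BioProVLA-Agent is a protocol-driven, vision-enhanced multi-agent system for biological manipulation. It integrates protocol parsing, visual state verification, and embodied execution in a closed-loop workflow. The system is evaluated on a hierarchical benchmark covering 15 atomic tasks, 6 composite workflows, and 3 bimanual tasks.
Reference score (attention score computed by this site): 15.461936066333633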
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Biological laboratory automation can reduce repetitive manual work and improve reproducibility, but reliable embodied execution in wet-lab environments remains challenging. Protocols are often unstructured, labware is frequently transparent or reflective, and multi-step procedures require state-aware execution beyond one-shot instruction following. Existing robotic systems often rely on costly hardware, fixed workflows, dedicated instruments, or robotics-oriented interfaces. Here, we introduce BioProVLA-Agent, an affordable, protocol-driven, vision-enhanced embodied multi-agent system enabled by Vision-Language-Action (VLA) models for biological manipulation. The system uses protocols as the task interface and integrates protocol parsing, visual state verification, and embodied execution in a closed-loop workflow. A Tailored LLM Protocol Agent converts protocols into verifiable subtasks; a VLM-RAG Verification Agent assesses readiness and completion using observations, robot states, retrieved knowledge, and success/failure examples; and a VLA Embodied Agent executes verified subtasks through a lightweight policy. To improve robustness under wet-lab visual perturbations, we develop AugSmolVLA, an online augmentation strategy targeting transparent labware, reflections, illumination shifts, and overexposure. We evaluate the system on a hierarchical benchmark covering 15 atomic tasks, 6 composite workflows, and 3 bimanual tasks, including tube loading, sorting, waste disposal, cap twisting, and liquid pouring. Across normal and high-exposure settings, AugSmolVLA improves execution stability over ACT, X-VLA, and the original SmolVLA, especially for precise placement, transparent-object manipulation, composite workflows, and visually degraded scenes. These results suggest a practical route toward accessible, protocol-centered, and verification-capable embodied AI for biological manipulation.
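The closed-loop workflow described in the abstract (check readiness, execute, check completion) can be sketched in a few lines. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the `Subtask` fields, the `run_protocol` function, the retry limit, and the toy `verify`/`execute` callbacks are all hypothetical stand-ins for the Protocol Agent output, the VLM-RAG Verification Agent, and the VLA Embodied Agent.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Subtask:
    """Hypothetical verifiable subtask, as a protocol parser might emit it."""
    name: str
    precondition: str   # must hold before execution (readiness)
    postcondition: str  # must hold after execution (completion)


def run_protocol(
    subtasks: List[Subtask],
    verify: Callable[[str], bool],       # stand-in for the verification agent
    execute: Callable[[Subtask], None],  # stand-in for the embodied agent
    max_retries: int = 2,                # assumed retry budget
) -> List[Tuple[str, str]]:
    """Run subtasks in order, stopping at the first one that cannot be completed."""
    log: List[Tuple[str, str]] = []
    for task in subtasks:
        # Readiness check: do not act if the scene is not in the expected state.
        if not verify(task.precondition):
            log.append((task.name, "not_ready"))
            break
        completed = False
        for _ in range(max_retries + 1):
            execute(task)
            # Completion check closes the loop after every attempt.
            if verify(task.postcondition):
                completed = True
                break
        log.append((task.name, "done" if completed else "failed"))
        if not completed:
            break
    return log


def demo() -> Tuple[List[Tuple[str, str]], Dict[str, int]]:
    """Toy world: a set of true facts; the first cap-twisting attempt fails."""
    state = {"tube_in_rack"}
    calls: Dict[str, int] = {}

    def verify(condition: str) -> bool:
        return condition in state

    def execute(task: Subtask) -> None:
        calls[task.name] = calls.get(task.name, 0) + 1
        if task.name == "twist_cap" and calls[task.name] == 1:
            return  # simulated failed attempt, state unchanged
        state.add(task.postcondition)

    subtasks = [
        Subtask("pick_tube", "tube_in_rack", "tube_in_gripper"),
        Subtask("twist_cap", "tube_in_gripper", "cap_removed"),
    ]
    return run_protocol(subtasks, verify, execute), calls
```

In this toy run, the failed first attempt at `twist_cap` is caught by the completion check and retried, which is the behavior that one-shot instruction following lacks.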
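Abstract (reference translation): Biological laboratory automation helps reduce repetitive manual work and improve reproducibility, but reliable embodied execution in wet-lab environments remains difficult. Protocols are often unstructured, labware is often transparent or reflective, and multi-step procedures require state-aware execution that goes beyond one-shot instruction following. Existing robotic systems often depend on expensive hardware, fixed workflows, dedicated instruments, or robotics-oriented interfaces. This paper introduces BioProVLA-Agent, an affordable, protocol-driven, vision-enhanced embodied multi-agent system for biological manipulation, enabled by Vision-Language-Action (VLA) models. The system uses protocols as the task interface and integrates protocol parsing, visual state verification, and embodied execution in a closed-loop workflow. A Tailored LLM Protocol Agent converts protocols into verifiable subtasks; a VLM-RAG Verification Agent assesses readiness and completion using observations, robot states, retrieved knowledge, and success/failure examples; and a VLA Embodied Agent executes verified subtasks through a lightweight policy. To improve robustness under wet-lab visual perturbations, the authors develop AugSmolVLA, an online augmentation strategy targeting transparent labware, reflections, illumination shifts, and overexposure. The system is evaluated on a hierarchical benchmark covering 15 atomic tasks, 6 composite workflows, and 3 bimanual tasks, including tube loading, sorting, waste disposal, cap twisting, and liquid pouring. Across normal and high-exposure settings, AugSmolVLA improves execution stability over ACT, X-VLA, and the original SmolVLA, especially for precise placement, transparent-object manipulation, composite workflows, and visually degraded scenes. These results suggest a practical route toward accessible, protocol-centered, and verification-capable embodied AI for biological manipulation.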
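The abstract names the perturbations that the online augmentation targets (reflections, illumination shifts, overexposure) but not the transforms used. The sketch below shows what "online" augmentation of that kind could look like in general: each time a training image is drawn, random photometric perturbations are applied on the fly. Every function, probability, and parameter range here is an assumption chosen for illustration, images are simplified to grayscale nested lists, and transparent-labware effects are not modeled; none of this is taken from AugSmolVLA itself.

```python
import random
from typing import List

Image = List[List[float]]  # grayscale image, pixel values in [0, 1]


def clip01(value: float) -> float:
    """Clamp a pixel value to the valid range."""
    return max(0.0, min(1.0, value))


def overexpose(img: Image, gain: float) -> Image:
    """Simulate overexposure: scale brightness, saturating at 1.0."""
    return [[clip01(p * gain) for p in row] for row in img]


def shift_illumination(img: Image, offset: float) -> Image:
    """Simulate a global illumination shift by adding a constant offset."""
    return [[clip01(p + offset) for p in row] for row in img]


def add_glare(img: Image, cy: int, cx: int, radius: int, strength: float) -> Image:
    """Simulate a reflection: a bright spot that fades with squared distance."""
    out: Image = []
    for y, row in enumerate(img):
        new_row = []
        for x, p in enumerate(row):
            dist2 = (y - cy) ** 2 + (x - cx) ** 2
            boost = strength * max(0.0, 1.0 - dist2 / (radius ** 2))
            new_row.append(clip01(p + boost))
        out.append(new_row)
    return out


def augment(img: Image, rng: random.Random) -> Image:
    """Apply each perturbation with probability 0.5 (assumed), in a fixed order."""
    height, width = len(img), len(img[0])
    if rng.random() < 0.5:
        img = shift_illumination(img, rng.uniform(-0.2, 0.2))
    if rng.random() < 0.5:
        img = add_glare(
            img,
            cy=rng.randrange(height),
            cx=rng.randrange(width),
            radius=max(1, min(height, width) // 2),
            strength=rng.uniform(0.3, 0.8),
        )
    if rng.random() < 0.5:
        img = overexpose(img, rng.uniform(1.2, 2.5))
    return img
```

Calling `augment(frame, rng)` inside the data-loading step, rather than precomputing augmented copies, means the policy sees a different perturbation of the same frame on every pass.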

