Fugu-MT Paper Translation (Abstract): Interpretable Robot Control via Structured Behavior Trees and Large Language Models

arxiv url: http://arxiv.org/abs/2508.09621v1
Date: Wed, 13 Aug 2025 08:53:13 GMT
Status: Translation complete
Last updated in system: 2025-08-14 20:42:00.819503
Title: Interpretable Robot Control via Structured Behavior Trees and Large Language Models
Title (reference translation): Interpretable Robot Control via Structured Behavior Trees and Large Language Models
Authors: Ingrid Maéva Chekam, Ines Pastor-Martinez, Ali Tourani, Jose Andres Millan-Romera, Laura Ribeiro, Pedro Miguel Bastos Soares, Holger Voos, Jose Luis Sanchez-Lopez
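Abstract summary: This paper presents a novel framework that bridges natural language understanding and robotic execution. The proposed approach is practical in real-world scenarios, with an average cognition-to-execution accuracy of approximately 94%.
Reference score (attention score computed by this site): 0.0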
License: http://creativecommons.org/licenses/by/4.0/
Abstract: As intelligent robots become more integrated into human environments, there is a growing need for intuitive and reliable Human-Robot Interaction (HRI) interfaces that are adaptable and more natural to interact with. Traditional robot control methods often require users to adapt to interfaces or memorize predefined commands, limiting usability in dynamic, unstructured environments. This paper presents a novel framework that bridges natural language understanding and robotic execution by combining Large Language Models (LLMs) with Behavior Trees. This integration enables robots to interpret natural language instructions given by users and translate them into executable actions by activating domain-specific plugins. The system supports scalable and modular integration, with a primary focus on perception-based functionalities, such as person tracking and hand gesture recognition. To evaluate the system, a series of real-world experiments was conducted across diverse environments. Experimental results demonstrate that the proposed approach is practical in real-world scenarios, with an average cognition-to-execution accuracy of approximately 94%, making a significant contribution to HRI systems and robots. The complete source code of the framework is publicly available at https://github.com/snt-arg/robot_suite.
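Abstract (reference translation): As intelligent robots become more integrated into human environments, the need for intuitive and reliable Human-Robot Interaction (HRI) interfaces is growing. Conventional robot control methods often require users to adapt to an interface or memorize predefined commands, which limits usability in dynamic, unstructured environments. This paper proposes a novel framework that bridges natural language understanding and robotic execution by combining Large Language Models (LLMs) with Behavior Trees. With this integration, a robot can interpret natural language instructions given by a user and convert them into executable actions by activating domain-specific plugins. The system supports scalable, modular integration and focuses mainly on perception-based functionalities such as person tracking and hand gesture recognition. To evaluate the system, real-world experiments were conducted across a variety of environments. The experimental results show that the proposed approach is practical in real-world scenarios, with an average cognition-to-execution accuracy of approximately 94%, making a significant contribution to HRI systems and robots. The source code of the framework is publicly available at https://github.com/snt-arg/robot_suite.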
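The abstract describes a pipeline in which an instruction is interpreted and then mapped to an executable action by activating a plugin inside a behavior tree. The following is a minimal sketch of that idea, not the authors' implementation: the node classes, the plugin names (`person_tracking`, `gesture_recognition`), the keyword lists, and the helper `handle` are all hypothetical, and a simple keyword match stands in for the LLM call.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Dict, List


class Status(Enum):
    SUCCESS = "success"
    FAILURE = "failure"


# Shared state passed through the tree (a simple "blackboard").
Blackboard = Dict[str, Any]


@dataclass
class Action:
    """Leaf node: runs a function that returns True on success."""
    name: str
    run: Callable[[Blackboard], bool]

    def tick(self, bb: Blackboard) -> Status:
        return Status.SUCCESS if self.run(bb) else Status.FAILURE


@dataclass
class Sequence:
    """Succeeds only if every child succeeds, in order."""
    children: List[Any]

    def tick(self, bb: Blackboard) -> Status:
        for child in self.children:
            if child.tick(bb) is Status.FAILURE:
                return Status.FAILURE
        return Status.SUCCESS


@dataclass
class Fallback:
    """Tries children in order and succeeds at the first success."""
    children: List[Any]

    def tick(self, bb: Blackboard) -> Status:
        for child in self.children:
            if child.tick(bb) is Status.SUCCESS:
                return Status.SUCCESS
        return Status.FAILURE


# Hypothetical plugin registry: plugin name -> trigger keywords.
PLUGINS = {
    "person_tracking": ("follow", "track"),
    "gesture_recognition": ("gesture", "hand"),
}


def interpret(bb: Blackboard) -> bool:
    """Assumption: keyword matching replaces the LLM query used in the paper."""
    text = str(bb["instruction"]).lower()
    for plugin, keywords in PLUGINS.items():
        if any(keyword in text for keyword in keywords):
            bb["plugin"] = plugin
            return True
    return False


def activate(bb: Blackboard) -> bool:
    """Record the selected plugin as active (a real system would start it)."""
    bb.setdefault("active", []).append(bb["plugin"])
    return True


def ask_to_rephrase(bb: Blackboard) -> bool:
    """Fallback branch when no plugin matches the instruction."""
    bb["reply"] = "Could you rephrase that?"
    return True


def handle(instruction: str) -> Blackboard:
    """Tick the tree once for a single user instruction."""
    tree = Fallback([
        Sequence([Action("interpret", interpret), Action("activate", activate)]),
        Action("ask_to_rephrase", ask_to_rephrase),
    ])
    bb: Blackboard = {"instruction": instruction}
    tree.tick(bb)
    return bb
```

Calling `handle("Please follow me")` leaves `person_tracking` in the blackboard's `active` list, while an instruction that matches no keyword takes the fallback branch and sets a `reply` instead. Keeping interpretation and activation as separate leaves makes each step inspectable, which is the interpretability argument for behavior trees.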

Related Papers

$π_0$: A Vision-Language-Action Flow Model for General Robot Control [77.32743739202543]
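This paper proposes a novel flow matching architecture built on top of a pre-trained vision-language model (VLM) to inherit Internet-scale semantic knowledge. We evaluate our model in terms of its ability to perform tasks zero-shot after pre-training, follow language instructions from people, and acquire new skills via fine-tuning.
Reference translation of the paper (metadata) (2024-10-31T17:22:30Z)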
RoboScript: Code Generation for Free-Form Manipulation Tasks across Real and Simulation [77.41969287400977]
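This paper presents RobotScript, a platform for a deployable robot manipulation pipeline powered by code generation. We also present a benchmark for code generation for robot manipulation tasks in free-form natural language. We demonstrate the adaptability of our code generation framework across multiple robot embodiments, including the Franka and UR5 robot arms.
Reference translation of the paper (metadata) (2024-02-22T15:12:00Z)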
Exploring Large Language Models to Facilitate Variable Autonomy for Human-Robot Teaming [4.779196219827508]
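This paper presents a novel framework for a GPT-powered multi-robot testbed environment, based on a Unity Virtual Reality (VR) setting. The system allows users to interact with robot agents in natural language, each powered by an individual GPT core. A user study with 12 participants explores the effectiveness of GPT-4 and, more importantly, user strategies when given the opportunity to converse in natural language within a multi-robot environment.
Reference translation of the paper (metadata) (2023-12-12T12:26:48Z)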
Incremental Learning of Humanoid Robot Behavior from Natural Interaction and Large Language Models [23.945922720555146]
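This work proposes a system that achieves incremental learning of complex behavior from natural interaction. The system is integrated into the robot cognitive architecture of the humanoid robot ARMAR-6.
Reference translation of the paper (metadata) (2023-09-08T13:29:05Z)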
WALL-E: Embodied Robotic WAiter Load Lifting with Large Language Model [92.90127398282209]
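This paper explores the potential of integrating the latest Large Language Models (LLMs) with existing visual grounding and robotic grasping systems. We introduce WALL-E (Embodied Robotic WAiter load lifting with Large Language model) as an example of this integration. We deploy this LLM-empowered system on a physical robot to provide a more user-friendly interface for instruction-guided grasping tasks.
Reference translation of the paper (metadata) (2023-08-30T11:35:21Z)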
Reshaping Robot Trajectories Using Natural Language Commands: A Study of Multi-Modal Data Alignment Using Transformers [33.7939079214046]
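We provide a flexible language-based interface for human-robot collaboration. We take advantage of recent advances in the field of large language models to encode user commands. We train the model using imitation learning over a dataset containing robot trajectories modified by language commands.
Reference translation of the paper (metadata) (2022-03-25T01:36:56Z)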

The related papers list is generated automatically from the titles and abstracts of papers on this site.
