Fugu-MT Paper Translation (Abstract): AnyUser: Translating Sketched User Intent into Domestic Robots

Paper overview: AnyUser: Translating Sketched User Intent into Domestic Robots

arxiv url: http://arxiv.org/abs/2604.04811v1
Date: Mon, 06 Apr 2026 16:16:00 GMT
Status: Translation complete
Last updated in system: 2026-04-07 15:49:19.275938
Title: AnyUser: Translating Sketched User Intent into Domestic Robots
Authors: Songyuan Yang, Huibin Tan, Kailun Yang, Wenjing Yang, Shaowu Yang
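Abstract summary: We introduce AnyUser, a unified robotic instruction system for intuitive domestic task instruction via free-form sketches on camera images. AnyUser interprets multimodal inputs (sketch, vision, language) as spatial-semantic primitives to generate executable robot actions.
Reference score (independently computed attention score): 21.747127540075756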
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We introduce AnyUser, a unified robotic instruction system for intuitive domestic task instruction via free-form sketches on camera images, optionally with language. AnyUser interprets multimodal inputs (sketch, vision, language) as spatial-semantic primitives to generate executable robot actions requiring no prior maps or models. Novel components include multimodal fusion for understanding and a hierarchical policy for robust action generation. Efficacy is shown via extensive evaluations: (1) Quantitative benchmarks on the large-scale dataset showing high accuracy in interpreting diverse sketch-based commands across various simulated domestic scenes. (2) Real-world validation on two distinct robotic platforms, a statically mounted 7-DoF assistive arm (KUKA LBR iiwa) and a dual-arm mobile manipulator (Realman RMC-AIDAL), performing representative tasks like targeted wiping and area cleaning, confirming the system's ability to ground instructions and execute them reliably in physical environments. (3) A comprehensive user study involving diverse demographics (elderly, simulated non-verbal, low technical literacy) demonstrating significant improvements in usability and task specification efficiency, achieving high task completion rates (85.7%-96.4%) and user satisfaction. AnyUser bridges the gap between advanced robotic capabilities and the need for accessible non-expert interaction, laying the foundation for practical assistive robots adaptable to real-world human environments.

Related papers