Fugu-MT paper translation (abstract): Optimizing Grasping in Legged Robots: A Deep Learning Approach to Loco-Manipulation

Paper summary: Optimizing Grasping in Legged Robots: A Deep Learning Approach to Loco-Manipulation

arxiv url: http://arxiv.org/abs/2508.17466v2
Date: Sat, 11 Oct 2025 16:20:50 GMT
Status: translation complete
Last updated in system: 2025-10-14 15:48:09.227698
Title: Optimizing Grasping in Legged Robots: A Deep Learning Approach to Loco-Manipulation
Title (reference translation): Optimization of Grasping in Legged Robots: A Deep Learning Approach to Loco-Manipulation
Authors: Dilermando Almeida, Guilherme Lazzarini, Juliano Negri, Thiago H. Segreto, Ricardo V. Godoy, Marcelo Becker
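Abstract summary: This paper proposes a framework to enhance the grasping capabilities of quadrupeds equipped with arms. We built a pipeline within the Genesis simulation environment to generate a synthetic dataset of grasp attempts on common objects. This dataset was used to train a custom CNN with a U-Net-like architecture that processes multi-modal input from onboard RGB and depth cameras. The complete framework was validated on a four-legged robot.
Reference score (independently computed attention score): 0.6533458718563319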
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This paper presents a deep learning framework designed to enhance the grasping capabilities of quadrupeds equipped with arms, with a focus on improving precision and adaptability. Our approach centers on a sim-to-real methodology that minimizes reliance on physical data collection. We developed a pipeline within the Genesis simulation environment to generate a synthetic dataset of grasp attempts on common objects. By simulating thousands of interactions from various perspectives, we created pixel-wise annotated grasp-quality maps to serve as the ground truth for our model. This dataset was used to train a custom CNN with a U-Net-like architecture that processes multi-modal input from onboard RGB and depth cameras, including RGB images, depth maps, segmentation masks, and surface normal maps. The trained model outputs a grasp-quality heatmap to identify the optimal grasp point. We validated the complete framework on a four-legged robot. The system successfully executed a full loco-manipulation task: autonomously navigating to a target object, perceiving it with its sensors, predicting the optimal grasp pose using our model, and performing a precise grasp. This work proves that leveraging simulated training with advanced sensing offers a scalable and effective solution for object handling.
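Abstract (reference translation): This paper presents a deep learning framework aimed at improving the grasping capabilities of quadrupeds equipped with arms, with a focus on improving precision and adaptability. Our approach centers on a sim-to-real methodology that minimizes reliance on physical data collection. We built a pipeline within the Genesis simulation environment and created a synthetic dataset of grasp attempts on common objects. By simulating thousands of interactions from various viewpoints, we created pixel-wise annotated grasp-quality maps that serve as the ground truth for the model. This dataset was used to train a custom CNN with a U-Net-like architecture that processes multi-modal input from onboard RGB and depth cameras. The trained model outputs a grasp-quality heatmap to identify the optimal grasp point. The complete framework was validated on a four-legged robot. The system successfully executed a full loco-manipulation task: autonomously navigating to a target object, perceiving it with its sensors, predicting the optimal grasp pose using the model, and performing a precise grasp. This work proves that leveraging simulated training with advanced sensing offers a scalable and effective solution for object handling.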
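
The abstract describes two data-handling steps that a small sketch can make concrete: combining the per-pixel modalities into one network input, and reading the grasp point off the predicted heatmap. The following is a minimal illustration with NumPy, not the authors' implementation. The function names, the 8-channel layout (3 RGB + 1 depth + 1 segmentation mask + 3 surface normals), and the use of a plain argmax to pick the grasp point are all assumptions made here for illustration; the abstract does not specify them.

```python
import numpy as np


def stack_modalities(rgb, depth, mask, normals):
    """Concatenate per-pixel inputs into one (H, W, C) float32 array.

    Channel layout is an assumption, not taken from the paper:
    3 RGB + 1 depth + 1 segmentation mask + 3 surface normals = 8.
    """
    return np.concatenate(
        [rgb, depth[..., None], mask[..., None], normals], axis=-1
    ).astype(np.float32)


def best_grasp_pixel(heatmap):
    """Return (row, col) of the highest-scoring pixel in a 2-D heatmap."""
    row, col = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return int(row), int(col)


# Toy arrays standing in for camera output and a model prediction.
h, w = 4, 5
x = stack_modalities(
    rgb=np.zeros((h, w, 3)),
    depth=np.ones((h, w)),
    mask=np.zeros((h, w)),
    normals=np.zeros((h, w, 3)),
)
heatmap = np.zeros((h, w))
heatmap[2, 3] = 0.9  # pretend the model scored this pixel highest

print(x.shape, best_grasp_pixel(heatmap))  # (4, 5, 8) (2, 3)
```

In a real pipeline the heatmap would come from the trained network, and the selected pixel would still have to be converted into a 3-D grasp pose using the depth value and camera calibration, a step the abstract mentions only as "predicting the optimal grasp pose".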

Related papers list