MARS: Multimodal Active Robotic Sensing for Articulated Characterization
- URL: http://arxiv.org/abs/2407.01191v1
- Date: Mon, 1 Jul 2024 11:32:39 GMT
- Title: MARS: Multimodal Active Robotic Sensing for Articulated Characterization
- Authors: Hongliang Zeng, Ping Zhang, Chengjiong Wu, Jiahua Wang, Tingyu Ye, Fang Li
- Abstract summary: We introduce MARS, a novel framework for articulated object characterization.
It features a multi-modal fusion module utilizing multi-scale RGB features to enhance point cloud features.
Our method effectively generalizes to real-world articulated objects, enhancing robot interactions.
- Score: 6.69660410213287
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Precise perception of articulated objects is vital for empowering service robots. Recent studies mainly focus on point cloud, a single-modal approach, often neglecting vital texture and lighting details and assuming ideal conditions like optimal viewpoints, unrepresentative of real-world scenarios. To address these limitations, we introduce MARS, a novel framework for articulated object characterization. It features a multi-modal fusion module utilizing multi-scale RGB features to enhance point cloud features, coupled with reinforcement learning-based active sensing for autonomous optimization of observation viewpoints. In experiments conducted with various articulated object instances from the PartNet-Mobility dataset, our method outperformed current state-of-the-art methods in joint parameter estimation accuracy. Additionally, through active sensing, MARS further reduces errors, demonstrating enhanced efficiency in handling suboptimal viewpoints. Furthermore, our method effectively generalizes to real-world articulated objects, enhancing robot interactions. Code is available at https://github.com/robhlzeng/MARS.
Related papers
- An Information Compensation Framework for Zero-Shot Skeleton-based Action Recognition [49.45660055499103]
Zero-shot human skeleton-based action recognition aims to construct a model that can recognize actions outside the categories seen during training.
Previous research has focused on aligning sequences' visual and semantic spatial distributions.
We introduce a new loss function sampling method to obtain a tight and robust representation.
arXiv Detail & Related papers (2024-06-02T06:53:01Z)
- RPMArt: Towards Robust Perception and Manipulation for Articulated Objects [56.73978941406907]
It is essential that robots can exhibit robust perception and manipulation for articulated objects in real-world robotic applications.
We propose a framework towards Robust Perception and Manipulation for Articulated Objects (RPMArt).
RPMArt learns to estimate the articulation parameters and manipulate the articulation part from the noisy point cloud.
arXiv Detail & Related papers (2024-03-24T05:55:39Z)
- MOKA: Open-Vocabulary Robotic Manipulation through Mark-Based Visual Prompting [106.53784213239479]
We present MOKA (Marking Open-vocabulary Keypoint Affordances), an approach that employs vision language models to solve robotic manipulation tasks.
At the heart of our approach is a compact point-based representation of affordance and motion that bridges the VLM's predictions on RGB images and the robot's motions in the physical world.
We evaluate and analyze MOKA's performance on a variety of manipulation tasks specified by free-form language descriptions.
arXiv Detail & Related papers (2024-03-05T18:08:45Z)
- Point Cloud Matters: Rethinking the Impact of Different Observation Spaces on Robot Learning [58.69297999175239]
In robot learning, the observation space is crucial due to the distinct characteristics of different modalities.
In this study, we explore the influence of various observation spaces on robot learning, focusing on three predominant modalities: RGB, RGB-D, and point cloud.
arXiv Detail & Related papers (2024-02-04T14:18:45Z)
- Mutual Information Regularization for Weakly-supervised RGB-D Salient Object Detection [33.210575826086654]
We present a weakly-supervised RGB-D salient object detection model via scribble supervision.
We focus on effective multimodal representation learning via inter-modal mutual information regularization.
arXiv Detail & Related papers (2023-06-06T12:36:57Z)
- Smart Explorer: Recognizing Objects in Dense Clutter via Interactive Exploration [31.38518623440405]
Recognizing objects in dense clutter accurately plays an important role in a wide variety of robotic manipulation tasks.
We propose an interactive exploration framework called Smart Explorer for recognizing all objects in dense clutter.
arXiv Detail & Related papers (2022-08-06T11:04:04Z)
- Efficient and Robust Training of Dense Object Nets for Multi-Object Robot Manipulation [8.321536457963655]
We propose a framework for robust and efficient training of Dense Object Nets (DON).
We focus on training with multi-object data instead of singulated objects, combined with a well-chosen augmentation scheme.
We demonstrate the robustness and accuracy of our proposed framework on a real-world robotic grasping task.
arXiv Detail & Related papers (2022-06-24T08:24:42Z)
- MetaGraspNet: A Large-Scale Benchmark Dataset for Vision-driven Robotic Grasping via Physics-based Metaverse Synthesis [78.26022688167133]
We present a large-scale benchmark dataset for vision-driven robotic grasping via physics-based metaverse synthesis.
The proposed dataset contains 100,000 images and 25 different object types.
We also propose a new layout-weighted performance metric alongside the dataset for evaluating object detection and segmentation performance.
arXiv Detail & Related papers (2021-12-29T17:23:24Z)
- Nonprehensile Riemannian Motion Predictive Control [57.295751294224765]
We introduce a novel Real-to-Sim reward analysis technique to reliably imagine and predict the outcome of taking possible actions for a real robotic platform.
We produce a closed-loop controller to reactively push objects in a continuous action space.
We observe that RMPC is robust in cluttered as well as occluded environments and outperforms the baselines.
arXiv Detail & Related papers (2021-11-15T18:50:04Z)
- Improving Object Permanence using Agent Actions and Reasoning [8.847502932609737]
Existing approaches learn object permanence from low-level perception.
We argue that object permanence can be improved when the robot uses knowledge about executed actions.
arXiv Detail & Related papers (2021-10-01T07:09:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.