Related papers: MARS: Multimodal Active Robotic Sensing for Articulated Characterization

MARS: Multimodal Active Robotic Sensing for Articulated Characterization

URL: http://arxiv.org/abs/2407.01191v1
Date: Mon, 1 Jul 2024 11:32:39 GMT
Title: MARS: Multimodal Active Robotic Sensing for Articulated Characterization
Authors: Hongliang Zeng, Ping Zhang, Chengjiong Wu, Jiahua Wang, Tingyu Ye, Fang Li,
Abstract summary: We introduce MARS, a novel framework for articulated object characterization. It features a multi-modal fusion module utilizing multi-scale RGB features to enhance point cloud features. Our method effectively generalizes to real-world articulated objects, enhancing robot interactions.
Score: 6.69660410213287
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Precise perception of articulated objects is vital for empowering service robots. Recent studies mainly focus on point cloud, a single-modal approach, often neglecting vital texture and lighting details and assuming ideal conditions like optimal viewpoints, unrepresentative of real-world scenarios. To address these limitations, we introduce MARS, a novel framework for articulated object characterization. It features a multi-modal fusion module utilizing multi-scale RGB features to enhance point cloud features, coupled with reinforcement learning-based active sensing for autonomous optimization of observation viewpoints. In experiments conducted with various articulated object instances from the PartNet-Mobility dataset, our method outperformed current state-of-the-art methods in joint parameter estimation accuracy. Additionally, through active sensing, MARS further reduces errors, demonstrating enhanced efficiency in handling suboptimal viewpoints. Furthermore, our method effectively generalizes to real-world articulated objects, enhancing robot interactions. Code is available at https://github.com/robhlzeng/MARS.

Related papers

Body-Hand Modality Expertized Networks with Cross-attention for Fine-grained Skeleton Action Recognition [28.174638880324014]
BHaRNet is a novel framework that augments a typical body-expert model with a hand-expert model. Our model jointly trains both streams with an ensemble loss that fosters cooperative specialization. Inspired by MMNet, we also demonstrate the applicability of our approach to multi-modal tasks by leveraging RGB information.
arXiv Detail & Related papers (2025-03-19T07:54:52Z)
A Data-Centric Revisit of Pre-Trained Vision Models for Robot Learning [67.72413262980272]
Pre-trained vision models (PVMs) are fundamental to modern robotics, yet their optimal configuration remains unclear. We develop SlotMIM, a method that induces object-centric representations by introducing a semantic bottleneck. Our approach achieves significant improvements over prior work in image recognition, scene understanding, and robot learning evaluations.
arXiv Detail & Related papers (2025-03-10T06:18:31Z)
Keypoint Abstraction using Large Models for Object-Relative Imitation Learning [78.92043196054071]
Generalization to novel object configurations and instances across diverse tasks and environments is a critical challenge in robotics. Keypoint-based representations have been proven effective as a succinct representation for essential object capturing features. We propose KALM, a framework that leverages large pre-trained vision-language models to automatically generate task-relevant and cross-instance consistent keypoints.
arXiv Detail & Related papers (2024-10-30T17:37:31Z)
Moving Object Segmentation in Point Cloud Data using Hidden Markov Models [0.0]
We propose a robust learning-free approach to segment moving objects in point cloud data. The proposed approach is tested on benchmark datasets and consistently performs better than or as well as state-of-the-art methods.
arXiv Detail & Related papers (2024-10-24T10:56:02Z)
Articulated Object Manipulation using Online Axis Estimation with SAM2-Based Tracking [59.87033229815062]
Articulated object manipulation requires precise object interaction, where the object's axis must be carefully considered. Previous research employed interactive perception for manipulating articulated objects, but typically, open-loop approaches often suffer from overlooking the interaction dynamics. We present a closed-loop pipeline integrating interactive perception with online axis estimation from segmented 3D point clouds.
arXiv Detail & Related papers (2024-09-24T17:59:56Z)
RPMArt: Towards Robust Perception and Manipulation for Articulated Objects [56.73978941406907]
We propose a framework towards Robust Perception and Manipulation for Articulated Objects ( RPMArt) RPMArt learns to estimate the articulation parameters and manipulate the articulation part from the noisy point cloud. We introduce an articulation-aware classification scheme to enhance its ability for sim-to-real transfer.
arXiv Detail & Related papers (2024-03-24T05:55:39Z)
Point Cloud Matters: Rethinking the Impact of Different Observation Spaces on Robot Learning [58.69297999175239]
In robot learning, the observation space is crucial due to the distinct characteristics of different modalities. In this study, we explore the influence of various observation spaces on robot learning, focusing on three predominant modalities: RGB, RGB-D, and point cloud.
arXiv Detail & Related papers (2024-02-04T14:18:45Z)
Smart Explorer: Recognizing Objects in Dense Clutter via Interactive Exploration [31.38518623440405]
Recognizing objects in dense clutter accurately plays an important role to a wide variety of robotic manipulation tasks. We propose an interactive exploration framework called Smart Explorer for recognizing all objects in dense clutters.
arXiv Detail & Related papers (2022-08-06T11:04:04Z)
MetaGraspNet: A Large-Scale Benchmark Dataset for Vision-driven Robotic Grasping via Physics-based Metaverse Synthesis [78.26022688167133]
We present a large-scale benchmark dataset for vision-driven robotic grasping via physics-based metaverse synthesis. The proposed dataset contains 100,000 images and 25 different object types. We also propose a new layout-weighted performance metric alongside the dataset for evaluating object detection and segmentation performance.
arXiv Detail & Related papers (2021-12-29T17:23:24Z)
Nonprehensile Riemannian Motion Predictive Control [57.295751294224765]
We introduce a novel Real-to-Sim reward analysis technique to reliably imagine and predict the outcome of taking possible actions for a real robotic platform. We produce a closed-loop controller to reactively push objects in a continuous action space. We observe that RMPC is robust in cluttered as well as occluded environments and outperforms the baselines.
arXiv Detail & Related papers (2021-11-15T18:50:04Z)
Improving Object Permanence using Agent Actions and Reasoning [8.847502932609737]
Existing approaches learn object permanence from low-level perception. We argue that object permanence can be improved when the robot uses knowledge about executed actions.
arXiv Detail & Related papers (2021-10-01T07:09:49Z)

This list is automatically generated from the titles and abstracts of the papers in this site.