Grasp-MPC: Closed-Loop Visual Grasping via Value-Guided Model Predictive Control
- URL: http://arxiv.org/abs/2509.06201v1
- Date: Sun, 07 Sep 2025 20:28:21 GMT
- Title: Grasp-MPC: Closed-Loop Visual Grasping via Value-Guided Model Predictive Control
- Authors: Jun Yamada, Adithyavairavan Murali, Ajay Mandlekar, Clemens Eppner, Ingmar Posner, Balakumar Sundaralingam
- Abstract summary: We propose Grasp-MPC, a closed-loop vision-based grasping policy for novel objects in cluttered environments. Grasp-MPC incorporates a value function, trained on visual observations from a large-scale synthetic dataset of 2 million grasp trajectories. We evaluate Grasp-MPC on FetchBench and in real-world settings across diverse environments.
- Score: 24.588260602136867
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Grasping diverse objects in unstructured environments remains a significant challenge. Open-loop grasping methods, effective in controlled settings, struggle in cluttered environments: grasp prediction errors and object pose changes during grasping are the main causes of failure. In contrast, closed-loop methods address these challenges only in simplified settings (e.g., a single object on a table) on a limited set of objects, with no path to generalization. We propose Grasp-MPC, a closed-loop 6-DoF vision-based grasping policy designed for robust and reactive grasping of novel objects in cluttered environments. Grasp-MPC incorporates a value function, trained on visual observations from a large-scale synthetic dataset of 2 million grasp trajectories that include both successful and failed attempts. We deploy this learned value function in an MPC framework in combination with other cost terms that encourage collision avoidance and smooth execution. We evaluate Grasp-MPC on FetchBench and in real-world settings across diverse environments. Grasp-MPC improves grasp success rates by up to 32.6% in simulation and 33.3% in noisy real-world conditions, outperforming open-loop, diffusion policy, transformer policy, and IQL approaches. Videos and more at http://grasp-mpc.github.io.
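The abstract outlines the core recipe: a learned grasp value function scored inside a sampling-based MPC loop, combined with collision and smoothness cost terms. The sketch below illustrates that pattern only; `grasp_value` and `collision_cost` are hypothetical stand-ins for the paper's trained network and exact costs, not the authors' implementation.

```python
import numpy as np

def grasp_value(terminal_positions):
    """Placeholder for the learned value function: higher means the rollout
    ends closer to a successful grasp. A dummy distance-to-target stands in
    for the network trained on the 2M-trajectory dataset."""
    target = np.array([0.5, 0.0, 0.2])
    return -np.linalg.norm(terminal_positions - target, axis=1)

def collision_cost(positions):
    """Toy stand-in for a collision / signed-distance penalty (0 when clear)."""
    return np.maximum(0.0, 0.05 - positions[..., 2])  # penalize dipping below 5 cm

def mpc_step(ee_position, n_samples=256, horizon=8, sigma=0.02,
             w_value=1.0, w_coll=10.0, w_smooth=0.1):
    """One sampling-based MPC step: sample delta-position sequences, score each
    rollout with learned value + collision + smoothness terms, and return the
    first action of the best sequence (translation only, for brevity)."""
    deltas = np.random.randn(n_samples, horizon, 3) * sigma
    rollouts = ee_position + np.cumsum(deltas, axis=1)   # (N, H, 3) positions
    value = grasp_value(rollouts[:, -1, :])              # terminal grasp value
    coll = collision_cost(rollouts).sum(axis=1)          # accumulated collision penalty
    smooth = np.square(np.diff(deltas, axis=1)).sum(axis=(1, 2))
    score = w_value * value - w_coll * coll - w_smooth * smooth
    return deltas[np.argmax(score), 0]                   # execute first action only

action = mpc_step(np.array([0.3, 0.1, 0.4]))             # replan every control step
```

Because only the first action of the best-scoring sequence is executed before replanning, such a controller can react to grasp prediction errors and object pose changes at every control step, which is the closed-loop behavior the abstract emphasizes.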
Related papers
- Multi-Paradigm Collaborative Adversarial Attack Against Multi-Modal Large Language Models [67.45032003041399]
We propose a novel Multi-Paradigm Collaborative Attack (MPCAttack) framework to boost the transferability of adversarial examples against MLLMs. MPCO adaptively balances the importance of different paradigm representations and guides the global optimisation. Our solution consistently outperforms state-of-the-art methods in both targeted and untargeted attacks on open-source and closed-source MLLMs.
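How the paradigm losses are balanced is not specified in the summary; below is a purely illustrative PGD sketch in which hypothetical per-paradigm losses are adaptively reweighted toward whichever one is lagging.

```python
import torch

def multi_paradigm_pgd(image, paradigm_losses, steps=10, eps=8/255, alpha=2/255):
    """Illustrative PGD loop over several paradigm-specific adversarial losses;
    `paradigm_losses` is a list of callables mapping a perturbed image to a
    scalar loss to maximize. Not the paper's actual algorithm."""
    delta = torch.zeros_like(image, requires_grad=True)
    for _ in range(steps):
        vals = torch.stack([loss(image + delta) for loss in paradigm_losses])
        # Emphasize lagging paradigms (one plausible reading of "adaptively
        # balances the importance of different paradigm representations").
        weights = torch.softmax(-vals.detach(), dim=0)
        (weights * vals).sum().backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()
            delta.clamp_(-eps, eps)
        delta.grad.zero_()
    return (image + delta).detach()

img = torch.rand(1, 3, 224, 224)
adv = multi_paradigm_pgd(img, [lambda x: x.mean(), lambda x: -x.std()])  # toy losses
```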
arXiv Detail & Related papers (2026-03-05T06:01:26Z) - M4Diffuser: Multi-View Diffusion Policy with Manipulability-Aware Control for Robust Mobile Manipulation [17.9979990426915]
M4Diffuser is a hybrid framework that integrates a Multi-View Diffusion Policy with a novel Reduced and Manipulability-aware QP controller for mobile manipulation. Our approach demonstrates robust performance for smooth whole-body coordination and strong generalization to unseen tasks.
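The summary does not spell out the manipulability objective; a common quantity such a controller could reward is Yoshikawa's measure, sketched below (illustrative, not the paper's exact QP cost).

```python
import numpy as np

def yoshikawa_manipulability(J):
    """Yoshikawa's measure w = sqrt(det(J J^T)); it shrinks to zero near
    singular configurations, so a manipulability-aware QP can keep the whole
    body away from singularities by rewarding w (or penalizing -w) in its cost."""
    return float(np.sqrt(max(np.linalg.det(J @ J.T), 0.0)))

J = np.random.randn(6, 9)   # toy 6x9 Jacobian for an arm plus mobile base
print(yoshikawa_manipulability(J))
```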
arXiv Detail & Related papers (2025-09-18T14:09:53Z) - Corner-Grasp: Multi-Action Grasp Detection and Active Gripper Adaptation for Grasping in Cluttered Environments [0.3565151496245486]
We propose a method for effective grasping in cluttered bin-picking environments. We utilize a multi-functional gripper that combines both suction and finger grasping. We also present an active gripper adaptation strategy to minimize collisions between the gripper hardware and the surrounding environment.
arXiv Detail & Related papers (2025-04-02T16:12:28Z) - Easy-Poly: A Easy Polyhedral Framework For 3D Multi-Object Tracking [23.40561503456164]
We present Easy-Poly, a real-time, filter-based 3D MOT framework for multiple object categories. Results show that Easy-Poly outperforms state-of-the-art methods such as Poly-MOT and Fast-Poly. These findings highlight Easy-Poly's adaptability and robustness in diverse scenarios.
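As a rough illustration of what a filter-based 3D MOT framework maintains per object, here is a minimal constant-velocity Kalman track; Easy-Poly's actual motion models and association logic are more elaborate.

```python
import numpy as np

class Track3D:
    """Minimal constant-velocity Kalman track over [x y z vx vy vz]
    (illustrative only; not Easy-Poly's implementation)."""
    def __init__(self, xyz, dt=0.1):
        self.x = np.concatenate([xyz, np.zeros(3)])
        self.P = np.eye(6)
        self.F = np.eye(6); self.F[:3, 3:] = dt * np.eye(3)   # constant velocity
        self.H = np.hstack([np.eye(3), np.zeros((3, 3))])     # observe position only
        self.Q = 0.01 * np.eye(6); self.R = 0.1 * np.eye(3)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:3]                                     # predicted position

    def update(self, z):
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)              # Kalman gain
        self.x = self.x + K @ (z - self.H @ self.x)
        self.P = (np.eye(6) - K @ self.H) @ self.P
```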
arXiv Detail & Related papers (2025-02-25T04:01:25Z) - Code-as-Monitor: Constraint-aware Visual Programming for Reactive and Proactive Robotic Failure Detection [56.66677293607114]
We propose Code-as-Monitor (CaM) for both open-set reactive and proactive failure detection. To enhance the accuracy and efficiency of monitoring, we introduce constraint elements that abstract constraint-related entities. Experiments show that CaM achieves a 28.7% higher success rate and reduces execution time by 31.8% under severe disturbances.
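A "constraint element" can be read as a named predicate checked at every control tick; the sketch below shows that reading with hypothetical names (CaM itself derives such checks from visual programs, which this sketch does not attempt).

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Constraint:
    """One constraint element: a named predicate over the current observation."""
    name: str
    satisfied: Callable[[dict], bool]

def monitor_step(obs: dict, constraints: list[Constraint]) -> list[str]:
    """Return the names of violated constraints; empty means keep executing."""
    return [c.name for c in constraints if not c.satisfied(obs)]

# Toy usage: detect a dropped object reactively.
constraints = [Constraint("object_in_gripper", lambda o: o["grip_force"] > 0.5)]
failures = monitor_step({"grip_force": 0.1}, constraints)  # -> ["object_in_gripper"]
```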
arXiv Detail & Related papers (2024-12-05T18:58:27Z) - BEVTrack: A Simple and Strong Baseline for 3D Single Object Tracking in Bird's-Eye View [54.48052449493636]
3D Single Object Tracking (SOT) is a fundamental task in computer vision and plays a critical role in applications like autonomous driving. We propose BEVTrack, a simple yet effective motion-based tracking method. We show that BEVTrack achieves state-of-the-art results while operating at 200 FPS, enabling real-time applicability.
arXiv Detail & Related papers (2023-09-05T12:42:26Z) - Learning to Mask and Permute Visual Tokens for Vision Transformer Pre-Training [55.12082817901671]
We propose a new self-supervised pre-training approach, named Masked and Permuted Vision Transformer (MaPeT). MaPeT employs autoregressive and permuted predictions to capture intra-patch dependencies. Our results demonstrate that MaPeT achieves competitive performance on ImageNet, compared to baselines and competitors under the same model setting.
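A minimal sketch of the permuted-prediction idea, assuming an XLNet-style random factorization order over patch tokens (MaPeT's exact masking scheme may differ):

```python
import torch

def permuted_prediction_split(tokens, n_targets):
    """Draw a random factorization order over a patch-token sequence and
    predict the last `n_targets` tokens of that order from the earlier ones.
    Returns (context indices, target indices) for one training example."""
    order = torch.randperm(tokens.size(0))
    return order[:-n_targets], order[-n_targets:]

tokens = torch.arange(196)   # e.g., 14x14 ViT patch tokens
ctx_idx, tgt_idx = permuted_prediction_split(tokens, n_targets=49)
```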
arXiv Detail & Related papers (2023-06-12T18:12:19Z) - Environment-aware Interactive Movement Primitives for Object Reaching in Clutter [4.5459332718995205]
We propose a constrained multi-objective optimization framework (OptI-ProMP) to approach the problem of reaching a target in compact clutter.
OptI-ProMP features costs related to static, dynamic, and pushable objects in the target neighborhood, and it relies on probabilistic primitives for problem initialisation.
We tested both ProMP-based planners from the literature and OptI-ProMP in a simulated poly-tunnel, on low-dexterity (3-DoF) and high-dexterity (7-DoF) robot bodies.
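A minimal ProMP sketch under stated assumptions: trajectories are sampled as Gaussian-weighted RBF features, and a weighted sum of costs (where static, dynamic, and pushable-object terms would slot in) selects among the samples. All names are illustrative.

```python
import numpy as np

# Gaussian RBF basis for a 1-D ProMP (real use would be per-DoF).
T, n_basis = 50, 8
t = np.linspace(0, 1, T)
centers = np.linspace(0, 1, n_basis)
Phi = np.exp(-((t[:, None] - centers[None, :]) ** 2) / 0.02)   # (T, n_basis)

def sample_promp(mean_w, cov_w, n=64):
    """Sample trajectories y(t) = Phi @ w with weights w ~ N(mean_w, cov_w)."""
    W = np.random.multivariate_normal(mean_w, cov_w, size=n)   # (n, n_basis)
    return W @ Phi.T                                           # (n, T)

def pick_best(trajs, costs, weights):
    """Weighted multi-objective selection; each cost maps (n, T) -> (n,)."""
    total = sum(w * c(trajs) for w, c in zip(weights, costs))
    return trajs[np.argmin(total)]

trajs = sample_promp(np.zeros(n_basis), np.eye(n_basis))
best = pick_best(trajs, [lambda y: np.abs(y[:, -1] - 1.0)], [1.0])  # toy goal cost
```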
arXiv Detail & Related papers (2022-10-28T15:03:23Z) - Neural Motion Fields: Encoding Grasp Trajectories as Implicit Value Functions [65.84090965167535]
We present Neural Motion Fields, a novel object representation which encodes both object point clouds and the relative task trajectories as an implicit value function parameterized by a neural network.
This object-centric representation models a continuous distribution over the SE(3) space and allows us to perform grasping reactively by leveraging sampling-based MPC to optimize this value function.
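A minimal sketch of grasping reactively against such an implicit value function; `value_net` and its signature are assumptions, and the additive pose perturbation is a simplification of proper SE(3) sampling.

```python
import torch

def reactive_grasp_step(value_net, points, ee_pose, n=512, sigma=0.01):
    """One reactive update: perturb the current end-effector pose, query the
    implicit value function on (point cloud, pose) pairs, move to the best
    sample. Poses are 7-D (xyz + quaternion); `points` is an (P, 3) cloud."""
    candidates = ee_pose.unsqueeze(0) + torch.randn(n, 7) * sigma
    candidates[:, 3:] = candidates[:, 3:] / candidates[:, 3:].norm(dim=1, keepdim=True)
    with torch.no_grad():
        values = value_net(points.unsqueeze(0).expand(n, -1, -1), candidates)
    return candidates[values.squeeze(-1).argmax()]
```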
arXiv Detail & Related papers (2022-06-29T18:47:05Z) - Nonprehensile Riemannian Motion Predictive Control [57.295751294224765]
We introduce a novel Real-to-Sim reward analysis technique to reliably imagine and predict the outcome of taking possible actions for a real robotic platform.
We produce a closed-loop controller to reactively push objects in a continuous action space.
We observe that RMPC is robust in cluttered as well as occluded environments and outperforms the baselines.
arXiv Detail & Related papers (2021-11-15T18:50:04Z) - Collision-Aware Target-Driven Object Grasping in Constrained Environments [10.934615956723672]
We propose a novel Collision-Aware Reachability Predictor (CARP) for 6-DoF grasping systems.
The CARP learns to estimate the collision-free probabilities for grasp poses and significantly improves grasping in challenging environments.
Experiments in both simulation and the real world show that our approach achieves a grasp success rate of more than 75% on novel objects.
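The filtering role such a predictor plays can be sketched as re-ranking candidate grasps by grasp quality weighted by predicted collision-free probability; `carp_net` and its signature are placeholders, not the paper's API.

```python
import numpy as np

def rank_grasps(grasp_poses, grasp_scores, carp_net):
    """Re-rank candidate grasps by quality times predicted collision-free
    probability; returns indices, best grasp first (illustrative only)."""
    p_free = carp_net(grasp_poses)                 # (N,) collision-free probabilities
    combined = np.asarray(grasp_scores) * np.asarray(p_free)
    return np.argsort(-combined)

# Toy usage with a dummy predictor.
poses = np.random.randn(5, 7)                      # 5 candidate 7-D grasp poses
order = rank_grasps(poses, np.ones(5), lambda p: np.random.rand(len(p)))
```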
arXiv Detail & Related papers (2021-04-01T21:44:07Z)