Consolidating Kinematic Models to Promote Coordinated Mobile Manipulations
- URL: http://arxiv.org/abs/2108.01264v3
- Date: Tue, 10 Aug 2021 07:57:11 GMT
- Title: Consolidating Kinematic Models to Promote Coordinated Mobile Manipulations
- Authors: Ziyuan Jiao, Zeyu Zhang, Xin Jiang, David Han, Song-Chun Zhu, Yixin
Zhu, Hangxin Liu
- Abstract summary: We construct a Virtual Kinematic Chain (VKC) that consolidates the kinematics of the mobile base, the arm, and the object to be manipulated in mobile manipulations.
A mobile manipulation task is represented by altering the state of the constructed VKC, which can be converted to a motion planning problem.
- Score: 96.03270112422514
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We construct a Virtual Kinematic Chain (VKC) that readily consolidates the
kinematics of the mobile base, the arm, and the object to be manipulated in
mobile manipulations. Accordingly, a mobile manipulation task is represented by
altering the state of the constructed VKC, which can be converted to a motion
planning problem, formulated, and solved by trajectory optimization. This new
VKC perspective of mobile manipulation allows a service robot to (i) produce
well-coordinated motions, suitable for complex household environments, and (ii)
perform intricate multi-step tasks while interacting with multiple objects
without an explicit definition of intermediate goals. In simulated experiments,
we validate these advantages by comparing the VKC-based approach with baselines
that solely optimize individual components. The results demonstrate that
VKC-based joint modeling and planning improve task success rates and produce
more efficient trajectories.
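Below is a minimal sketch of the core idea under stated assumptions: a planar mobile base, a two-link arm, and a door hinge are stacked into a single virtual chain, and "altering the VKC state" (opening the door) becomes one trajectory optimization over all degrees of freedom at once. All geometry, weights, and the solver choice are illustrative, not the authors' implementation.

```python
# Minimal VKC sketch: base (x, y, yaw) + 2-link arm + door hinge form one
# chain; the task is a terminal constraint on the door joint, and the
# grasp term couples base, arm, and object so motion comes out coordinated.
import numpy as np
from scipy.optimize import minimize

L1, L2 = 0.5, 0.4             # arm link lengths (assumed)
HINGE = np.array([2.0, 1.0])  # door hinge position in the world (assumed)
R_DOOR = 0.8                  # hinge-to-handle radius (assumed)

def ee_pos(q):
    """Forward kinematics of the consolidated chain: planar base + 2R arm."""
    x, y, yaw, q1, q2 = q[:5]
    a1, a2 = yaw + q1, yaw + q1 + q2
    return np.array([x + L1*np.cos(a1) + L2*np.cos(a2),
                     y + L1*np.sin(a1) + L2*np.sin(a2)])

def handle_pos(q_door):
    """Handle position as a function of the object's (door's) joint angle."""
    return HINGE + R_DOOR*np.array([np.cos(q_door), np.sin(q_door)])

T, N = 20, 6             # timesteps; DoFs: base(3) + arm(2) + door(1)
q_door_goal = np.pi / 3  # the task: alter the VKC state (open the door)

def cost(traj):
    Q = traj.reshape(T, N)
    smooth = np.sum(np.diff(Q, axis=0)**2)                 # well-coordinated motion
    grasp = sum(np.sum((ee_pos(Q[t]) - handle_pos(Q[t, 5]))**2)
                for t in range(T))                         # hand tracks the handle
    goal = 100.0 * (Q[-1, 5] - q_door_goal)**2             # desired final VKC state
    return smooth + 10.0*grasp + goal

res = minimize(cost, np.zeros(T * N), method="L-BFGS-B")
print("final door angle:", res.x.reshape(T, N)[-1, 5])
```

Because the door joint is part of the chain, a single objective coordinates base, arm, and object motion with no hand-specified intermediate goals.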
Related papers
- From Perception to Action: An Interactive Benchmark for Vision Reasoning [51.11355591375073]
The Causal Hierarchy of Actions and Interactions (CHAIN) benchmark is designed to evaluate whether models can understand, plan, and execute structured action sequences grounded in physical constraints. CHAIN shifts evaluation from passive perception to active problem solving, spanning tasks such as interlocking mechanical puzzles and 3D stacking and packing. Our results show that top-performing models still struggle to internalize physical structure and causal constraints, often failing to produce reliable long-horizon plans or to robustly translate perceived structure into effective actions.
arXiv Detail & Related papers (2026-02-24T15:33:02Z)
- PosA-VLA: Enhancing Action Generation via Pose-Conditioned Anchor Attention [92.85371254435074]
The PosA-VLA framework anchors visual attention via pose-conditioned supervision, consistently guiding the model's perception toward task-relevant regions. We show that our method executes embodied tasks with precise and time-efficient behavior across diverse robotic manipulation benchmarks.
arXiv Detail & Related papers (2025-12-03T12:14:29Z)
- MotionVerse: A Unified Multimodal Framework for Motion Comprehension, Generation and Editing [53.98607267063729]
MotionVerse is a framework to comprehend, generate, and edit human motion in both single-person and multi-person scenarios. We employ a motion tokenizer with residual quantization, which converts continuous motion sequences into multi-stream discrete tokens. We also introduce a Delay Parallel Modeling strategy, which temporally staggers the encoding of residual token streams; a toy sketch of both mechanisms follows below.
arXiv Detail & Related papers (2025-09-28T04:20:56Z)
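A toy numpy sketch of the two mechanisms named above, under assumed codebook sizes and stream counts (not the MotionVerse code): residual quantization turns each continuous frame into multi-stream tokens, and the delay-parallel layout staggers stream s by s timesteps.

```python
import numpy as np

rng = np.random.default_rng(0)
D, K, S = 8, 32, 3                        # feature dim, codebook size, streams
codebooks = rng.normal(size=(S, K, D))    # one codebook per residual level

def rvq_encode(frame):
    """Quantize a frame into S tokens; each level quantizes the residual."""
    tokens, residual = [], frame.copy()
    for s in range(S):
        idx = int(np.argmin(np.sum((codebooks[s] - residual)**2, axis=1)))
        tokens.append(idx)
        residual = residual - codebooks[s][idx]   # pass on what is left
    return tokens

motion = rng.normal(size=(5, D))                  # 5 continuous motion frames
grid = np.array([rvq_encode(f) for f in motion])  # (T, S) token grid

# Delay-parallel layout: stream s starts s steps late (PAD = -1), so at
# step t the model emits level s of frame t-s while streams stay parallel.
T = grid.shape[0]
delayed = np.full((T + S - 1, S), -1)
for s in range(S):
    delayed[s:s + T, s] = grid[:, s]
print(delayed)
```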
- AC-DiT: Adaptive Coordination Diffusion Transformer for Mobile Manipulation [31.314066269767057]
Mobile manipulation has attracted increasing attention for enabling language-conditioned robotic control in household tasks. Existing methods fail to explicitly model the influence of the mobile base on manipulator control. We propose the Adaptive Coordination Diffusion Transformer (AC-DiT) to enhance mobile base and manipulator coordination.
arXiv Detail & Related papers (2025-07-02T17:59:54Z)
- SILK: Smooth InterpoLation frameworK for motion in-betweening: A Simplified Computational Approach [1.7812314225208412]
Motion in-betweening is a crucial tool for animators, enabling control over pose-level details in each frame. Recent machine learning solutions for motion in-betweening rely on complex models, skeleton-aware architectures, or multiple modules and training steps. We introduce a simple yet effective Transformer-based framework, employing a single encoder to synthesize realistic motions for motion in-betweening tasks; a minimal sketch of this setup follows below.
arXiv Detail & Related papers (2025-06-09T19:26:27Z)
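A hedged sketch of single-encoder in-betweening (shapes and layer sizes are assumptions, not the SILK architecture): keyframe poses are kept, missing frames are replaced by a learned mask token, and one Transformer encoder predicts the full sequence in a single pass.

```python
import torch
import torch.nn as nn

class InBetweener(nn.Module):
    def __init__(self, pose_dim=66, d_model=128, max_len=64):
        super().__init__()
        self.embed = nn.Linear(pose_dim, d_model)
        self.mask_token = nn.Parameter(torch.zeros(d_model))
        self.pos = nn.Parameter(torch.zeros(max_len, d_model))
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(d_model, pose_dim)

    def forward(self, poses, known):       # poses: (B,T,pose_dim), known: (B,T) bool
        h = self.embed(poses)
        h = torch.where(known.unsqueeze(-1), h, self.mask_token)  # mask unknowns
        h = h + self.pos[: h.shape[1]]
        return self.head(self.encoder(h))  # predicted poses for every frame

B, T = 2, 16
poses = torch.randn(B, T, 66)
known = torch.zeros(B, T, dtype=torch.bool)
known[:, [0, T - 1]] = True                # only the two keyframes are given
pred = InBetweener()(poses, known)         # (B, T, 66): fills the gap
```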
- Learning Video Generation for Robotic Manipulation with Collaborative Trajectory Control [72.00655365269]
We present RoboMaster, a novel framework that models inter-object dynamics through a collaborative trajectory formulation. Unlike prior methods that decompose objects, our core idea is to decompose the interaction process into three sub-stages: pre-interaction, interaction, and post-interaction. Our method outperforms existing approaches, establishing new state-of-the-art performance in trajectory-controlled video generation for robotic manipulation.
arXiv Detail & Related papers (2025-06-02T17:57:06Z)
- Dita: Scaling Diffusion Transformer for Generalist Vision-Language-Action Policy [56.424032454461695]
We present Dita, a scalable framework that leverages Transformer architectures to directly denoise continuous action sequences.
Dita employs in-context conditioning -- enabling fine-grained alignment between denoised actions and raw visual tokens from historical observations.
Dita effectively integrates cross-embodiment datasets across diverse camera perspectives, observation scenes, tasks, and action spaces; an illustrative sketch of the denoising step follows below.
arXiv Detail & Related papers (2025-03-25T15:19:56Z)
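An illustrative sketch, with assumed shapes and a toy corruption in place of a real noise schedule (not Dita's implementation), of denoising a continuous action chunk with in-context conditioning: observation tokens and noised action tokens share one Transformer sequence, so each denoised action can attend to raw visual tokens from history.

```python
import torch
import torch.nn as nn

class ActionDenoiser(nn.Module):
    def __init__(self, act_dim=7, d_model=128):
        super().__init__()
        self.act_in = nn.Linear(act_dim, d_model)
        self.t_embed = nn.Embedding(1000, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=4)
        self.act_out = nn.Linear(d_model, act_dim)

    def forward(self, obs_tokens, noisy_actions, t):
        # In-context conditioning: prepend observation tokens to the sequence.
        a = self.act_in(noisy_actions) + self.t_embed(t)[:, None, :]
        h = self.backbone(torch.cat([obs_tokens, a], dim=1))
        return self.act_out(h[:, obs_tokens.shape[1]:])  # noise on action slots

B, H, A = 2, 8, 7                     # batch, action horizon, action dim
obs = torch.randn(B, 16, 128)         # visual tokens from history (assumed)
actions = torch.randn(B, H, A)
t = torch.randint(0, 1000, (B,))
noise = torch.randn_like(actions)
noisy = actions + noise               # toy corruption; real diffusion uses a schedule
pred_noise = ActionDenoiser()(obs, noisy, t)
loss = nn.functional.mse_loss(pred_noise, noise)
```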
- ManipGPT: Is Affordance Segmentation by Large Vision Models Enough for Articulated Object Manipulation? [17.356760351203715]
This paper introduces ManipGPT, a framework designed to predict optimal interaction areas for articulated objects.
We created a dataset of 9.9k simulated and real images to bridge the sim-to-real gap.
We significantly improved part-level affordance segmentation, adapting the model's in-context segmentation capabilities to robot manipulation scenarios.
arXiv Detail & Related papers (2024-12-13T11:22:01Z)
- Self-Supervised Learning of Grasping Arbitrary Objects On-the-Move [8.445514342786579]
This study introduces three fully convolutional neural network (FCN) models to predict the static grasp primitive, dynamic grasp primitive, and residual moving velocity error from visual inputs; a rough sketch of this design follows below. The proposed method achieved the highest grasping accuracy and pick-and-place efficiency.
arXiv Detail & Related papers (2024-11-15T02:59:16Z)
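A rough sketch (layer sizes assumed, not the paper's exact networks) of the three predictions named above, here folded into three heads on a shared fully convolutional trunk: per-pixel static and dynamic grasp-primitive maps plus a residual moving-velocity correction.

```python
import torch
import torch.nn as nn

class GraspFCN(nn.Module):
    def __init__(self, n_primitives=4):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU())
        self.static_head = nn.Conv2d(64, n_primitives, 1)   # static grasp map
        self.dynamic_head = nn.Conv2d(64, n_primitives, 1)  # dynamic grasp map
        self.vel_head = nn.Sequential(                      # residual velocity
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 2))

    def forward(self, rgb):
        h = self.trunk(rgb)
        return self.static_head(h), self.dynamic_head(h), self.vel_head(h)

# Per-pixel primitive scores plus a 2D velocity correction from one image.
static_q, dynamic_q, d_vel = GraspFCN()(torch.randn(1, 3, 96, 96))
```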
- Cooperative and Asynchronous Transformer-based Mission Planning for Heterogeneous Teams of Mobile Robots [1.1049608786515839]
We propose a Cooperative and Asynchronous Transformer-based Mission Planning (CATMiP) framework to coordinate distributed decision making among agents.
We evaluate CATMiP in a 2D grid-world simulation environment and compare its performance against planning-based exploration methods.
arXiv Detail & Related papers (2024-10-08T21:14:09Z)
- CAS-ViT: Convolutional Additive Self-attention Vision Transformers for Efficient Mobile Applications [59.193626019860226]
Vision Transformers (ViTs) mark a revolutionary advance in neural networks, with their token mixers providing a powerful global context capability. We introduce CAS-ViT: Convolutional Additive Self-attention Vision Transformers. We show that CAS-ViT achieves competitive performance compared to other state-of-the-art backbones; a loose sketch of additive token mixing follows below.
arXiv Detail & Related papers (2024-08-07T11:33:46Z)
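A loose sketch of the general idea of additive token mixing: instead of the quadratic Q·K^T softmax, context is formed by adding per-token cues and mixing them with cheap depthwise convolutions. This is a generic illustration only; CAS-ViT's actual block differs in its details.

```python
import torch
import torch.nn as nn

class AdditiveTokenMixer(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.q = nn.Conv2d(dim, dim, 1)
        self.k = nn.Conv2d(dim, dim, 1)
        self.v = nn.Conv2d(dim, dim, 1)
        self.spatial = nn.Conv2d(dim, dim, 3, padding=1, groups=dim)  # depthwise
        self.proj = nn.Conv2d(dim, dim, 1)

    def forward(self, x):                   # x: (B, C, H, W) token map
        ctx = torch.sigmoid(self.q(x)) + torch.sigmoid(self.k(x))  # additive cue
        ctx = self.spatial(ctx)              # local mixing, linear in tokens
        return self.proj(ctx * self.v(x))    # modulate values with context

# Cost grows as O(H*W) rather than O((H*W)^2) for pairwise attention.
y = AdditiveTokenMixer()(torch.randn(1, 64, 14, 14))
```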
- Nonprehensile Riemannian Motion Predictive Control [57.295751294224765]
We introduce a novel Real-to-Sim reward analysis technique to reliably imagine and predict the outcome of taking possible actions for a real robotic platform.
We produce a closed-loop controller to reactively push objects in a continuous action space.
We observe that RMPC is robust in cluttered as well as occluded environments and outperforms the baselines.
arXiv Detail & Related papers (2021-11-15T18:50:04Z)
- Efficient Task Planning for Mobile Manipulation: a Virtual Kinematic Chain Perspective [88.25410628450453]
We present a Virtual Kinematic Chain perspective to improve task planning efficacy for mobile manipulation.
By consolidating the kinematics of the mobile base, the arm, and the object being manipulated collectively as a whole, this novel VKC perspective naturally defines abstract actions.
In experiments, we implement a task planner using the Planning Domain Definition Language (PDDL) with VKC; a toy sketch of the planning benefit follows below.
arXiv Detail & Related papers (2021-08-03T02:49:18Z)
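A toy STRIPS-style sketch with invented predicates (not the authors' PDDL domain). The point it illustrates: a VKC abstract action such as grasp(door) couples base, arm, and object in one symbolic step and leaves feasibility to the VKC motion planner, so no intermediate goals like "base at pose P" appear in the task plan.

```python
ACTIONS = {
    # name: (preconditions, add effects, delete effects)
    "grasp(door)": ({"free(robot)"}, {"grasped(door)"}, {"free(robot)"}),
    "open(door)":  ({"grasped(door)"}, {"open(door)"}, set()),
}

def plan(init, goal):
    """Breadth-first forward search over STRIPS states."""
    frontier, seen = [(frozenset(init), [])], set()
    while frontier:
        state, steps = frontier.pop(0)
        if goal <= state:
            return steps
        if state in seen:
            continue
        seen.add(state)
        for name, (pre, add, delete) in ACTIONS.items():
            if pre <= state:
                frontier.append((frozenset((state - delete) | add),
                                 steps + [name]))
    return None

print(plan({"free(robot)"}, {"open(door)"}))  # ['grasp(door)', 'open(door)']
```

With per-component actions (navigate, reach, grasp, pull) the same goal would need a longer plan and explicit intermediate states; the abstract action collapses them.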
- EAN: Event Adaptive Network for Enhanced Action Recognition [66.81780707955852]
We propose a unified action recognition framework to investigate the dynamic nature of video content.
First, when extracting local cues, we generate dynamic-scale spatial-temporal kernels to adaptively fit the diverse events.
Second, to accurately aggregate these cues into a global video representation, we propose to mine the interactions among only a few selected foreground objects with a Transformer.
arXiv Detail & Related papers (2021-07-22T15:57:18Z)
- Articulated Object Interaction in Unknown Scenes with Whole-Body Mobile Manipulation [16.79185733369416]
We propose a two-stage architecture for autonomous interaction with large articulated objects in unknown environments.
The first stage uses a learned model to estimate the articulated model of a target object from an RGB-D input and predicts an action-conditional sequence of states for interaction.
The second stage comprises a whole-body motion controller to manipulate the object along the generated kinematic plan.
arXiv Detail & Related papers (2021-03-18T21:32:18Z)
- Meta-Reinforcement Learning for Adaptive Motor Control in Changing Robot Dynamics and Environments [3.5309638744466167]
This work developed a meta-learning approach that adapts the control policy on the fly to changing conditions for robust locomotion. The proposed method constantly updates the interaction model, samples feasible action sequences over the estimated state-action trajectories, and then applies the optimal actions to maximize the reward; a condensed sketch of this loop follows below.
arXiv Detail & Related papers (2021-01-19T12:57:12Z)
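A condensed sketch (toy 1D dynamics, reward, and learning rate are assumptions) of the loop described above: keep updating a model of the interaction, sample candidate action sequences, roll them out through the model, and apply the first action of the best sequence.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_step(x, u):                  # unknown environment (stand-in)
    return 0.9*x + 0.5*u + 0.01*rng.normal()

theta = np.array([1.0, 0.0])          # online model: x' = a*x + b*u

def model_step(x, u):
    return theta[0]*x + theta[1]*u

x, target, H, K = 1.0, 0.0, 5, 64     # state, goal, horizon, samples
for t in range(30):
    # 1) sample K feasible action sequences and score them with the model
    U = rng.uniform(-1, 1, size=(K, H))
    scores = []
    for k in range(K):
        xk, r = x, 0.0
        for u in U[k]:
            xk = model_step(xk, u)
            r -= (xk - target)**2     # reward: track the target state
        scores.append(r)
    u0 = U[int(np.argmax(scores))][0] # 2) apply the best first action
    x_next = true_step(x, u0)
    # 3) constantly update the interaction model from the new transition
    phi = np.array([x, u0])
    theta += 0.1 * (x_next - phi @ theta) * phi
    x = x_next
print("final state:", x)
```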
- Goal-Conditioned End-to-End Visuomotor Control for Versatile Skill Primitives [89.34229413345541]
We propose a conditioning scheme which avoids pitfalls by learning the controller and its conditioning in an end-to-end manner.
Our model predicts complex action sequences based directly on a dynamic image representation of the robot motion.
We report significant improvements in task success over representative MPC and IL baselines.
arXiv Detail & Related papers (2020-03-19T15:04:37Z)