Rodrigues Network for Learning Robot Actions
- URL: http://arxiv.org/abs/2506.02618v1
- Date: Tue, 03 Jun 2025 08:34:06 GMT
- Title: Rodrigues Network for Learning Robot Actions
- Authors: Jialiang Zhang, Haoran Geng, Yang You, Congyue Deng, Pieter Abbeel, Jitendra Malik, Leonidas Guibas
- Abstract summary: We propose the Neural Rodrigues Operator to inject kinematics-aware inductive bias into neural computation. We design the Rodrigues Network (RodriNet), a novel neural architecture specialized for processing actions. Our results suggest that integrating structured kinematic priors into the network architecture improves action learning in various domains.
- Score: 76.69283501115855
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Understanding and predicting articulated actions is important in robot learning. However, common architectures such as MLPs and Transformers lack inductive biases that reflect the underlying kinematic structure of articulated systems. To this end, we propose the Neural Rodrigues Operator, a learnable generalization of the classical forward kinematics operation, designed to inject kinematics-aware inductive bias into neural computation. Building on this operator, we design the Rodrigues Network (RodriNet), a novel neural architecture specialized for processing actions. We evaluate the expressivity of our network on two synthetic tasks on kinematic and motion prediction, showing significant improvements compared to standard backbones. We further demonstrate its effectiveness in two realistic applications: (i) imitation learning on robotic benchmarks with the Diffusion Policy, and (ii) single-image 3D hand reconstruction. Our results suggest that integrating structured kinematic priors into the network architecture improves action learning in various domains.
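For orientation, the classical operation that the abstract says the Neural Rodrigues Operator generalizes can be sketched as follows. This is a minimal illustration of the textbook Rodrigues rotation formula, R = I + sin(θ) K + (1 − cos(θ)) K², chained into forward kinematics for a serial arm. It is not the paper's learnable operator, whose details are not given in this listing. The function names (`rodrigues`, `forward_kinematics`) and the joint convention (rotate about the joint axis, then translate by a link offset expressed in the rotated frame) are assumptions made for the example.

```python
import math

def skew(axis):
    """Skew-symmetric cross-product matrix K of a 3-vector."""
    x, y, z = axis
    return [[0.0, -z, y],
            [z, 0.0, -x],
            [-y, x, 0.0]]

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(rot, v):
    """Apply a 3x3 rotation matrix to a 3-vector."""
    return [sum(rot[i][j] * v[j] for j in range(3)) for i in range(3)]

def rodrigues(axis, theta):
    """Rotation matrix for angle theta (radians) about the given axis."""
    norm = math.sqrt(sum(c * c for c in axis))
    k = skew([c / norm for c in axis])  # normalize so K uses a unit axis
    k2 = matmul(k, k)
    s, c = math.sin(theta), 1.0 - math.cos(theta)
    return [[(1.0 if i == j else 0.0) + s * k[i][j] + c * k2[i][j]
             for j in range(3)] for i in range(3)]

def forward_kinematics(joints, thetas):
    """Return the position of each link end for a serial chain.

    joints: list of (axis, offset) pairs (hypothetical convention):
    each joint rotates about `axis`, then the link extends by `offset`
    in the rotated frame.
    """
    rot = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    pos = [0.0, 0.0, 0.0]
    positions = []
    for (axis, offset), theta in zip(joints, thetas):
        rot = matmul(rot, rodrigues(axis, theta))  # accumulate orientation
        step = apply(rot, offset)                  # link vector in base frame
        pos = [p + d for p, d in zip(pos, step)]
        positions.append(pos)
    return positions

# Two unit-length links rotating about the z axis (a planar arm).
arm = [([0.0, 0.0, 1.0], [1.0, 0.0, 0.0]),
       ([0.0, 0.0, 1.0], [1.0, 0.0, 0.0])]
link_ends = forward_kinematics(arm, [math.pi / 2, -math.pi / 2])
```

With these angles the first link points along +y and the second joint rotates back by the same amount, so the chain ends near (1, 1, 0). A learnable variant would presumably replace the fixed coefficients in this computation with trained parameters, but that reading is an inference from the abstract rather than something stated in this listing.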
Related papers
- Spatial-Temporal Graph Diffusion Policy with Kinematic Modeling for Bimanual Robotic Manipulation [88.83749146867665]
Existing approaches learn a policy to predict a distant next-best end-effector pose. They then compute the corresponding joint rotation angles for motion using inverse kinematics. We propose Kinematics enhanced Spatial-TemporAl gRaph diffuser.
arXiv Detail & Related papers (2025-03-13T17:48:35Z)
- Deconstructing Recurrence, Attention, and Gating: Investigating the transferability of Transformers and Gated Recurrent Neural Networks in forecasting of dynamical systems [0.0]
We decompose the key architectural components of the most powerful neural architectures, namely gating and recurrence in RNNs, and attention mechanisms in transformers.
A key finding is that neural gating and attention improve the accuracy of all standard RNNs in most tasks, while the addition of a notion of recurrence in transformers is detrimental.
arXiv Detail & Related papers (2024-10-03T16:41:51Z)
- Body Transformer: Leveraging Robot Embodiment for Policy Learning [51.531793239586165]
Body Transformer (BoT) is an architecture that leverages the robot embodiment by providing an inductive bias that guides the learning process.
We represent the robot body as a graph of sensors and actuators, and rely on masked attention to pool information throughout the architecture.
The resulting architecture outperforms the vanilla transformer, as well as the classical multilayer perceptron, in terms of task completion, scaling properties, and computational efficiency.
arXiv Detail & Related papers (2024-08-12T17:31:28Z)
- Bidirectional Progressive Neural Networks with Episodic Return Progress for Emergent Task Sequencing and Robotic Skill Transfer [1.7205106391379026]
We introduce a novel multi-task reinforcement learning framework named Episodic Return Progress with Bidirectional Progressive Neural Networks (ERP-BPNN).
The proposed ERP-BPNN model learns in a human-like interleaved manner by (2) autonomous task switching based on a novel intrinsic motivation signal.
We show that ERP-BPNN achieves faster cumulative convergence and improves performance in all metrics considered among morphologically different robots compared to the baselines.
arXiv Detail & Related papers (2024-03-06T19:17:49Z)
- Mechanistic Neural Networks for Scientific Machine Learning [58.99592521721158]
We present Mechanistic Neural Networks, a neural network design for machine learning applications in the sciences.
It incorporates a new Mechanistic Block in standard architectures to explicitly learn governing differential equations as representations.
Central to our approach is a novel Relaxed Linear Programming solver (NeuRLP) inspired by a technique that reduces solving linear ODEs to solving linear programs.
arXiv Detail & Related papers (2024-02-20T15:23:24Z)
- Convolution, aggregation and attention based deep neural networks for accelerating simulations in mechanics [1.0154623955833253]
We demonstrate three types of neural network architectures for efficient learning of deformations of solid bodies.
The first two are based on the recently proposed CNN U-NET and MAgNET frameworks which have shown promising performance for learning on mesh-based data.
The third architecture is Perceiver IO, a very recent architecture that belongs to the family of attention-based neural networks.
arXiv Detail & Related papers (2022-12-01T13:10:56Z)
- PACT: Perception-Action Causal Transformer for Autoregressive Robotics Pre-Training [25.50131893785007]
This work introduces a paradigm for pre-training a general purpose representation that can serve as a starting point for multiple tasks on a given robot.
We present the Perception-Action Causal Transformer (PACT), a generative transformer-based architecture that aims to build representations directly from robot data in a self-supervised fashion.
We show that finetuning small task-specific networks on top of the larger pretrained model results in significantly better performance compared to training a single model from scratch for all tasks simultaneously.
arXiv Detail & Related papers (2022-09-22T16:20:17Z)
- Graph Neural Networks for Relational Inductive Bias in Vision-based Deep Reinforcement Learning of Robot Control [0.0]
This work introduces a neural network architecture that combines relational inductive bias and visual feedback to learn an efficient position control policy.
We derive a graph representation that models the robot's internal state with a low-dimensional description of the visual scene generated by an image encoding network.
We show the ability of the model to improve sample efficiency for a 6-DoF robot arm in a visually realistic 3D environment.
arXiv Detail & Related papers (2022-03-11T15:11:54Z)
- Visionary: Vision architecture discovery for robot learning [58.67846907923373]
We propose a vision-based architecture search algorithm for robot manipulation learning, which discovers interactions between low-dimensional action inputs and high-dimensional visual inputs.
Our approach automatically designs architectures while training on the task - discovering novel ways of combining and attending image feature representations with actions as well as features from previous layers.
arXiv Detail & Related papers (2021-03-26T17:51:43Z)
- A Semi-Supervised Assessor of Neural Architectures [157.76189339451565]
We employ an auto-encoder to discover meaningful representations of neural architectures.
A graph convolutional neural network is introduced to predict the performance of architectures.
arXiv Detail & Related papers (2020-05-14T09:02:33Z)