Transform2Act: Learning a Transform-and-Control Policy for Efficient
Agent Design
- URL: http://arxiv.org/abs/2110.03659v1
- Date: Thu, 7 Oct 2021 17:51:05 GMT
- Title: Transform2Act: Learning a Transform-and-Control Policy for Efficient
Agent Design
- Authors: Ye Yuan, Yuda Song, Zhengyi Luo, Wen Sun, Kris Kitani
- Abstract summary: An agent's functionality is largely determined by its design, i.e., skeletal structure and joint attributes.
Finding the optimal agent design for a given function is extremely challenging since the problem is inherently combinatorial and the design space is prohibitively large.
To tackle these problems, our key idea is to incorporate the design procedure of an agent into its decision-making process.
- Score: 31.33251581287337
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: An agent's functionality is largely determined by its design, i.e., skeletal
structure and joint attributes (e.g., length, size, strength). However, finding
the optimal agent design for a given function is extremely challenging since
the problem is inherently combinatorial and the design space is prohibitively
large. Additionally, it can be costly to evaluate each candidate design which
requires solving for its optimal controller. To tackle these problems, our key
idea is to incorporate the design procedure of an agent into its
decision-making process. Specifically, we learn a conditional policy that, in
an episode, first applies a sequence of transform actions to modify an agent's
skeletal structure and joint attributes, and then applies control actions under
the new design. To handle a variable number of joints across designs, we use a
graph-based policy where each graph node represents a joint and uses message
passing with its neighbors to output joint-specific actions. Using policy
gradient methods, our approach enables first-order optimization of agent design
and control as well as experience sharing across different designs, which
improves sample efficiency tremendously. Experiments show that our approach,
Transform2Act, outperforms prior methods significantly in terms of convergence
speed and final performance. Notably, Transform2Act can automatically discover
plausible designs similar to giraffes, squids, and spiders. Our project website
is at https://sites.google.com/view/transform2act.
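The graph-based policy described in the abstract can be sketched in a few lines: each joint is a graph node, one round of message passing aggregates features from neighboring joints, and a shared head then outputs a joint-specific action per node, so the same weights handle any number of joints. This is an illustrative sketch only; the function names, dimensions, and single-round update are assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def message_passing_policy(node_feats, edges, W_msg, W_out):
    """Hypothetical graph policy: node_feats is (num_joints, d),
    edges is a list of (i, j) joint pairs in the skeleton."""
    agg = np.zeros_like(node_feats)
    for i, j in edges:                 # symmetric message passing
        agg[i] += node_feats[j]
        agg[j] += node_feats[i]
    hidden = np.tanh((node_feats + agg) @ W_msg)  # update each node
    return hidden @ W_out              # one action vector per joint

# Toy agent: 3 joints in a chain (0-1-2), 4-d features, 2-d actions.
feats = rng.standard_normal((3, 4))
W_msg = rng.standard_normal((4, 4))
W_out = rng.standard_normal((4, 2))
actions = message_passing_policy(feats, [(0, 1), (1, 2)], W_msg, W_out)
print(actions.shape)  # (3, 2): a 2-d action for every joint
```

Because the weights are shared across nodes, the same policy applies after a transform action adds or removes joints, which is what lets experience be shared across designs.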
Related papers
- PRANCE: Joint Token-Optimization and Structural Channel-Pruning for Adaptive ViT Inference [44.77064952091458]
PRANCE is a Vision Transformer compression framework that jointly optimizes the activated channels and reduces tokens, based on the characteristics of inputs.
We introduce a novel "Result-to-Go" training mechanism that models ViTs' inference process as a sequential decision process.
Our framework is shown to be compatible with various token optimization techniques such as pruning, merging, and pruning-merging strategies.
arXiv Detail & Related papers (2024-07-06T09:04:27Z)
- Constrained Layout Generation with Factor Graphs [21.07236104467961]
We introduce a factor graph based approach with four latent variable nodes for each room, and a factor node for each constraint.
The factor nodes represent dependencies among the variables to which they are connected, effectively capturing constraints that are potentially of a higher order.
Our approach is simple and generates layouts faithful to the user requirements, demonstrated by a large improvement in IOU scores over existing methods.
arXiv Detail & Related papers (2024-03-30T14:58:40Z)
- EulerFormer: Sequential User Behavior Modeling with Complex Vector Attention [88.45459681677369]
We propose a novel transformer variant with complex vector attention, named EulerFormer.
It provides a unified theoretical framework to formulate both semantic difference and positional difference.
It is more robust to semantic variations and possesses superior theoretical properties in principle.
arXiv Detail & Related papers (2024-03-26T14:18:43Z)
- Compositional Generative Inverse Design [69.22782875567547]
Inverse design, where we seek to design input variables in order to optimize an underlying objective function, is an important problem.
We show that by instead optimizing over the learned energy function captured by the diffusion model, we can avoid such adversarial examples.
In an N-body interaction task and a challenging 2D multi-airfoil design task, we demonstrate that by composing the learned diffusion model at test time, our method allows us to design initial states and boundary shapes.
arXiv Detail & Related papers (2024-01-24T01:33:39Z)
- SGTR+: End-to-end Scene Graph Generation with Transformer [42.396971149458324]
Scene Graph Generation (SGG) remains a challenging visual understanding task due to its compositional property.
Most previous works adopt a bottom-up, two-stage or point-based, one-stage approach, which often suffers from high time complexity or suboptimal designs.
We propose a novel SGG method to address the aforementioned issues, formulating the task as a bipartite graph construction problem.
arXiv Detail & Related papers (2024-01-23T15:18:20Z)
- Efficient Automatic Machine Learning via Design Graphs [72.85976749396745]
We propose FALCON, an efficient sample-based method to search for the optimal model design.
FALCON features 1) a task-agnostic module, which performs message passing on the design graph via a Graph Neural Network (GNN), and 2) a task-specific module, which conducts label propagation of the known model performance information.
We empirically show that FALCON can efficiently obtain the well-performing designs for each task using only 30 explored nodes.
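FALCON's task-specific module spreads the known performance of a few explored designs across the design graph so that promising unexplored candidates can be identified cheaply. A minimal label-propagation sketch, with a hypothetical graph, scores, and averaging rule (not the paper's actual code):

```python
def propagate(neighbors, known, steps=10):
    """neighbors: dict node -> list of adjacent design nodes;
    known: dict node -> measured performance of explored designs."""
    est = dict(known)
    for _ in range(steps):
        new = dict(est)
        for node, nbrs in neighbors.items():
            if node in known:          # observed scores stay fixed
                continue
            vals = [est[n] for n in nbrs if n in est]
            if vals:
                new[node] = sum(vals) / len(vals)
        est = new
    return est

# Tiny chain of 5 candidate designs; only the endpoints were evaluated.
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
scores = propagate(graph, {0: 0.2, 4: 1.0})
best = max((n for n in graph if n not in {0, 4}), key=scores.get)
print(best)  # 3: the unexplored design adjacent to the best known one
```

The propagated estimates rank unexplored designs by proximity to good known ones, which is the intuition behind exploring only a handful of nodes.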
arXiv Detail & Related papers (2022-10-21T21:25:59Z)
- Meta Reinforcement Learning for Optimal Design of Legged Robots [9.054187238463212]
We present a design optimization framework using model-free meta reinforcement learning.
We show that our approach allows higher performance while not being constrained by predefined motions or gait patterns.
arXiv Detail & Related papers (2022-10-06T08:37:52Z)
- An End-to-End Differentiable Framework for Contact-Aware Robot Design [37.715596272425316]
We build an end-to-end differentiable framework for contact-aware robot design.
A novel deformation-based parameterization allows for the design of articulated rigid robots with arbitrary, complex geometry.
A differentiable rigid body simulator can handle contact-rich scenarios and computes analytical gradients for a full spectrum of kinematic and dynamic parameters.
arXiv Detail & Related papers (2021-07-15T17:53:44Z)
- Analogous to Evolutionary Algorithm: Designing a Unified Sequence Model [58.17021225930069]
We explain the rationality of the Vision Transformer by analogy with the proven practical Evolutionary Algorithm (EA).
We propose a more efficient EAT model, and design task-related heads to deal with different tasks more flexibly.
Our approach achieves state-of-the-art results on the ImageNet classification task compared with recent vision transformer works.
arXiv Detail & Related papers (2021-05-31T16:20:03Z)
- A Flexible Framework for Designing Trainable Priors with Adaptive Smoothing and Game Encoding [57.1077544780653]
We introduce a general framework for designing and training neural network layers whose forward passes can be interpreted as solving non-smooth convex optimization problems.
We focus on convex games, solved by local agents represented by the nodes of a graph and interacting through regularization functions.
This approach is appealing for solving imaging problems, as it allows the use of classical image priors within deep models that are trainable end to end.
arXiv Detail & Related papers (2020-06-26T08:34:54Z)
- Joint Multi-Dimension Pruning via Numerical Gradient Update [120.59697866489668]
We present joint multi-dimension pruning (abbreviated as JointPruning), an effective method of pruning a network on three crucial aspects: spatial, depth and channel simultaneously.
We show that our method is optimized collaboratively across the three dimensions in a single end-to-end training and it is more efficient than the previous exhaustive methods.
arXiv Detail & Related papers (2020-05-18T17:57:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.