MORPH: Design Co-optimization with Reinforcement Learning via a
Differentiable Hardware Model Proxy
- URL: http://arxiv.org/abs/2309.17227v1
- Date: Fri, 29 Sep 2023 13:25:45 GMT
- Title: MORPH: Design Co-optimization with Reinforcement Learning via a
Differentiable Hardware Model Proxy
- Authors: Zhanpeng He and Matei Ciocarlie
- Abstract summary: We introduce MORPH, a method for co-optimization of hardware design parameters and control policies in simulation using reinforcement learning.
We demonstrate our approach on simulated 2D reaching and 3D multi-fingered manipulation tasks.
- Score: 3.4265828682659705
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We introduce MORPH, a method for co-optimization of hardware design
parameters and control policies in simulation using reinforcement learning.
Like most co-optimization methods, MORPH relies on a model of the hardware
being optimized, usually simulated based on the laws of physics. However, such
a model is often difficult to integrate into an effective optimization routine.
To address this, we introduce a proxy hardware model, which is always
differentiable and enables efficient co-optimization alongside a long-horizon
control policy using RL. MORPH is designed to ensure that the optimized
hardware proxy remains as close as possible to its realistic counterpart, while
still enabling task completion. We demonstrate our approach on simulated 2D
reaching and 3D multi-fingered manipulation tasks.
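The core idea in the abstract (a differentiable hardware proxy optimized jointly with a policy, with a penalty keeping the proxy close to its realistic counterpart) can be illustrated with a toy 2D-reaching sketch. This is a hypothetical illustration, not the authors' code: the single link length `L` stands in for the hardware design parameters, a single joint angle `theta` stands in for the control policy, and a quadratic proximity term plays the role of keeping the optimized proxy near the realistic value `L_real`.

```python
# Toy sketch of MORPH-style co-optimization (illustrative assumption, not the paper's method).
# Hardware proxy parameter: link length L; "policy": a single joint angle theta.
# Loss = task term (distance of end effector to target) + proximity term (L vs. L_real).
import math

def end_effector(L, theta):
    # Forward kinematics of a one-link planar arm.
    return (L * math.cos(theta), L * math.sin(theta))

def loss(L, theta, target, L_real, lam=1.0):
    x, y = end_effector(L, theta)
    task = (x - target[0]) ** 2 + (y - target[1]) ** 2
    proximity = lam * (L - L_real) ** 2  # keep the proxy close to the realistic hardware
    return task + proximity

def co_optimize(target, L_real, steps=2000, lr=0.05, eps=1e-5):
    L, theta = 1.0, 0.0
    for _ in range(steps):
        # Central finite differences stand in for autodiff through the differentiable proxy.
        gL = (loss(L + eps, theta, target, L_real) - loss(L - eps, theta, target, L_real)) / (2 * eps)
        gt = (loss(L, theta + eps, target, L_real) - loss(L, theta - eps, target, L_real)) / (2 * eps)
        L, theta = L - lr * gL, theta - lr * gt
    return L, theta
```

With a target at (0, 1) and `L_real = 1.0`, joint gradient descent drives the angle toward pi/2 while the proximity term holds the link length near 1. The actual method optimizes a learned proxy model alongside an RL policy; this sketch only shows the joint-gradient structure.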
Related papers
- Decoding-Time Language Model Alignment with Multiple Objectives [88.64776769490732]
Existing methods primarily focus on optimizing LMs for a single reward function, limiting their adaptability to varied objectives.
Here, we propose multi-objective decoding (MOD), a decoding-time algorithm that outputs the next token from a linear combination of predictions.
We show why existing approaches can be sub-optimal even in natural settings and obtain optimality guarantees for our method.
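The combination step described above can be sketched in a few lines. This is a hedged illustration under assumptions (per-token score vectors as plain lists, greedy selection); the paper's actual weighting scheme and guarantees are more involved.

```python
# Hypothetical sketch of decoding-time multi-objective combination (MOD-like).
# Each objective-specific model contributes a next-token score vector; the next
# token is taken from a weighted linear combination of those predictions.
def combine_predictions(score_vectors, weights):
    # score_vectors: one list of per-token scores per objective-specific model
    vocab = len(score_vectors[0])
    return [sum(w * sv[i] for w, sv in zip(weights, score_vectors)) for i in range(vocab)]

def next_token(score_vectors, weights):
    combined = combine_predictions(score_vectors, weights)
    return max(range(len(combined)), key=combined.__getitem__)  # greedy selection
```

For example, equal weights over two models scoring a 3-token vocabulary, `next_token([[1, 0, 0], [0, 2, 0]], [0.5, 0.5])`, picks the token the second model scores highest.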
arXiv Detail & Related papers (2024-06-27T02:46:30Z) - Two Optimizers Are Better Than One: LLM Catalyst Empowers Gradient-Based Optimization for Prompt Tuning [69.95292905263393]
We show that gradient-based optimization and large language models (LLMs) are complementary to each other, suggesting a collaborative optimization approach.
Our code is released at https://www.guozix.com/guozix/LLM-catalyst.
arXiv Detail & Related papers (2024-05-30T06:24:14Z) - Track Everything Everywhere Fast and Robustly [46.362962852140015]
We propose a novel test-time optimization approach for efficiently tracking any pixel in a video.
We introduce a novel invertible deformation network, CaDeX++, which factorizes the function representation into a local spatial-temporal feature grid.
Our experiments demonstrate a substantial improvement in training speed (more than 10 times faster), robustness, and accuracy in tracking over the SoTA optimization-based method OmniMotion.
arXiv Detail & Related papers (2024-03-26T17:58:22Z) - Efficient Inverse Design Optimization through Multi-fidelity Simulations, Machine Learning, and Search Space Reduction Strategies [0.8646443773218541]
This paper introduces a methodology designed to augment the inverse design optimization process in scenarios constrained by limited compute.
The proposed methodology is analyzed on two distinct engineering inverse design problems: airfoil inverse design and the scalar field reconstruction problem.
Notably, this method is adaptable to any inverse design application, facilitating a synergy between a representative low-fidelity ML model and a high-fidelity simulation, and can be seamlessly applied with a variety of population-based optimization algorithms.
arXiv Detail & Related papers (2023-12-06T18:20:46Z) - Agent-based Collaborative Random Search for Hyper-parameter Tuning and
Global Function Optimization [0.0]
This paper proposes an agent-based collaborative technique for finding near-optimal values for any arbitrary set of hyperparameters in a machine learning model.
The behavior of the presented model, specifically against the changes in its design parameters, is investigated in both machine learning and global function optimization applications.
arXiv Detail & Related papers (2023-03-03T21:10:17Z) - Slapo: A Schedule Language for Progressive Optimization of Large Deep
Learning Model Training [17.556432199389615]
Slapo is a schedule language that decouples the execution of a tensor-level operator from its arithmetic definition.
We show that Slapo can improve training throughput by up to 2.92x on a single machine with 8 NVIDIA V100 GPUs.
arXiv Detail & Related papers (2023-02-16T00:34:53Z) - An Empirical Evaluation of Zeroth-Order Optimization Methods on
AI-driven Molecule Optimization [78.36413169647408]
We study the effectiveness of various ZO optimization methods for optimizing molecular objectives.
We show the advantages of ZO sign-based gradient descent (ZO-signGD).
We demonstrate the potential effectiveness of ZO optimization methods on widely used benchmark tasks from the Guacamol suite.
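ZO-signGD, as named above, estimates gradients from function evaluations alone and updates using only the sign of each estimated component. A minimal sketch, assuming a coordinate-wise two-point estimator and a fixed step size (both illustrative choices, not necessarily the paper's setup):

```python
# Hedged sketch of zeroth-order sign-based gradient descent (ZO-signGD).
# The objective f is treated as a black box: no autodiff, only evaluations.
def zo_signgd(f, x, steps=200, mu=1e-3, lr=0.1):
    x = list(x)
    for _ in range(steps):
        g = []
        for i in range(len(x)):
            xp, xm = x[:], x[:]
            xp[i] += mu
            xm[i] -= mu
            g.append((f(xp) - f(xm)) / (2 * mu))  # two-point estimate of df/dx_i
        # Update with only the sign of each estimated gradient component.
        x = [xi - lr * (1 if gi > 0 else -1 if gi < 0 else 0) for xi, gi in zip(x, g)]
    return x
```

On a simple quadratic such as `f(v) = sum(t*t for t in v)`, the iterate walks toward the minimum in fixed-size steps and then oscillates within one step size of it, which is the characteristic behavior of sign-based updates with a constant learning rate.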
arXiv Detail & Related papers (2022-10-27T01:58:10Z) - Meta Reinforcement Learning for Optimal Design of Legged Robots [9.054187238463212]
We present a design optimization framework using model-free meta reinforcement learning.
We show that our approach allows higher performance while not being constrained by predefined motions or gait patterns.
arXiv Detail & Related papers (2022-10-06T08:37:52Z) - Accelerated Federated Learning with Decoupled Adaptive Optimization [53.230515878096426]
The federated learning (FL) framework enables clients to collaboratively learn a shared model while keeping the privacy of training data on clients.
Recently, many efforts have been made to generalize centralized adaptive optimization methods, such as SGDM, Adam, AdaGrad, etc., to federated settings.
This work aims to develop novel adaptive optimization methods for FL from the perspective of the dynamics of ordinary differential equations (ODEs).
arXiv Detail & Related papers (2022-07-14T22:46:43Z) - Efficient Differentiable Simulation of Articulated Bodies [89.64118042429287]
We present a method for efficient differentiable simulation of articulated bodies.
This enables integration of articulated body dynamics into deep learning frameworks.
We show that reinforcement learning with articulated systems can be accelerated using gradients provided by our method.
arXiv Detail & Related papers (2021-09-16T04:48:13Z) - Optimization-Inspired Learning with Architecture Augmentations and
Control Mechanisms for Low-Level Vision [74.9260745577362]
This paper proposes a unified optimization-inspired learning framework to aggregate Generative, Discriminative, and Corrective (GDC) principles.
We construct three propagative modules to effectively solve the optimization models with flexible combinations.
Experiments across varied low-level vision tasks validate the efficacy and adaptability of GDC.
arXiv Detail & Related papers (2020-12-10T03:24:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.