An Actor-Critic Method for Simulation-Based Optimization
- URL: http://arxiv.org/abs/2111.00435v1
- Date: Sun, 31 Oct 2021 09:04:23 GMT
- Title: An Actor-Critic Method for Simulation-Based Optimization
- Authors: Kuo Li, Qing-Shan Jia, Jiaqi Yan
- Abstract summary: We focus on a simulation-based optimization problem of choosing the best design from the feasible space.
We formulate the sampling process as a policy-searching problem and give a solution from the perspective of Reinforcement Learning (RL).
Some experiments are designed to validate the effectiveness of the proposed algorithms.
- Score: 6.261751912603047
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We focus on a simulation-based optimization problem of choosing the best
design from the feasible space. Although the simulation model can be queried
with finite samples, its internal processing rule cannot be utilized in the
optimization process. We formulate the sampling process as a policy searching
problem and give a solution from the perspective of Reinforcement Learning
(RL). Concretely, the Actor-Critic (AC) framework is applied, where the Critic
serves as a surrogate model to predict the performance on unknown designs,
whereas the Actor encodes the sampling policy to be optimized. We design the
updating rule and propose two algorithms for the cases where the feasible
spaces are continuous and discrete, respectively. Experiments are designed to
validate the effectiveness of the proposed algorithms, including two toy
examples, which intuitively illustrate the algorithms, and two more complex
tasks, namely an adversarial-attack task and an RL task, which validate the
effectiveness on large-scale problems. The results show that the proposed
algorithms handle these problems successfully. Notably, in the RL task, our
methods offer a new perspective on robot control by treating the task as a
simulation model and solving it by optimizing the policy-generating process,
whereas existing works commonly optimize the policy itself directly.
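A minimal sketch of the sampling loop described above, for a discrete feasible space (all names, hyperparameters, and the toy simulation model are illustrative assumptions, not the paper's implementation): a softmax distribution over designs plays the role of the Actor, and a running-mean performance estimate plays the role of the Critic surrogate.

```python
import numpy as np

rng = np.random.default_rng(0)
N_DESIGNS = 10  # discrete feasible space {0, ..., 9}

def simulate(design):
    """Black-box simulation model (assumed toy): noisy performance, best at 7."""
    return -(design - 7) ** 2 + rng.normal(scale=0.1)

critic = np.zeros(N_DESIGNS)   # surrogate: running-mean performance estimate
counts = np.zeros(N_DESIGNS)
logits = np.zeros(N_DESIGNS)   # actor: parameters of a softmax sampling policy

for _ in range(3000):
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    d = rng.choice(N_DESIGNS, p=probs)   # sample a design from the policy
    y = simulate(d)                      # query the simulation model

    counts[d] += 1
    critic[d] += (y - critic[d]) / counts[d]  # critic update (running mean)

    # Actor update: score-function (REINFORCE-style) step, using the
    # critic-weighted policy value as a baseline.
    advantage = y - probs @ critic
    grad = -probs
    grad[d] += 1.0
    logits += 0.05 * advantage * grad

best_design = int(np.argmax(probs))
```

The sampling policy concentrates on the best design because sampled designs with below-baseline performance have their logits pushed down, while above-baseline designs are reinforced.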
Related papers
- Primitive Agentic First-Order Optimization [0.0]
This work presents a proof-of-concept study combining primitive state representations and agent-environment interactions as first-order reinforcement learning.
The results show that elementary RL methods combined with succinct partial state representations can be used as optimizers to manage complexity in RL-based optimization.
arXiv Detail & Related papers (2024-06-07T11:13:38Z)
- Model Uncertainty in Evolutionary Optimization and Bayesian Optimization: A Comparative Analysis [5.6787965501364335]
Black-box optimization problems are common in many real-world applications.
These problems require optimization through input-output interactions without access to internal workings.
Two widely used gradient-free optimization techniques, evolutionary optimization and Bayesian optimization, are employed to address such challenges.
This paper aims to elucidate the similarities and differences in the utilization of model uncertainty between these two methods.
arXiv Detail & Related papers (2024-03-21T13:59:19Z)
- Analyzing and Enhancing the Backward-Pass Convergence of Unrolled Optimization [50.38518771642365]
The integration of constrained optimization models as components in deep networks has led to promising advances on many specialized learning tasks.
A central challenge in this setting is backpropagation through the solution of an optimization problem, which often lacks a closed form.
This paper provides theoretical insights into the backward pass of unrolled optimization, showing that it is equivalent to the solution of a linear system by a particular iterative method.
A system called Folded Optimization is proposed to construct more efficient backpropagation rules from unrolled solver implementations.
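The equivalence between the backward pass of unrolling and the solution of a linear system can be checked on a small least-squares problem. The sketch below (assumed names and problem data, numpy only) compares the Jacobian obtained from one linear solve, via the implicit function theorem, against finite differences through an unrolled gradient-descent solver:

```python
import numpy as np

# Small least-squares problem: f(x) = 0.5 * ||A x - b||^2.
# A is fixed and well conditioned so the unrolled solver converges quickly.
A = np.array([[2.0, 0.0, 0.0],
              [0.0, 1.5, 0.0],
              [0.0, 0.0, 1.0],
              [0.5, 0.2, 0.1],
              [0.0, 0.3, 0.4],
              [0.1, 0.0, 0.2]])
b = np.array([1.0, -2.0, 0.5, 0.3, 0.0, 1.2])

def solve_unrolled(b, steps=500, lr=0.2):
    """Unrolled solver: plain gradient descent on f."""
    x = np.zeros(A.shape[1])
    for _ in range(steps):
        x -= lr * A.T @ (A @ x - b)
    return x

# Implicit view of the backward pass: at the optimum A^T (A x* - b) = 0,
# so the Jacobian dx*/db solves the linear system (A^T A) J = A^T.
J_implicit = np.linalg.solve(A.T @ A, A.T)

# Finite differences through the unrolled solver, for comparison.
eps = 1e-6
J_unrolled = np.stack(
    [(solve_unrolled(b + eps * e) - solve_unrolled(b - eps * e)) / (2 * eps)
     for e in np.eye(len(b))],
    axis=1)

assert np.allclose(J_implicit, J_unrolled, atol=1e-6)
```

One linear solve replaces differentiation through all solver iterations, which is the efficiency gain such backpropagation rules aim for.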
arXiv Detail & Related papers (2023-12-28T23:15:18Z)
- Efficient Inverse Design Optimization through Multi-fidelity Simulations, Machine Learning, and Search Space Reduction Strategies [0.8646443773218541]
This paper introduces a methodology designed to augment the inverse design optimization process in scenarios constrained by limited compute.
The proposed methodology is analyzed on two distinct engineering inverse design problems: airfoil inverse design and the scalar field reconstruction problem.
Notably, this method is adaptable to any inverse design application, facilitating a synergy between a representative low-fidelity ML model and a high-fidelity simulation, and can be seamlessly applied across any variety of population-based optimization algorithms.
arXiv Detail & Related papers (2023-12-06T18:20:46Z)
- DADO -- Low-Cost Query Strategies for Deep Active Design Optimization [1.6298921134113031]
We present two selection strategies for self-optimization to reduce the computational cost in multi-objective design optimization problems.
We evaluate our strategies on a large dataset from the domain of fluid dynamics and introduce two new evaluation metrics to determine the model's performance.
arXiv Detail & Related papers (2023-07-10T13:01:27Z)
- Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL [106.82295532402335]
Existing reinforcement learning algorithms suffer from computational intractability, strong statistical assumptions, and suboptimal sample complexity.
We provide the first computationally efficient algorithm that attains rate-optimal sample complexity with respect to the desired accuracy level.
Our algorithm, MusIK, combines systematic exploration with representation learning based on multi-step inverse kinematics.
arXiv Detail & Related papers (2023-04-12T14:51:47Z)
- Backpropagation of Unrolled Solvers with Folded Optimization [55.04219793298687]
The integration of constrained optimization models as components in deep networks has led to promising advances on many specialized learning tasks.
One typical strategy is algorithm unrolling, which relies on automatic differentiation through the operations of an iterative solver.
This paper provides theoretical insights into the backward pass of unrolled optimization, leading to a system for generating efficiently solvable analytical models of backpropagation.
arXiv Detail & Related papers (2023-01-28T01:50:42Z)
- Multi-Objective Policy Gradients with Topological Constraints [108.10241442630289]
We present a new policy gradient algorithm for TMDPs, obtained by a simple extension of the proximal policy optimization (PPO) algorithm.
We demonstrate this on a real-world multiple-objective navigation problem with an arbitrary ordering of objectives both in simulation and on a real robot.
arXiv Detail & Related papers (2022-09-15T07:22:58Z)
- Optimizing Sequential Experimental Design with Deep Reinforcement Learning [7.589363597086081]
We show that the problem of optimizing policies can be reduced to solving a Markov decision process (MDP).
Our approach is also computationally efficient at deployment time and exhibits state-of-the-art performance on both continuous and discrete design spaces.
arXiv Detail & Related papers (2022-02-02T00:23:05Z)
- A Two-stage Framework and Reinforcement Learning-based Optimization Algorithms for Complex Scheduling Problems [54.61091936472494]
We develop a two-stage framework, in which reinforcement learning (RL) and traditional operations research (OR) algorithms are combined together.
The scheduling problem is solved in two stages, including a finite Markov decision process (MDP) and a mixed-integer programming process, respectively.
Results show that the proposed algorithms could stably and efficiently obtain satisfactory scheduling schemes for agile Earth observation satellite scheduling problems.
arXiv Detail & Related papers (2021-03-10T03:16:12Z)
- Adaptive Sampling for Best Policy Identification in Markov Decision Processes [79.4957965474334]
We investigate the problem of best-policy identification in discounted Markov Decision Processes (MDPs) when the learner has access to a generative model.
The advantages of state-of-the-art algorithms are discussed and illustrated.
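The generative-model setting can be illustrated on a tiny MDP: sample transitions from any state-action pair on demand, build an empirical model, and run value iteration on it to identify the best policy. The MDP, names, and sample counts below are illustrative assumptions, not the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy 2-state, 2-action MDP; P[s, a, s'] are transition probabilities.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.7, 0.3], [0.05, 0.95]]])
R = np.array([[0.0, 0.0], [0.0, 1.0]])  # reward R[s, a]
gamma = 0.9

def generative_model(s, a):
    """Sample a next state from any (state, action) pair on demand."""
    return rng.choice(2, p=P[s, a])

# Estimate the transition model from generative samples.
n_samples = 5000
P_hat = np.zeros_like(P)
for s in range(2):
    for a in range(2):
        for _ in range(n_samples):
            P_hat[s, a, generative_model(s, a)] += 1
        P_hat[s, a] /= n_samples

# Value iteration on the empirical model identifies the best policy.
Q = np.zeros((2, 2))
for _ in range(200):
    V = Q.max(axis=1)
    Q = R + gamma * P_hat @ V

policy = Q.argmax(axis=1)
```

Here the rewarding state-action pair (s=1, a=1) is nearly absorbing, so the identified policy drives both states toward it; adaptive-sampling methods improve on this uniform per-pair sampling by allocating more samples where the policy decision is hardest.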
arXiv Detail & Related papers (2020-09-28T15:22:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.