Robust Policy Search for Robot Navigation with Stochastic Meta-Policies
- URL: http://arxiv.org/abs/2003.01000v1
- Date: Mon, 2 Mar 2020 16:30:59 GMT
- Title: Robust Policy Search for Robot Navigation with Stochastic Meta-Policies
- Authors: Javier Garcia-Barcos, Ruben Martinez-Cantin
- Abstract summary: In this work, we exploit the main ingredients of Bayesian optimization to provide robustness to different issues for policy search algorithms.
We combine several methods and show how their interaction works better than the sum of the parts.
We compare the proposed algorithm with previous results in several optimization benchmarks and robot tasks, such as pushing objects with a robot arm, or path finding with a rover.
- Score: 5.7871177330714145
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bayesian optimization is an efficient nonlinear optimization method where the
queries are carefully selected to gather information about the optimum
location. Thus, in the context of policy search, it has been called active
policy search. The main ingredients of Bayesian optimization for sample
efficiency are the probabilistic surrogate model and the optimal decision
heuristics. In this work, we exploit those to provide robustness to different
issues for policy search algorithms. We combine several methods and show how
their interaction works better than the sum of the parts. First, to deal with
input noise and provide a safe and repeatable policy we use an improved version
of unscented Bayesian optimization. Then, to deal with mismodeling errors and
improve exploration we use stochastic meta-policies for query selection and an
adaptive kernel. We compare the proposed algorithm with previous results in
several optimization benchmarks and robot tasks, such as pushing objects with a
robot arm, or path finding with a rover.
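The three ingredients can be pictured with a short sketch. The snippet below is a hedged illustration of the idea, not the authors' implementation: it assumes a scikit-learn Gaussian process as the probabilistic surrogate, averages expected improvement over sigma points as a crude stand-in for unscented Bayesian optimization, and samples the next query from a softmax over acquisition scores, which is the spirit of a stochastic meta-policy. Every function name and parameter value is an illustrative assumption.

```python
# Minimal sketch of the abstract's ingredients, not the authors' code.
# Assumptions: scikit-learn GP surrogate, EI averaged over sigma points as a
# crude stand-in for unscented BO, and a softmax sample over acquisition
# scores as the "stochastic meta-policy" (query is sampled, not argmax'd).
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern


def expected_improvement(mu, sigma, best):
    """Standard EI for minimization, evaluated elementwise."""
    sigma = np.maximum(sigma, 1e-9)
    z = (best - mu) / sigma
    return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)


def sigma_points(x, input_noise_std):
    """2d+1 symmetric points around x, approximating query input noise."""
    d = x.shape[0]
    offsets = np.vstack([np.zeros((1, d)), np.eye(d), -np.eye(d)])
    return x + input_noise_std * offsets


def propose_query(gp, candidates, y_best, input_noise_std, temperature, rng):
    scores = np.empty(len(candidates))
    for i, x in enumerate(candidates):
        mu, sd = gp.predict(sigma_points(x, input_noise_std), return_std=True)
        # Average EI over the sigma points: favors regions that stay good
        # when the executed policy parameters are perturbed.
        scores[i] = expected_improvement(mu, sd, y_best).mean()
    # Stochastic meta-policy: sample the next query from a softmax over
    # acquisition scores instead of committing to the single maximizer.
    p = np.exp((scores - scores.max()) / temperature)
    p /= p.sum()
    return candidates[rng.choice(len(candidates), p=p)]


# Toy usage: minimize a 2-D stand-in for the "cost of a policy".
rng = np.random.default_rng(0)
cost = lambda X: np.sin(3.0 * X[:, 0]) + 0.5 * X[:, 1] ** 2
X = rng.uniform(-1.0, 1.0, size=(5, 2))
y = cost(X)
for _ in range(20):
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(X, y)
    candidates = rng.uniform(-1.0, 1.0, size=(256, 2))
    x_next = propose_query(gp, candidates, y.min(), input_noise_std=0.05,
                           temperature=0.1, rng=rng)
    X = np.vstack([X, x_next])
    y = np.append(y, cost(x_next[None])[0])
print("best parameters:", X[y.argmin()], "estimated cost:", y.min())
```

Sampling the query rather than maximizing the acquisition keeps some exploration alive even when the surrogate is mismodeled, which is the robustness argument made in the abstract.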
Related papers
- Towards Efficient Exact Optimization of Language Model Alignment [93.39181634597877]
Direct preference optimization (DPO) was proposed to directly optimize the policy from preference data.
We show that DPO, derived from the optimal solution of the problem, leads in practice to a compromised mean-seeking approximation of that optimal solution.
We propose efficient exact optimization (EXO) of the alignment objective.
arXiv Detail & Related papers (2024-02-01T18:51:54Z)
- Acceleration in Policy Optimization [50.323182853069184]
We work towards a unifying paradigm for accelerating policy optimization methods in reinforcement learning (RL) by integrating foresight in the policy improvement step via optimistic and adaptive updates.
We define optimism as predictive modelling of the future behavior of a policy, and adaptivity as taking immediate and anticipatory corrective actions to mitigate errors from overshooting predictions or delayed responses to change.
We design an optimistic policy gradient algorithm, adaptive via meta-gradient learning, and empirically highlight several design choices pertaining to acceleration, in an illustrative task.
arXiv Detail & Related papers (2023-06-18T15:50:57Z)
- Wasserstein Gradient Flows for Optimizing Gaussian Mixture Policies [0.0]
Policy optimization is the de facto paradigm to adapt robot policies as a function of task-specific objectives.
We propose to leverage the structure of probabilistic policies by casting the policy optimization as an optimal transport problem.
We evaluate our approach on common robotic settings: reaching motions, collision-avoidance behaviors, and multi-goal tasks.
arXiv Detail & Related papers (2023-05-17T17:48:24Z)
- Efficient Non-Parametric Optimizer Search for Diverse Tasks [93.64739408827604]
We present the first efficient, scalable, and general framework that can directly search on the tasks of interest.
Inspired by the innate tree structure of the underlying math expressions, we re-arrange the spaces into a super-tree.
We adopt an adaptation of the Monte Carlo method to tree search, equipped with rejection sampling and equivalent-form detection.
arXiv Detail & Related papers (2022-09-27T17:51:31Z)
- Tensor Train for Global Optimization Problems in Robotics [6.702251803443858]
The convergence of many numerical optimization techniques is highly dependent on the initial guess given to the solver.
We propose a novel approach that utilizes tensor train methods to initialize existing optimization solvers near global optima.
We show that the proposed method can generate samples close to global optima and from multiple modes.
arXiv Detail & Related papers (2022-06-10T13:18:26Z)
- Dimensionality Reduction and Prioritized Exploration for Policy Search [29.310742141970394]
Black-box policy optimization is a class of reinforcement learning algorithms that explores and updates the policies at the parameter level.
We present a novel method to prioritize the exploration of effective parameters and cope with full covariance matrix updates.
Our algorithm learns faster than recent approaches and requires fewer samples to achieve state-of-the-art results.
arXiv Detail & Related papers (2022-03-09T15:17:09Z)
- Bayesian Optimization for auto-tuning GPU kernels [0.0]
Finding optimal parameter configurations for GPU kernels is a non-trivial exercise for large search spaces, even when automated.
We introduce a novel contextual exploration factor as well as new acquisition functions with improved scalability, combined with an informed function selection mechanism.
arXiv Detail & Related papers (2021-11-26T11:26:26Z)
- Understanding the Effect of Stochasticity in Policy Optimization [86.7574122154668]
We show that the preferability of optimization methods depends critically on whether exact gradients are used.
Second, to explain these findings we introduce the concept of committal rate for policy optimization.
Third, we show that in the absence of external oracle information, there is an inherent trade-off between exploiting geometry to accelerate convergence versus achieving optimality almost surely.
arXiv Detail & Related papers (2021-10-29T06:35:44Z)
- Local policy search with Bayesian optimization [73.0364959221845]
Reinforcement learning aims to find an optimal policy by interaction with an environment.
Policy gradients for local search are often obtained from random perturbations.
We develop an algorithm utilizing a probabilistic model of the objective function and its gradient.
arXiv Detail & Related papers (2021-06-22T16:07:02Z)
- Provably Efficient Exploration in Policy Optimization [117.09887790160406]
This paper proposes an Optimistic variant of the Proximal Policy Optimization algorithm (OPPO).
OPPO achieves $\tilde{O}(\sqrt{d^2 H^3 T})$ regret.
To the best of our knowledge, OPPO is the first provably efficient policy optimization algorithm that explores.
arXiv Detail & Related papers (2019-12-12T08:40:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.