Dimensionless Policies based on the Buckingham $\pi$ Theorem: Is This a
Good Way to Generalize Numerical Results?
- URL: http://arxiv.org/abs/2307.15852v2
- Date: Wed, 28 Feb 2024 21:52:19 GMT
- Title: Dimensionless Policies based on the Buckingham $\pi$ Theorem: Is This a
Good Way to Generalize Numerical Results?
- Authors: Alexandre Girard
- Abstract summary: This article explores the use of the Buckingham $\pi$ theorem as a tool to encode the control policies of physical systems into a generic form of knowledge.
We show, by restating the solution to a motion control problem using dimensionless variables, that (1) the policy mapping involves a reduced number of parameters and (2) control policies generated numerically for a specific system can be transferred exactly to a subset of dimensionally similar systems by scaling the input and output variables appropriately.
It remains to be seen how practical this approach can be to generalize policies for more complex high-dimensional problems, but the early results show that it is a promising transfer learning tool for numerical approaches like dynamic programming and reinforcement learning.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The answer to the question posed in the title is yes if the context (the list
of variables defining the motion control problem) is dimensionally similar.
This article explores the use of the Buckingham $\pi$ theorem as a tool to
encode the control policies of physical systems into a more generic form of
knowledge that can be reused in various situations. This approach can be
interpreted as enforcing invariance to the scaling of the fundamental units in
an algorithm learning a control policy. First, we show, by restating the
solution to a motion control problem using dimensionless variables, that (1)
the policy mapping involves a reduced number of parameters and (2) control
policies generated numerically for a specific system can be transferred exactly
to a subset of dimensionally similar systems by scaling the input and output
variables appropriately. Those two generic theoretical results are then
demonstrated, with numerically generated optimal controllers, for the classic
motion control problem of swinging up a torque-limited inverted pendulum and
positioning a vehicle in slippery conditions. We also discuss the concept of
regime, a region in the space of context variables, that can help to relax the
similarity condition. Furthermore, we discuss how applying dimensional scaling
of the input and output of a context-specific black-box policy is equivalent to
substituting new system parameters in an analytical equation under some
conditions, using a linear quadratic regulator (LQR) and a computed torque
controller as examples. It remains to be seen how practical this approach can
be to generalize policies for more complex high-dimensional problems, but the
early results show that it is a promising transfer learning tool for numerical
approaches like dynamic programming and reinforcement learning.
Related papers
- Synthesizing Interpretable Control Policies through Large Language Model Guided Search [7.706225175516503]
We represent control policies as programs in standard languages like Python.
We evaluate candidate controllers in simulation and evolve them using a pre-trained LLM.
We illustrate our method through its application to the synthesis of an interpretable control policy for the pendulum swing-up and the ball in cup tasks.
arXiv Detail & Related papers (2024-10-07T18:12:20Z)
- Neural Time-Reversed Generalized Riccati Equation [60.92253836775246]
Hamiltonian equations offer an interpretation of optimality through auxiliary variables known as costates.
This paper introduces a novel neural-based approach to optimal control, with the aim of working forward-in-time.
arXiv Detail & Related papers (2023-12-14T19:29:37Z)
- Conformal Policy Learning for Sensorimotor Control Under Distribution Shifts [61.929388479847525]
This paper focuses on the problem of detecting and reacting to changes in the distribution of a sensorimotor controller's observables.
The key idea is the design of switching policies that can take conformal quantiles as input.
We show how to design such policies by using conformal quantiles to switch between base policies with different characteristics.
arXiv Detail & Related papers (2023-11-02T17:59:30Z)
- A Physics-informed Deep Learning Approach for Minimum Effort Stochastic Control of Colloidal Self-Assembly [9.791617215182598]
The control objective is formulated in terms of steering the state PDFs from a prescribed initial probability measure towards a prescribed terminal probability measure with minimum control effort.
We derive the conditions of optimality for the associated optimal control problem.
The performance of the proposed solution is demonstrated via numerical simulations on a benchmark colloidal self-assembly problem.
arXiv Detail & Related papers (2022-08-19T07:01:57Z)
- A Recursive Partitioning Approach for Dynamic Discrete Choice Modeling in High Dimensional Settings [0.0]
Estimation of dynamic discrete choice models is often computationally intensive and/or infeasible in high-dimensional settings.
We present a semi-parametric formulation of dynamic discrete choice models that incorporates a high-dimensional set of state variables.
arXiv Detail & Related papers (2022-08-02T14:13:25Z)
- Deep Learning Approximation of Diffeomorphisms via Linear-Control Systems [91.3755431537592]
We consider a control system of the form $\dot{x} = \sum_{i=1}^{l} F_i(x)\,u_i$, with linear dependence in the controls.
We use the corresponding flow to approximate the action of a diffeomorphism on a compact ensemble of points.
arXiv Detail & Related papers (2021-10-24T08:57:46Z)
- Continuous-Time Fitted Value Iteration for Robust Policies [93.25997466553929]
Solving the Hamilton-Jacobi-Bellman equation is important in many domains including control, robotics and economics.
We propose continuous fitted value iteration (cFVI) and robust fitted value iteration (rFVI).
These algorithms leverage the non-linear control-affine dynamics and separable state and action reward of many continuous control problems.
arXiv Detail & Related papers (2021-10-05T11:33:37Z)
- Policy Optimization for Linear-Quadratic Zero-Sum Mean-Field Type Games [1.1852406625172216]
Zero-sum mean-field type games (ZSMFTG) with linear dynamics and quadratic utility are studied.
Two policy optimization methods that rely on policy gradient are proposed.
arXiv Detail & Related papers (2020-09-02T13:49:08Z)
- Sparse Identification of Nonlinear Dynamical Systems via Reweighted $\ell_1$-regularized Least Squares [62.997667081978825]
This work proposes an iterative sparse-regularized regression method to recover governing equations of nonlinear systems from noisy state measurements.
The aim of this work is to improve the accuracy and robustness of the method in the presence of state measurement noise.
arXiv Detail & Related papers (2020-05-27T08:30:15Z)
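The reweighted $\ell_1$ idea in the last entry can be sketched with an iteratively reweighted least-squares (IRLS) surrogate, a common stand-in for reweighted $\ell_1$ minimization; this is not the paper's algorithm, and the function name and the weight rule lam / (|xi| + eps) are assumptions for illustration:

```python
import numpy as np

def reweighted_sparse_regression(theta, y, lam=0.1, eps=1e-4, n_iter=10):
    """Recover a sparse coefficient vector xi with theta @ xi ~= y.

    Each pass solves a ridge problem whose per-coefficient penalty
    lam / (|xi| + eps) approximates an l1 penalty, so coefficients
    that stay small are shrunk progressively toward zero.
    """
    # ordinary least squares as the initial guess
    xi = np.linalg.lstsq(theta, y, rcond=None)[0]
    for _ in range(n_iter):
        w = lam / (np.abs(xi) + eps)      # large weight on small coeffs
        a = theta.T @ theta + np.diag(w)  # weighted ridge normal matrix
        xi = np.linalg.solve(a, theta.T @ y)
    return xi
```

For example, fitting noisy samples of y = 2 x^2 against the candidate library [x, x^2, x^3] recovers a coefficient near 2 on the quadratic term while driving the spurious terms toward zero.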
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.