Generalizable Physics-Informed Learning for Stochastic Safety-Critical Systems
- URL: http://arxiv.org/abs/2407.08868v4
- Date: Sun, 18 Aug 2024 20:37:23 GMT
- Title: Generalizable Physics-Informed Learning for Stochastic Safety-Critical Systems
- Authors: Zhuoyuan Wang, Albert Chern, Yorie Nakahira,
- Abstract summary: We propose an efficient method to evaluate long-term risk probabilities and their gradients using short-term samples without sufficient risk events.
We show in simulation that the proposed technique has improved sample efficiency, generalizes well to unseen regions, and adapts to changing system parameters.
- Score: 8.277567852741244
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate estimate of long-term risk is critical for safe decision-making, but sampling from rare risk events and long-term trajectories can be prohibitively costly. Risk gradient can be used in many first-order techniques for learning and control methods, but gradient estimate is difficult to obtain using Monte Carlo (MC) methods because the infinitesimal divisor may significantly amplify sampling noise. Motivated by this gap, we propose an efficient method to evaluate long-term risk probabilities and their gradients using short-term samples without sufficient risk events. We first derive that four types of long-term risk probability are solutions of certain partial differential equations (PDEs). Then, we propose a physics-informed learning technique that integrates data and physics information (aforementioned PDEs). The physics information helps propagate information beyond available data and obtain provable generalization beyond available data, which in turn enables long-term risk to be estimated using short-term samples of safe events. Finally, we demonstrate in simulation that the proposed technique has improved sample efficiency, generalizes well to unseen regions, and adapts to changing system parameters.
Related papers
- Myopically Verifiable Probabilistic Certificates for Safe Control and Learning [7.6918726072590555]
In environments, set invariance-based methods that restrict the probability of risk events in infinitesimal time intervals may exhibit significant long-term risks.
On the other hand, reachability-based approaches that account for the long-term future may require prohibitive in real-time decision making.
arXiv Detail & Related papers (2024-04-23T20:29:01Z) - Data-Adaptive Tradeoffs among Multiple Risks in Distribution-Free Prediction [55.77015419028725]
We develop methods that permit valid control of risk when threshold and tradeoff parameters are chosen adaptively.
Our methodology supports monotone and nearly-monotone risks, but otherwise makes no distributional assumptions.
arXiv Detail & Related papers (2024-03-28T17:28:06Z) - Physics-informed RL for Maximal Safety Probability Estimation [0.8287206589886881]
We study how to estimate the long-term safety probability of maximally safe actions without sufficient coverage of samples from risky states and long-term trajectories.
The proposed method can also estimate long-term risk using short-term samples and deduce the risk of unsampled states.
arXiv Detail & Related papers (2024-03-25T03:13:56Z) - A Generalizable Physics-informed Learning Framework for Risk Probability Estimation [1.5960546024967326]
We develop an efficient method to evaluate the probabilities of long-term risk and their gradients.
The proposed method exploits the fact that long-term risk probability satisfies certain partial differential equations.
Numerical results show the proposed method has better sample efficiency, generalizes well to unseen regions, and can adapt to systems with changing parameters.
arXiv Detail & Related papers (2023-05-10T19:44:42Z) - Information-Theoretic Safe Exploration with Gaussian Processes [89.31922008981735]
We consider a sequential decision making task where we are not allowed to evaluate parameters that violate an unknown (safety) constraint.
Most current methods rely on a discretization of the domain and cannot be directly extended to the continuous case.
We propose an information-theoretic safe exploration criterion that directly exploits the GP posterior to identify the most informative safe parameters to evaluate.
arXiv Detail & Related papers (2022-12-09T15:23:58Z) - Log Barriers for Safe Black-box Optimization with Application to Safe
Reinforcement Learning [72.97229770329214]
We introduce a general approach for seeking high dimensional non-linear optimization problems in which maintaining safety during learning is crucial.
Our approach called LBSGD is based on applying a logarithmic barrier approximation with a carefully chosen step size.
We demonstrate the effectiveness of our approach on minimizing violation in policy tasks in safe reinforcement learning.
arXiv Detail & Related papers (2022-07-21T11:14:47Z) - Distributed Dynamic Safe Screening Algorithms for Sparse Regularization [73.85961005970222]
We propose a new distributed dynamic safe screening (DDSS) method for sparsity regularized models and apply it on shared-memory and distributed-memory architecture respectively.
We prove that the proposed method achieves the linear convergence rate with lower overall complexity and can eliminate almost all the inactive features in a finite number of iterations almost surely.
arXiv Detail & Related papers (2022-04-23T02:45:55Z) - Quantifying Uncertainty in Deep Spatiotemporal Forecasting [67.77102283276409]
We describe two types of forecasting problems: regular grid-based and graph-based.
We analyze UQ methods from both the Bayesian and the frequentist point view, casting in a unified framework via statistical decision theory.
Through extensive experiments on real-world road network traffic, epidemics, and air quality forecasting tasks, we reveal the statistical computational trade-offs for different UQ methods.
arXiv Detail & Related papers (2021-05-25T14:35:46Z) - Towards Safe Policy Improvement for Non-Stationary MDPs [48.9966576179679]
Many real-world problems of interest exhibit non-stationarity, and when stakes are high, the cost associated with a false stationarity assumption may be unacceptable.
We take the first steps towards ensuring safety, with high confidence, for smoothly-varying non-stationary decision problems.
Our proposed method extends a type of safe algorithm, called a Seldonian algorithm, through a synthesis of model-free reinforcement learning with time-series analysis.
arXiv Detail & Related papers (2020-10-23T20:13:51Z) - DeepHazard: neural network for time-varying risks [0.6091702876917281]
We propose a new flexible method for survival prediction: DeepHazard, a neural network for time-varying risks.
Our approach is tailored for a wide range of continuous hazards forms, with the only restriction of being additive in time.
Numerical examples illustrate that our approach outperforms existing state-of-the-art methodology in terms of predictive capability evaluated through the C-index metric.
arXiv Detail & Related papers (2020-07-26T21:01:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.