SafePILCO: a software tool for safe and data-efficient policy synthesis
- URL: http://arxiv.org/abs/2008.03273v1
- Date: Fri, 7 Aug 2020 17:17:30 GMT
- Title: SafePILCO: a software tool for safe and data-efficient policy synthesis
- Authors: Kyriakos Polymenakos, Nikitas Rontsis, Alessandro Abate and Stephen Roberts
- Abstract summary: SafePILCO is a software tool for safe and data-efficient policy search with reinforcement learning.
It extends the known PILCO algorithm, originally written in MATLAB, to support safe learning.
- Score: 67.17251247987187
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: SafePILCO is a software tool for safe and data-efficient policy search with
reinforcement learning. It extends the known PILCO algorithm, originally
written in MATLAB, to support safe learning. We provide a Python implementation
and leverage existing libraries that allow the codebase to remain short and
modular, which is appropriate for wider use by the verification, reinforcement
learning, and control communities.
Related papers
- A Comprehensive Guide to Combining R and Python code for Data Science, Machine Learning and Reinforcement Learning [42.350737545269105]
We show how to run Python's scikit-learn, pytorch and OpenAI gym libraries for building Machine Learning, Deep Learning, and Reinforcement Learning projects easily.
arXiv Detail & Related papers (2024-07-19T23:01:48Z) - Python Fuzzing for Trustworthy Machine Learning Frameworks [0.0]
We propose a dynamic analysis pipeline for Python projects using Sydr-Fuzz.
Our pipeline includes fuzzing, corpus minimization, crash triaging, and coverage collection.
To identify the most vulnerable parts of machine learning frameworks, we analyze their potential attack surfaces and develop fuzz targets for PyTorch, and related projects such as h5py.
arXiv Detail & Related papers (2024-03-19T13:41:11Z) - SequeL: A Continual Learning Library in PyTorch and JAX [50.33956216274694]
SequeL is a library for Continual Learning that supports both PyTorch and JAX frameworks.
It provides a unified interface for a wide range of Continual Learning algorithms, including regularization-based approaches, replay-based approaches, and hybrid approaches.
We release SequeL as an open-source library, enabling researchers and developers to easily experiment and extend the library for their own purposes.
arXiv Detail & Related papers (2023-04-21T10:00:22Z) - Machine Learning Based Approach to Recommend MITRE ATT&CK Framework for Software Requirements and Design Specifications [0.0]
To develop secure software, software developers need to think like an attacker through mining software repositories.
In this paper, we use machine learning algorithms to map requirements to the MITRE ATT&CK database.
arXiv Detail & Related papers (2023-02-10T22:15:45Z) - Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
arXiv Detail & Related papers (2022-12-12T06:30:17Z) - Provable Safe Reinforcement Learning with Binary Feedback [62.257383728544006]
We consider the problem of provable safe RL when given access to an offline oracle providing binary feedback on the safety of state, action pairs.
We provide a novel meta algorithm, SABRE, which can be applied to any MDP setting given access to a blackbox PAC RL algorithm for that setting.
arXiv Detail & Related papers (2022-10-26T05:37:51Z) - problexity -- an open-source Python library for binary classification problem complexity assessment [0.0]
The classification problem's complexity assessment is an essential element of many topics in the supervised learning domain.
The tools currently available for the academic community, which would enable the calculation of problem complexity measures, are available only as libraries of the C++ and R languages.
This paper describes the software module that allows for the estimation of 22 complexity measures for the Python language.
arXiv Detail & Related papers (2022-07-14T07:32:15Z) - MRCpy: A Library for Minimax Risk Classifiers [10.380882297891272]
The Python library MRCpy implements minimax risk classifiers (MRCs) based on the robust risk minimization (RRM) approach.
MRCpy follows the standards of popular Python libraries, such as scikit-learn, facilitating readability and easy usage together with a seamless integration with other libraries.
arXiv Detail & Related papers (2021-08-04T10:31:20Z) - Closing the Closed-Loop Distribution Shift in Safe Imitation Learning [80.05727171757454]
We treat safe optimization-based control strategies as experts in an imitation learning problem.
We train a learned policy that can be cheaply evaluated at run-time and that provably satisfies the same safety guarantees as the expert.
arXiv Detail & Related papers (2021-02-18T05:11:41Z) - Chance-Constrained Trajectory Optimization for Safe Exploration and Learning of Nonlinear Systems [81.7983463275447]
Learning-based control algorithms require data collection with abundant supervision for training.
We present a new approach for optimal motion planning with safe exploration that integrates chance-constrained optimal control with dynamics learning and feedback control.
arXiv Detail & Related papers (2020-05-09T05:57:43Z)