A System for Interactive Examination of Learned Security Policies
- URL: http://arxiv.org/abs/2204.01126v1
- Date: Sun, 3 Apr 2022 17:55:32 GMT
- Title: A System for Interactive Examination of Learned Security Policies
- Authors: Kim Hammar and Rolf Stadler
- Abstract summary: We present a system for interactive examination of learned security policies.
It allows a user to traverse episodes of Markov decision processes in a controlled manner.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We present a system for interactive examination of learned security policies.
It allows a user to traverse episodes of Markov decision processes in a
controlled manner and to track the actions triggered by security policies.
Similar to a software debugger, a user can continue or halt an episode at
any time step and inspect parameters and probability distributions of interest.
The system enables insight into the structure of a given policy and into its
behavior in edge cases. We demonstrate the system with a network
intrusion use case. We examine the evolution of an IT infrastructure's state
and the actions prescribed by security policies while an attack occurs. The
policies for the demonstration have been obtained through a reinforcement
learning approach that includes a simulation system where policies are
incrementally learned and an emulation system that produces statistics that
drive the simulation runs.
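The abstract describes a debugger-like traversal of MDP episodes. Below is a minimal sketch of such a stepper, assuming a hypothetical Gym-style environment and a Policy object with a distribution(state) method; the names and interfaces are illustrative, not the authors' actual implementation.
```python
# Minimal sketch of debugger-style episode traversal; the Env/Policy
# interfaces are hypothetical stand-ins, not the paper's actual API.
import numpy as np

class EpisodeDebugger:
    def __init__(self, env, policy, breakpoints=()):
        self.env, self.policy = env, policy
        self.breakpoints = set(breakpoints)  # time steps at which to halt

    def run(self, max_steps=100):
        state = self.env.reset()
        for t in range(max_steps):
            dist = self.policy.distribution(state)  # action probabilities
            if t in self.breakpoints:
                self.inspect(t, state, dist)        # halt, like a breakpoint
                if input("continue? [y/N] ").lower() != "y":
                    return
            action = int(np.argmax(dist))
            state, reward, done, info = self.env.step(action)
            if done:
                return

    def inspect(self, t, state, dist):
        # Inspect parameters and probability distributions of interest.
        print(f"t={t} state={state} action_dist={np.round(dist, 3)}")
```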
Related papers
- A Formal Model of Security Controls' Capabilities and Its Applications to Policy Refinement and Incident Management [0.2621730497733947]
This paper presents the Security Capability Model (SCM), a formal model that abstracts the features that security controls offer for enforcing security policies.
By validating its effectiveness in real-world scenarios, we show that SCM enables the automation of different and complex security tasks.
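As a rough illustration of the kind of capability abstraction the paper describes, here is a hedged sketch; the class and field names are invented for this example and do not reproduce the SCM formalization.
```python
# Hypothetical sketch of a security-capability abstraction; names are
# invented for illustration and do not reproduce SCM itself.
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Capability:
    action: str                 # e.g. "filter", "encrypt", "log"
    conditions: frozenset       # event fields the control can match on

@dataclass
class SecurityControl:
    name: str
    capabilities: set = field(default_factory=set)

    def can_enforce(self, required: Capability) -> bool:
        # A control can refine a policy rule if some capability offers the
        # required action and covers every condition field the rule needs.
        return any(c.action == required.action
                   and required.conditions <= c.conditions
                   for c in self.capabilities)
```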
arXiv Detail & Related papers (2024-05-06T15:06:56Z) - Conformal Policy Learning for Sensorimotor Control Under Distribution Shifts [61.929388479847525]
This paper focuses on the problem of detecting and reacting to changes in the distribution of a sensorimotor controller's observables.
The key idea is the design of switching policies that can take conformal quantiles as input.
We show how to design such policies by using conformal quantiles to switch between base policies with different characteristics.
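A hedged sketch of that idea, assuming a nonconformity score function calibrated on held-out in-distribution data; the function names are illustrative, not the paper's implementation.
```python
# Sketch of a conformal switching policy: fall back to a conservative base
# policy when the nonconformity score exceeds the calibrated quantile.
import numpy as np

def conformal_quantile(calibration_scores, alpha=0.1):
    # Finite-sample-corrected (1 - alpha) quantile from calibration data.
    n = len(calibration_scores)
    q = np.ceil((n + 1) * (1 - alpha)) / n
    return np.quantile(calibration_scores, min(q, 1.0))

def switching_policy(obs, score_fn, q_hat, nominal, fallback):
    # score_fn measures how atypical the observation is in-distribution.
    return fallback(obs) if score_fn(obs) > q_hat else nominal(obs)
```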
arXiv Detail & Related papers (2023-11-02T17:59:30Z) - Residual Q-Learning: Offline and Online Policy Customization without Value [53.47311900133564]
Imitation Learning (IL) is a widely used framework for learning imitative behavior from demonstrations.
We formulate a new problem setting called policy customization.
We propose a novel framework, Residual Q-learning, which can solve the formulated MDP by leveraging the prior policy.
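One way to realize this idea in a tabular setting is sketched below; combining the prior policy's log-probabilities with a learned residual is an assumption of this sketch, not necessarily the paper's exact formulation.
```python
# Hedged sketch of a residual-Q idea: learn a residual on top of a fixed
# prior policy's log-preferences so the combined values reflect both the
# prior behavior and an added customization reward. Tabular, illustrative.
import numpy as np

def residual_q_update(Q_res, log_prior, s, a, r_add, s2, alpha=0.1, gamma=0.99):
    # Combined preference = prior log-probability + learned residual.
    combined_next = log_prior[s2] + Q_res[s2]
    target = r_add + gamma * np.max(combined_next)
    Q_res[s, a] += alpha * (target - (log_prior[s, a] + Q_res[s, a]))

def customized_action(Q_res, log_prior, s):
    return int(np.argmax(log_prior[s] + Q_res[s]))
```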
arXiv Detail & Related papers (2023-06-15T22:01:19Z) - Interactive System-wise Anomaly Detection [66.3766756452743]
Anomaly detection plays a fundamental role in various applications.
Existing methods struggle in scenarios where the instances are systems whose characteristics are not readily observed as data.
We develop an end-to-end approach which includes an encoder-decoder module that learns system embeddings.
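A minimal sketch of how such an embedding could yield an anomaly score, with encode/decode as stand-ins for learned networks; this is an assumption of the sketch, not the paper's architecture.
```python
# Illustrative anomaly score from an encoder-decoder over a system's
# probe responses; encode/decode are stand-ins for learned networks.
import numpy as np

def anomaly_score(probe_responses, encode, decode):
    z = encode(probe_responses)   # system embedding
    recon = decode(z)             # reconstruction from the embedding
    return float(np.mean((probe_responses - recon) ** 2))
```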
arXiv Detail & Related papers (2023-04-21T02:20:24Z) - In-Distribution Barrier Functions: Self-Supervised Policy Filters that Avoid Out-of-Distribution States [84.24300005271185]
We propose a control filter that wraps any reference policy and effectively encourages the system to stay in-distribution with respect to offline-collected safe demonstrations.
Our method is effective for two different visuomotor control tasks in simulation environments, including both top-down and egocentric view settings.
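A hedged sketch of such a filter, assuming hypothetical learned dynamics and demo_density models; it illustrates the wrapping idea, not the paper's implementation.
```python
# Sketch of a policy filter: accept the reference action only if the
# predicted next state stays close to the offline safe demonstrations;
# otherwise substitute the most in-distribution candidate action.
def filtered_action(state, reference_policy, dynamics, demo_density,
                    threshold, candidates):
    a_ref = reference_policy(state)
    if demo_density(dynamics(state, a_ref)) >= threshold:
        return a_ref  # reference action keeps us in-distribution
    return max(candidates, key=lambda a: demo_density(dynamics(state, a)))
```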
arXiv Detail & Related papers (2023-01-27T22:28:19Z) - Recursively Feasible Probabilistic Safe Online Learning with Control Barrier Functions [60.26921219698514]
We introduce a model-uncertainty-aware reformulation of CBF-based safety-critical controllers.
We then present the pointwise feasibility conditions of the resulting safety controller.
We use these conditions to devise an event-triggered online data collection strategy.
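As a rough illustration, here is a discrete-time pointwise check of a CBF condition under a model-error bound, with an event trigger for data collection; the condition, constants, and interfaces are assumptions of this sketch.
```python
# Illustrative pointwise check of a discrete-time CBF condition
# h(f(x,u)) >= (1 - gamma) * h(x) under a bounded model error, with an
# event-triggered rule for collecting new data when the margin shrinks.
def cbf_margin(x, u, h, nominal_step, model_error_bound, lip_h, gamma=0.1):
    # Worst-case barrier value after one step, given an error bound and a
    # Lipschitz constant for h (both assumed known for this sketch).
    worst_case_h_next = h(nominal_step(x, u)) - lip_h * model_error_bound(x, u)
    return worst_case_h_next - (1 - gamma) * h(x)

def should_collect_data(x, u, h, nominal_step, model_error_bound, lip_h,
                        trigger=1e-2):
    # Gather new data when the feasibility margin falls below the trigger.
    return cbf_margin(x, u, h, nominal_step, model_error_bound, lip_h) < trigger
```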
arXiv Detail & Related papers (2022-08-23T05:02:09Z) - Verified Probabilistic Policies for Deep Reinforcement Learning [6.85316573653194]
We tackle the problem of verifying probabilistic policies for deep reinforcement learning.
We propose an abstraction approach, based on interval Markov decision processes, that yields guarantees on a policy's execution.
We present techniques to build and solve these models using abstract interpretation, mixed-integer linear programming, entropy-based refinement and probabilistic model checking.
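A simplified sketch of value iteration on an interval MDP, propagating lower and upper bounds on the probability of reaching a safe set; a faithful implementation would redistribute transition mass subject to the interval constraints, which is omitted here.
```python
# Simplified interval value iteration; bounds[s, a] maps successor states
# to (p_lo, p_hi) probability intervals. Proper mass redistribution under
# the interval constraints is omitted for brevity.
def interval_value_iteration(states, actions, bounds, safe, n_iters=100):
    lo = {s: 1.0 if s in safe else 0.0 for s in states}
    hi = dict(lo)
    for _ in range(n_iters):
        new_lo, new_hi = {}, {}
        for s in states:
            if s in safe:
                new_lo[s] = new_hi[s] = 1.0
                continue
            new_lo[s] = min(sum(p_lo * lo[t] for t, (p_lo, _) in bounds[s, a].items())
                            for a in actions[s])
            new_hi[s] = max(sum(p_hi * hi[t] for t, (_, p_hi) in bounds[s, a].items())
                            for a in actions[s])
        lo, hi = new_lo, new_hi
    return lo, hi
```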
arXiv Detail & Related papers (2022-01-10T23:55:04Z) - Reinforcement Learning for Task Specifications with Action-Constraints [4.046919218061427]
We propose a method to learn optimal control policies for a finite-state Markov Decision Process.
We assume that the sets of action sequences deemed unsafe and/or safe are given in terms of a finite-state automaton.
We present a version of the Q-learning algorithm for learning optimal policies in the presence of non-Markovian action-sequence and state constraints.
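A sketch of this scheme: Q-learning over the product of the MDP state and an automaton state that tracks the action history, masking actions that would drive the automaton into an unsafe state. Q is assumed to be a defaultdict(float); all interfaces are illustrative, not the paper's algorithm verbatim.
```python
# Q-learning on the product state (s, q); unsafe automaton transitions
# are masked. Q is a defaultdict(float); interfaces are illustrative.
import random

def safe_actions(q, actions, dfa_step, unsafe):
    # Actions allowed in automaton state q (assumed nonempty here).
    return [a for a in actions if dfa_step(q, a) not in unsafe]

def q_learning_step(Q, s, q, actions, env_step, dfa_step, unsafe,
                    alpha=0.1, gamma=0.99, eps=0.1):
    allowed = safe_actions(q, actions, dfa_step, unsafe)
    if random.random() < eps:
        a = random.choice(allowed)
    else:
        a = max(allowed, key=lambda b: Q[(s, q), b])
    s2, r = env_step(s, a)
    q2 = dfa_step(q, a)
    best_next = max((Q[(s2, q2), b]
                     for b in safe_actions(q2, actions, dfa_step, unsafe)),
                    default=0.0)
    Q[(s, q), a] += alpha * (r + gamma * best_next - Q[(s, q), a])
    return s2, q2
```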
arXiv Detail & Related papers (2022-01-02T04:22:01Z) - Realistic simulation of users for IT systems in cyber ranges [63.20765930558542]
We instrument each machine by means of an external agent to generate user activity.
This agent combines deterministic and deep-learning-based methods to adapt to different environments.
We also propose conditional text generation models to facilitate the creation of conversations and documents.
arXiv Detail & Related papers (2021-11-23T10:53:29Z) - Intrusion Prevention through Optimal Stopping [0.0]
We study automated intrusion prevention using reinforcement learning.
We show that our approach can produce effective defender policies for a practical IT infrastructure of limited size.
arXiv Detail & Related papers (2021-10-30T17:03:28Z)
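The optimal stopping formulation typically yields a threshold policy on the defender's belief that an intrusion is ongoing. The sketch below illustrates this for a simple two-state hidden Markov model; the observation models and constants are assumptions of the sketch, not the paper's exact model.
```python
# Sketch of a threshold policy of the kind optimal stopping yields for
# intrusion prevention: track a belief that an intrusion is ongoing and
# "stop" (take a defensive action) once it crosses a threshold.
def belief_update(b, obs, p_obs_given_intrusion, p_obs_given_normal,
                  p_start=0.05):
    # Predict: an intrusion may start with probability p_start and,
    # in this simple model, never stops once started.
    b_pred = b + (1 - b) * p_start
    num = p_obs_given_intrusion(obs) * b_pred
    den = num + p_obs_given_normal(obs) * (1 - b_pred)
    return num / den if den > 0 else b_pred

def defender_action(belief, threshold=0.8):
    return "stop" if belief >= threshold else "continue"
```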