Reinforcement Learning-Driven Adaptation Chains: A Robust Framework for Multi-Cloud Workflow Security
- URL: http://arxiv.org/abs/2501.06305v1
- Date: Fri, 10 Jan 2025 19:04:55 GMT
- Title: Reinforcement Learning-Driven Adaptation Chains: A Robust Framework for Multi-Cloud Workflow Security
- Authors: Nafiseh Soveizi, Dimka Karastoyanova
- Abstract summary: Cloud computing has emerged as a crucial solution for managing data-intensive and compute-intensive tasks. One of the main gaps in the literature is the lack of robust and flexible measures for reacting to security violations. We propose an innovative approach leveraging Reinforcement Learning (RL) to formulate adaptation chains, responding effectively to security violations.
- Score: 2.186901738997927
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Cloud computing has emerged as a crucial solution for managing data- and compute-intensive workflows, offering scalability to address dynamic demands. However, security concerns persist, especially for workflows involving sensitive data and tasks. One of the main gaps in the literature is the lack of robust and flexible measures for reacting to these security violations. To address this, we propose an innovative approach leveraging Reinforcement Learning (RL) to formulate adaptation chains that respond effectively to security violations within cloud-based workflows. These chains consist of sequences of adaptation actions tailored to attack characteristics, workflow dependencies, and user-defined requirements. Unlike conventional single-task adaptations, adaptation chains provide a comprehensive mitigation strategy by taking into account both control and data dependencies between tasks, thereby accommodating conflicting objectives effectively. Moreover, our RL-based approach uses insights from past responses to mitigate uncertainties associated with adaptation costs. We evaluate the method using our jBPM- and CloudSim Plus-based implementation and compare the impact of the selected adaptation chains on workflows with that of the single-adaptation approach. The results demonstrate that the adaptation chain approach outperforms the single-adaptation approach in terms of total adaptation cost, offering resilience and adaptability against security threats.
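As a rough illustration of the adaptation-chain idea (not the authors' implementation), the sketch below uses tabular Q-learning to pick one adaptation action per dependent task of a violated workflow task; the action set, costs, and reward shaping are assumptions made up for this example.

```python
# Illustrative sketch only: tabular Q-learning that composes an "adaptation chain"
# for a workflow task hit by a security violation. Actions, costs, and rewards are
# assumed for illustration; the paper's actual state/action model may differ.
import random
from collections import defaultdict

ACTIONS = ["re_execute", "migrate_to_private_cloud", "skip_task", "abort_workflow"]
ACTION_COST = {"re_execute": 3.0, "migrate_to_private_cloud": 5.0,
               "skip_task": 1.0, "abort_workflow": 10.0}
CHAIN_LENGTH = 3          # number of dependent tasks the chain must cover
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.2

def step(position, action):
    """Apply one adaptation to the task at `position` in the chain.

    Reward trades off the assumed adaptation cost against a bonus when the
    violation is considered contained for that task."""
    contained = (random.random() < 0.8) if action != "skip_task" else (random.random() < 0.3)
    reward = -ACTION_COST[action] + (4.0 if contained else -4.0)
    done = (position + 1 == CHAIN_LENGTH) or action == "abort_workflow"
    return position + 1, reward, done

Q = defaultdict(float)

for episode in range(5000):
    pos, done = 0, False
    while not done:
        # epsilon-greedy choice over adaptation actions
        if random.random() < EPS:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda x: Q[(pos, x)])
        nxt, r, done = step(pos, a)
        best_next = 0.0 if done else max(Q[(nxt, x)] for x in ACTIONS)
        Q[(pos, a)] += ALPHA * (r + GAMMA * best_next - Q[(pos, a)])
        pos = nxt

# Greedy read-out: the learned adaptation chain, one action per dependent task.
chain = [max(ACTIONS, key=lambda x: Q[(p, x)]) for p in range(CHAIN_LENGTH)]
print("learned adaptation chain:", chain)
```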
Related papers
- Representation-based Reward Modeling for Efficient Safety Alignment of Large Language Model [84.00480999255628]
Reinforcement Learning algorithms for safety alignment of Large Language Models (LLMs) encounter the challenge of distribution shift.
Current approaches typically address this issue through online sampling from the target policy.
We propose a new framework that leverages the model's intrinsic safety judgment capability to extract reward signals.
arXiv Detail & Related papers (2025-03-13T06:40:34Z)
- Solving The Dynamic Volatility Fitting Problem: A Deep Reinforcement Learning Approach [0.0]
We show that variants of Deep Deterministic Policy Gradient (DDPG) and Soft Actor Critic (SAC) can perform at least as well as standard fitting algorithms.
We explain why the reinforcement learning framework is appropriate to handle complex objective functions.
arXiv Detail & Related papers (2024-10-15T17:10:54Z)
- Reinforcement Learning-based Receding Horizon Control using Adaptive Control Barrier Functions for Safety-Critical Systems [14.166970599802324]
Optimal control methods provide solutions to safety-critical problems but easily become intractable.
We propose a Reinforcement Learning-based Receding Horizon Control approach leveraging Model Predictive Control.
We validate our method by applying it to the challenging automated merging control problem for Connected and Automated Vehicles.
arXiv Detail & Related papers (2024-03-26T02:49:08Z)
- Enhancing Security in Federated Learning through Adaptive Consensus-Based Model Update Validation [2.28438857884398]
This paper introduces an advanced approach for fortifying Federated Learning (FL) systems against label-flipping attacks.
We propose a consensus-based verification process integrated with an adaptive thresholding mechanism.
Our results indicate a significant mitigation of label-flipping attacks, bolstering the FL system's resilience.
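A minimal sketch of what consensus-based validation with an adaptive threshold could look like, assuming cosine similarity to a median consensus and a mean-minus-k-standard-deviations cutoff (the paper's actual mechanism may differ):

```python
# Illustrative sketch only: reject client updates that deviate too far from the
# round consensus, with a threshold that adapts to the observed score spread.
import numpy as np

def validate_updates(updates, k=1.5):
    """updates: list of 1-D numpy arrays (flattened client gradients).
    Returns the indices of updates accepted this round."""
    consensus = np.median(np.stack(updates), axis=0)           # robust consensus direction
    sims = np.array([
        float(np.dot(u, consensus) /
              (np.linalg.norm(u) * np.linalg.norm(consensus) + 1e-12))
        for u in updates
    ])
    threshold = sims.mean() - k * sims.std()                    # adaptive, per-round threshold
    return [i for i, s in enumerate(sims) if s >= threshold]

# Toy usage: one flipped-label client pushes in the opposite direction and is rejected.
rng = np.random.default_rng(0)
honest = [rng.normal(1.0, 0.1, size=10) for _ in range(9)]
malicious = [-rng.normal(1.0, 0.1, size=10)]
print("accepted clients:", validate_updates(honest + malicious))
```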
arXiv Detail & Related papers (2024-03-05T20:54:56Z)
- Resilient Constrained Reinforcement Learning [87.4374430686956]
We study a class of constrained reinforcement learning (RL) problems in which multiple constraint specifications are not identified in advance.
It is challenging to identify appropriate constraint specifications due to the undefined trade-off between the reward training objective and the constraint satisfaction.
We propose a new constrained RL approach that searches for policy and constraint specifications together.
arXiv Detail & Related papers (2023-12-28T18:28:23Z)
- Amortizing intractable inference in large language models [56.92471123778389]
We use amortized Bayesian inference to sample from intractable posterior distributions.
We empirically demonstrate that this distribution-matching paradigm of LLM fine-tuning can serve as an effective alternative to maximum-likelihood training.
As an important application, we interpret chain-of-thought reasoning as a latent variable modeling problem.
arXiv Detail & Related papers (2023-10-06T16:36:08Z)
- Enhancing Workflow Security in Multi-Cloud Environments through Monitoring and Adaptation upon Cloud Service and Network Security Violations [2.5835347022640254]
We propose an approach that focuses on monitoring cloud services and networks to detect security violations during workflow executions.
Our approach is evaluated based on the performance of the detection procedure and the impact of the selected adaptations on the workflow.
arXiv Detail & Related papers (2023-10-03T08:33:46Z)
- Online Safety Property Collection and Refinement for Safe Deep Reinforcement Learning in Mapless Navigation [79.89605349842569]
We introduce the Collection and Refinement of Online Properties (CROP) framework to design properties at training time.
CROP employs a cost signal to identify unsafe interactions and uses them to shape safety properties.
We evaluate our approach in several robotic mapless navigation tasks and demonstrate that the violation metric computed with CROP yields higher returns and lower violations than previous Safe DRL approaches.
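One hedged way to picture a cost-signal-to-property loop in the spirit of CROP, assuming unsafe states are turned into simple interval "avoid" regions and violations are counted against them (this is not CROP's actual formulation):

```python
# Illustrative sketch only: turn costly (unsafe) states observed online into
# interval "avoid" properties and count how often a trajectory violates them.
import numpy as np

class OnlineProperties:
    def __init__(self, margin=0.05):
        self.margin = margin
        self.regions = []                      # list of (low, high) boxes to avoid

    def record_unsafe(self, state):
        """Called whenever the cost signal flags an interaction as unsafe."""
        state = np.asarray(state, dtype=float)
        self.regions.append((state - self.margin, state + self.margin))

    def violations(self, trajectory):
        """Count states of a trajectory that fall inside any avoid region."""
        count = 0
        for s in np.asarray(trajectory, dtype=float):
            if any(np.all(lo <= s) and np.all(s <= hi) for lo, hi in self.regions):
                count += 1
        return count

props = OnlineProperties()
props.record_unsafe([0.9, 0.1])                      # cost > 0 observed here during training
print(props.violations([[0.91, 0.12], [0.2, 0.2]]))  # -> 1
```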
arXiv Detail & Related papers (2023-02-13T21:19:36Z)
- Safe Policy Improvement in Constrained Markov Decision Processes [10.518340300810504]
We present a solution to the synthesis problem by solving its two main challenges: reward-shaping from a set of formal requirements and safe policy update.
For the former, we propose an automatic reward-shaping procedure, defining a scalar reward signal compliant with the task specification.
For the latter, we introduce an algorithm ensuring that the policy is improved in a safe fashion with high-confidence guarantees.
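A minimal sketch of reward shaping from a set of requirements, assuming each requirement is a boolean predicate on a transition and the shaped reward is a signed, weighted sum (the paper's procedure from formal specifications is more elaborate):

```python
# Illustrative sketch only: compose a scalar reward from per-requirement
# indicators. Requirement names and weights are assumptions.
def shaped_reward(transition, requirements):
    """requirements: list of (weight, predicate) pairs; each predicate maps a
    transition dict to True (satisfied) or False (violated)."""
    return sum(w if pred(transition) else -w for w, pred in requirements)

requirements = [
    (1.0, lambda t: t["reached_goal"]),          # liveness: make progress
    (5.0, lambda t: not t["entered_unsafe"]),    # safety: heavily penalize violations
]
print(shaped_reward({"reached_goal": False, "entered_unsafe": False}, requirements))  # -1.0 + 5.0 = 4.0
```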
arXiv Detail & Related papers (2022-10-20T13:29:32Z)
- FIRE: A Failure-Adaptive Reinforcement Learning Framework for Edge Computing Migrations [52.85536740465277]
FIRE is a framework that adapts to rare events by training an RL policy in an edge computing digital twin environment.
We propose ImRE, an importance sampling-based Q-learning algorithm, which samples rare events proportionally to their impact on the value function.
We show that FIRE reduces costs compared to vanilla RL and the greedy baseline in the event of failures.
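A hedged sketch of importance-sampling-corrected Q-learning for rare failures, with probabilities, costs, and action names assumed for illustration (not the paper's ImRE):

```python
# Illustrative sketch only: Q-learning in which a rare failure event is over-sampled
# in simulation and each update is re-weighted by p_true / p_sampled to stay unbiased.
import random

P_FAIL_TRUE, P_FAIL_SAMPLED = 0.01, 0.3      # rare in reality, common in the simulator
ALPHA, GAMMA = 0.05, 0.95
Q = {"stay": 0.0, "migrate": 0.0}            # single-state problem: keep or migrate the service

for _ in range(20000):
    a = random.choice(list(Q))
    failed = random.random() < P_FAIL_SAMPLED               # failures drawn at the inflated rate
    weight = (P_FAIL_TRUE / P_FAIL_SAMPLED) if failed else \
             ((1 - P_FAIL_TRUE) / (1 - P_FAIL_SAMPLED))     # importance ratio
    # assumed costs: an unmitigated failure is very expensive, migration has a fixed overhead
    reward = -100.0 if (failed and a == "stay") else (-2.0 if a == "migrate" else 0.0)
    target = reward + GAMMA * max(Q.values())
    Q[a] += ALPHA * weight * (target - Q[a])

# Without the weight, the inflated failure rate would make "stay" look far worse than it
# really is; with it, the ranking reflects the true per-step costs (stay ~ 1, migrate = 2),
# so "stay" keeps the higher Q-value.
print(Q)
```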
arXiv Detail & Related papers (2022-09-28T19:49:39Z)
- Log Barriers for Safe Black-box Optimization with Application to Safe Reinforcement Learning [72.97229770329214]
We introduce a general approach for solving high-dimensional non-linear optimization problems in which maintaining safety during learning is crucial.
Our approach, called LBSGD, is based on applying a logarithmic barrier approximation with a carefully chosen step size.
We demonstrate the effectiveness of our approach on minimizing violations in policy tasks in safe reinforcement learning.
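A minimal sketch of a log-barrier gradient step, assuming a fixed step size rather than the carefully chosen one described in the paper:

```python
# Illustrative sketch only: gradient descent on a log-barrier surrogate
# f(x) - eta * log(-g(x)) for min f(x) subject to g(x) <= 0.
import numpy as np

def barrier_step(x, grad_f, g, grad_g, eta=0.1, lr=0.05):
    """One descent step on the barrier surrogate; assumes g(x) < 0 (feasible)."""
    grad = grad_f(x) + eta * (-grad_g(x) / g(x))   # d/dx [-eta * log(-g(x))] = -eta * g'(x)/g(x)
    return x - lr * grad

# Toy problem: minimize (x - 2)^2 subject to x <= 1  (i.e. g(x) = x - 1 <= 0).
f_grad = lambda x: 2 * (x - 2)
g = lambda x: x - 1.0
g_grad = lambda x: np.ones_like(x)

x = np.array([0.0])            # start strictly inside the feasible region
for _ in range(500):
    x = barrier_step(x, f_grad, g, g_grad)
print(x)                        # stays strictly feasible, settling just inside the boundary x = 1
```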
arXiv Detail & Related papers (2022-07-21T11:14:47Z)
- Lifelong Unsupervised Domain Adaptive Person Re-identification with Coordinated Anti-forgetting and Adaptation [127.6168183074427]
We propose a new task, Lifelong Unsupervised Domain Adaptive (LUDA) person ReID.
This is challenging because it requires the model to continuously adapt to unlabeled data of the target environments.
We design an effective scheme for this task, dubbed CLUDA-ReID, where the anti-forgetting is harmoniously coordinated with the adaptation.
arXiv Detail & Related papers (2021-12-13T13:19:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.