Privacy-Preserving Reinforcement Learning Beyond Expectation
- URL: http://arxiv.org/abs/2203.10165v1
- Date: Fri, 18 Mar 2022 21:28:29 GMT
- Title: Privacy-Preserving Reinforcement Learning Beyond Expectation
- Authors: Arezoo Rajabi, Bhaskar Ramasubramanian, Abdullah Al Maruf, Radha
Poovendran
- Abstract summary: Cyber and cyber-physical systems equipped with machine learning algorithms, such as autonomous cars, share environments with humans.
It is important to align system (or agent) behaviors with the preferences of one or more human users.
We consider the case when an agent has to learn behaviors in an unknown environment.
- Score: 6.495883501989546
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Cyber and cyber-physical systems equipped with machine learning
algorithms, such as autonomous cars, share environments with humans. In such a setting, it
is important to align system (or agent) behaviors with the preferences of one
or more human users. We consider the case when an agent has to learn behaviors
in an unknown environment. Our goal is to capture two defining characteristics
of humans: i) a tendency to assess and quantify risk, and ii) a desire to keep
decision making hidden from external parties. We incorporate cumulative
prospect theory (CPT) into the objective of a reinforcement learning (RL)
problem for the former. For the latter, we use differential privacy. We design
an algorithm to enable an RL agent to learn policies to maximize a CPT-based
objective in a privacy-preserving manner and establish guarantees on the
privacy of value functions learned by the algorithm when rewards are
sufficiently close. This is accomplished by adding calibrated noise using a
Gaussian process mechanism at each step. Through empirical evaluations, we
highlight a privacy-utility tradeoff and demonstrate that the RL agent is able
to learn behaviors aligned with those of a human user in the same environment
in a privacy-preserving manner.
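To make the mechanism concrete, below is a minimal Python sketch, not the authors' implementation: it estimates a CPT-style value from sampled returns (using the gains-only quantile form for brevity) and releases it through a Gaussian mechanism. The weighting exponent, clipping bound, and per-record sensitivity expression are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def cpt_weight(p, eta=0.74):
    """Inverse-S-shaped probability weighting from cumulative prospect theory."""
    return p**eta / (p**eta + (1.0 - p) ** eta) ** (1.0 / eta)

def cpt_value(returns, eta=0.74):
    """Integrate sorted return samples against distorted tail probabilities,
    rather than taking the plain empirical mean (gains-only simplification)."""
    xs = np.sort(np.asarray(returns, dtype=float))
    n = len(xs)
    w = np.array([cpt_weight((n - i) / n, eta) - cpt_weight((n - i - 1) / n, eta)
                  for i in range(n)])
    return float(xs @ w)

def private_cpt_value(returns, eps=1.0, delta=1e-5, clip=10.0):
    """Release a noisy CPT value via the Gaussian mechanism; the sensitivity
    bound is a crude stand-in for the paper's analysis."""
    xs = np.clip(returns, -clip, clip)
    sensitivity = 2.0 * clip / len(xs)  # assumed per-record influence bound
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    return cpt_value(xs) + rng.normal(0.0, sigma)

returns = rng.normal(5.0, 2.0, size=1000)  # sampled returns for one state-action pair
print(cpt_value(returns), private_cpt_value(returns, eps=0.5))
```

Shrinking `eps` inflates `sigma`, which is the privacy-utility tradeoff the abstract highlights.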
Related papers
- PAPER-HILT: Personalized and Adaptive Privacy-Aware Early-Exit for Reinforcement Learning in Human-in-the-Loop Systems [0.6282068591820944]
Reinforcement Learning (RL) has increasingly become a preferred method over traditional rule-based systems in diverse human-in-the-loop (HITL) applications.
This paper develops an adaptive RL strategy that exploits an early-exit approach designed explicitly for privacy preservation in HITL environments.
arXiv Detail & Related papers (2024-03-09T10:24:12Z)
- Group Decision-Making among Privacy-Aware Agents [2.4401219403555814]
Preserving individual privacy and enabling efficient social learning are both important desiderata but seem fundamentally at odds with each other.
The paper reconciles the two by controlling information leakage using rigorous statistical guarantees based on differential privacy (DP).
Our results flesh out the nature of the trade-offs in both cases between the quality of the group decision outcomes, learning accuracy, communication cost, and the level of privacy protections that the agents are afforded.
arXiv Detail & Related papers (2024-02-13T01:38:01Z)
- Your Room is not Private: Gradient Inversion Attack on Reinforcement Learning [47.96266341738642]
Privacy emerges as a pivotal concern within the realm of embodied AI, as the robot accesses substantial personal information.
This paper proposes attacks on value-based and gradient-based algorithms, using gradient inversion to reconstruct states, actions, and supervision signals (a toy illustration follows this entry).
arXiv Detail & Related papers (2023-06-15T16:53:26Z)
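As a generic illustration of the attack family (not the paper's method), the gradients of a single fully connected layer leak their input analytically: each row of the weight gradient is the input scaled by the corresponding bias gradient. The layer shapes and squared-error loss below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Victim side: gradients of 0.5 * ||W s + b - target||^2 for one secret input s.
def layer_gradients(W, b, s, target):
    err = W @ s + b - target             # gradient w.r.t. the pre-activation
    return np.outer(err, s), err         # (dL/dW, dL/db)

# Attacker side: dL/dW = err * s^T and dL/db = err, so dividing a row of
# dL/dW by the matching entry of dL/db recovers the input exactly.
def invert_gradients(gW, gb):
    i = int(np.argmax(np.abs(gb)))       # any row with a nonzero bias gradient
    return gW[i] / gb[i]

W, b = rng.normal(size=(4, 6)), rng.normal(size=4)
secret_state = rng.normal(size=6)        # e.g., an agent's private observation
gW, gb = layer_gradients(W, b, secret_state, target=np.zeros(4))
print(np.allclose(invert_gradients(gW, gb), secret_state))  # True
```

Per-step noise of the kind used in the main paper above directly degrades such reconstructions.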
- Theoretically Principled Federated Learning for Balancing Privacy and Utility [61.03993520243198]
We propose a general learning framework for protection mechanisms that protect privacy by distorting model parameters.
It can achieve a personalized utility-privacy trade-off for each model parameter, on each client, at each communication round in federated learning (a generic sketch follows this entry).
arXiv Detail & Related papers (2023-05-24T13:44:02Z)
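One generic way to realize "distorting model parameters" with a per-parameter knob is heterogeneous Gaussian noise; the per-parameter sensitivities and budgets below are invented for illustration and do not reproduce the paper's framework.

```python
import numpy as np

rng = np.random.default_rng(2)

def distort_update(update, sens, eps, delta=1e-5):
    """Per-parameter Gaussian distortion of a client's model update: a smaller
    budget eps means more noise, giving a per-parameter, per-client,
    per-round privacy-utility trade-off."""
    sigma = sens * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    return update + rng.normal(0.0, sigma, size=update.shape)

update = rng.normal(size=8)            # one client's clipped local update
sens = np.full(8, 0.1)                 # assumed per-parameter sensitivity
eps = np.linspace(0.5, 4.0, 8)         # tighter protection for early parameters
print(distort_update(update, sens, eps))
```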
- adaPARL: Adaptive Privacy-Aware Reinforcement Learning for Sequential-Decision Making Human-in-the-Loop Systems [0.5414308305392761]
Reinforcement learning (RL) presents numerous benefits compared to rule-based approaches in various applications.
We propose adaPARL, an adaptive approach for privacy-aware RL, especially for human-in-the-loop IoT systems.
AdaPARL provides a personalized privacy-utility trade-off depending on human behavior and preference.
arXiv Detail & Related papers (2023-03-07T21:55:22Z)
- A Multiplicative Value Function for Safe and Efficient Reinforcement Learning [131.96501469927733]
We propose a safe model-free RL algorithm with a novel multiplicative value function consisting of a safety critic and a reward critic.
The safety critic predicts the probability of constraint violation and discounts the reward critic that only estimates constraint-free returns.
We evaluate our method in four safety-focused environments, including classical RL benchmarks augmented with safety constraints and robot navigation tasks with images and raw Lidar scans as observations (a minimal sketch of the composition follows this entry).
arXiv Detail & Related papers (2023-03-07T18:29:15Z)
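The multiplicative composition is easy to state: scale the reward critic's estimate by the safety critic's probability of remaining feasible. A toy sketch with made-up numbers (arrays stand in for learned critics):

```python
import numpy as np

def multiplicative_value(reward_value, violation_prob):
    """Value = (probability of staying safe) * (constraint-free return),
    so risky actions are discounted rather than hard-masked."""
    return (1.0 - violation_prob) * reward_value

q_reward = np.array([10.0, 6.0])   # reward critic: constraint-free return per action
p_violate = np.array([0.8, 0.05])  # safety critic: probability of constraint violation
best = int(np.argmax(multiplicative_value(q_reward, p_violate)))
print(best)  # 1: the safer action wins despite the lower raw return
```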
- Tight Auditing of Differentially Private Machine Learning [77.38590306275877]
For private machine learning, existing auditing mechanisms give tight privacy estimates only under implausible worst-case assumptions.
We design an improved auditing scheme that yields tight privacy estimates for natural (not adversarially crafted) datasets.
arXiv Detail & Related papers (2023-02-15T21:40:33Z)
- Differentially Private Stochastic Gradient Descent with Low-Noise [49.981789906200035]
Modern machine learning algorithms aim to extract fine-grained information from data to provide accurate predictions, which often conflicts with the goal of privacy protection.
This paper studies how to develop privacy-preserving machine learning algorithms that ensure good performance while protecting privacy (a one-step DP-SGD sketch follows this entry).
arXiv Detail & Related papers (2022-09-09T08:54:13Z)
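For reference, one step of standard DP-SGD, the algorithm family this entry concerns, looks as follows; the clipping norm, learning rate, and noise multiplier are illustrative, with a smaller `noise_mult` corresponding to the low-noise regime.

```python
import numpy as np

rng = np.random.default_rng(3)

def dp_sgd_step(params, per_example_grads, lr=0.1, clip=1.0, noise_mult=0.5):
    """One DP-SGD step: clip each example's gradient to L2 norm `clip`,
    average, add Gaussian noise, and take a descent step."""
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads * np.minimum(1.0, clip / (norms + 1e-12))
    noisy_grad = clipped.mean(axis=0) + rng.normal(
        0.0, noise_mult * clip / len(per_example_grads), size=params.shape)
    return params - lr * noisy_grad

params = np.zeros(3)
grads = rng.normal(size=(32, 3))   # per-example gradients for one batch
print(dp_sgd_step(params, grads))
```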
- Reinforcement Learning Beyond Expectation [11.428014000851535]
Cumulative prospect theory (CPT) is a paradigm that has been empirically shown to model a tendency of humans to view gains and losses differently.
In this paper, we consider a setting where an autonomous agent has to learn behaviors in an unknown environment.
In order to endow the agent with the ability to closely mimic the behavior of human users, we optimize a CPT-based cost (the gain-loss utility is sketched after this entry).
arXiv Detail & Related papers (2021-03-29T20:35:25Z)
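The asymmetry between gains and losses is captured by the Tversky-Kahneman utility; the parameter values below are the commonly cited median estimates, used purely for illustration.

```python
import numpy as np

def cpt_utility(x, alpha=0.88, beta=0.88, lam=2.25):
    """Concave over gains, convex and steeper over losses (loss aversion
    lam > 1), so a loss is felt more strongly than an equal gain."""
    x = np.asarray(x, dtype=float)
    gains = np.clip(x, 0.0, None) ** alpha
    losses = -lam * np.clip(-x, 0.0, None) ** beta
    return np.where(x >= 0.0, gains, losses)

print(cpt_utility([100.0, -100.0]))  # approx [57.5, -129.5]: losses loom larger
```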
- Tempered Sigmoid Activations for Deep Learning with Differential Privacy [33.574715000662316]
We show that the choice of activation function is central to bounding the sensitivity of privacy-preserving deep learning.
We achieve new state-of-the-art accuracy on MNIST, FashionMNIST, and CIFAR10 without any modification of the learning procedure fundamentals (the activation family is sketched after this entry).
arXiv Detail & Related papers (2020-07-28T13:19:45Z)
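The activation family is simple to write down, and its defaults reduce to tanh, which makes a convenient sanity check. A minimal sketch:

```python
import numpy as np

def tempered_sigmoid(x, scale=2.0, temp=2.0, offset=1.0):
    """Tempered sigmoid: scale * sigmoid(temp * x) - offset. The output is
    bounded in [-offset, scale - offset], which keeps activations (and hence
    gradient norms before clipping) small for differentially private training."""
    return scale / (1.0 + np.exp(-temp * x)) - offset

x = np.linspace(-3.0, 3.0, 7)
print(np.allclose(tempered_sigmoid(x), np.tanh(x)))  # defaults recover tanh: True
```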
- Cooperative Inverse Reinforcement Learning [64.60722062217417]
We propose a formal definition of the value alignment problem as cooperative inverse reinforcement learning (CIRL).
A CIRL problem is a cooperative, partial-information game with two agents, a human and a robot; both are rewarded according to the human's reward function, but the robot does not initially know what this is.
In contrast to classical IRL, where the human is assumed to act optimally in isolation, optimal CIRL solutions produce behaviors such as active teaching, active learning, and communicative actions.
arXiv Detail & Related papers (2016-06-09T22:39:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences arising from its use.