Deceptive Reinforcement Learning for Privacy-Preserving Planning
- URL: http://arxiv.org/abs/2102.03022v1
- Date: Fri, 5 Feb 2021 06:50:04 GMT
- Title: Deceptive Reinforcement Learning for Privacy-Preserving Planning
- Authors: Zhengshang Liu, Yue Yang, Tim Miller, and Peta Masters
- Abstract summary: Reinforcement learning is the problem of finding a behaviour policy based on rewards received from exploratory behaviour.
A key ingredient in reinforcement learning is a reward function, which determines how much reward (negative or positive) is given and when.
We present two models for solving the problem of privacy-preserving reinforcement learning.
- Score: 8.950168559003991
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we study the problem of deceptive reinforcement learning to
preserve the privacy of a reward function. Reinforcement learning is the
problem of finding a behaviour policy based on rewards received from
exploratory behaviour. A key ingredient in reinforcement learning is a reward
function, which determines how much reward (negative or positive) is given and
when. However, in some situations, we may want to keep a reward function
private; that is, to make it difficult for an observer to determine the reward
function used. We define the problem of privacy-preserving reinforcement
learning, and present two models for solving it. These models are based on
dissimulation -- a form of deception that 'hides the truth'. We evaluate our
models both computationally and via human behavioural experiments. Results show
that the resulting policies are indeed deceptive, and that participants can
determine the true reward function of a dissimulating agent less reliably than that of an honest agent.
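
The abstract describes the approach only at a high level, so the snippet below is a minimal illustrative sketch of dissimulation rather than the authors' two models: it assumes a small tabular setting with a handful of candidate reward functions (one of them the true one), a pre-computed Q-table for each candidate, and an observer that models the agent as Boltzmann-rational. The agent picks actions that stay near-optimal for the true reward while keeping the observer's posterior over the candidates as flat (high-entropy) as possible. All names and parameters here (`dissimulative_action`, `beta`, `tolerance`) are assumptions introduced for this sketch.

```python
import math

def boltzmann(q_values, beta=2.0):
    """Observer's model of action choice: softmax over Q-values."""
    m = max(q_values)
    exps = [math.exp(beta * (q - m)) for q in q_values]
    z = sum(exps)
    return [e / z for e in exps]

def posterior_update(prior, q_tables, state, action, beta=2.0):
    """Bayesian update of the observer's belief over candidate reward functions."""
    likelihoods = [boltzmann(q[state], beta)[action] for q in q_tables]
    unnorm = [p * lk for p, lk in zip(prior, likelihoods)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

def entropy(dist):
    """Shannon entropy of a discrete distribution."""
    return -sum(p * math.log(p) for p in dist if p > 0)

def dissimulative_action(state, q_tables, true_idx, prior, beta=2.0, tolerance=0.5):
    """Choose an action that is near-optimal under the true reward while keeping
    the observer's posterior over candidate rewards as flat as possible."""
    true_q = q_tables[true_idx][state]
    best = max(true_q)
    # Actions whose Q-value under the true reward is within `tolerance` of the optimum.
    candidates = [a for a, q in enumerate(true_q) if q >= best - tolerance]
    # Among those, pick the one that leaves the observer most uncertain.
    return max(candidates,
               key=lambda a: entropy(posterior_update(prior, q_tables, state, a, beta)))

# Tiny illustration: one state, three actions, two candidate reward functions.
q_tables = [
    {0: [1.0, 0.9, 0.2]},   # Q-values under candidate reward 0 (the true one)
    {0: [0.3, 0.9, 1.0]},   # Q-values under candidate reward 1 (a decoy)
]
action = dissimulative_action(state=0, q_tables=q_tables, true_idx=0, prior=[0.5, 0.5])
print(action)  # -> 1: near-optimal for the true reward, but ambiguous to the observer
```

In this sketch, `tolerance` controls the assumed privacy/performance trade-off: at 0 the agent behaves like an honest optimal agent, while larger values let it give up some value under the true reward to keep the observer uncertain.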
Related papers
- Multi Task Inverse Reinforcement Learning for Common Sense Reward [21.145179791929337]
We show that inverse reinforcement learning, even when it succeeds in training an agent, does not learn a useful reward function.
That is, training a new agent with the learned reward does not reproduce the desired behaviors.
We then show that multi-task inverse reinforcement learning can be applied to learn a useful reward function.
arXiv Detail & Related papers (2024-02-17T19:49:00Z)
- On The Fragility of Learned Reward Functions [4.826574398803286]
We study the causes of relearning failures in the domain of preference-based reward learning.
Based on our findings, we emphasize the need for more retraining-based evaluations in the literature.
arXiv Detail & Related papers (2023-01-09T19:45:38Z)
- Basis for Intentions: Efficient Inverse Reinforcement Learning using Past Experience [89.30876995059168]
This paper addresses the problem of inverse reinforcement learning (IRL) -- inferring the reward function of an agent from observing its behavior.
arXiv Detail & Related papers (2022-08-09T17:29:49Z)
- Causal Confusion and Reward Misidentification in Preference-Based Reward Learning [33.944367978407904]
We study causal confusion and reward misidentification when learning from preferences.
We find that the presence of non-causal distractor features, noise in the stated preferences, and partial state observability can all exacerbate reward misidentification.
arXiv Detail & Related papers (2022-04-13T18:41:41Z)
- Mutual Information State Intrinsic Control [91.38627985733068]
Intrinsically motivated RL attempts to remove the reliance on well-shaped extrinsic rewards by defining an intrinsic reward function.
Motivated by the self-consciousness concept in psychology, we make a natural assumption that the agent knows what constitutes itself.
We mathematically formalize this reward as the mutual information between the agent state and the surrounding state (a notation sketch is given after this list).
arXiv Detail & Related papers (2021-03-15T03:03:36Z)
- Semi-supervised reward learning for offline reinforcement learning [71.6909757718301]
Training agents usually requires reward functions, but rewards are seldom available in practice and their engineering is challenging and laborious.
We propose semi-supervised learning algorithms that learn from limited annotations and incorporate unlabelled data.
In our experiments with a simulated robotic arm, we greatly improve upon behavioural cloning and closely approach the performance achieved with ground truth rewards.
arXiv Detail & Related papers (2020-12-12T20:06:15Z)
- Understanding Learned Reward Functions [6.714172005695389]
We investigate techniques for interpreting learned reward functions.
In particular, we apply saliency methods to identify failure modes and predict the robustness of reward functions.
We find that learned reward functions often implement surprising algorithms that rely on contingent aspects of the environment.
arXiv Detail & Related papers (2020-12-10T18:19:48Z)
- Counterfactual Credit Assignment in Model-Free Reinforcement Learning [47.79277857377155]
Credit assignment in reinforcement learning is the problem of measuring an action's influence on future rewards.
We adapt the notion of counterfactuals from causality theory to a model-free RL setup.
We formulate a family of policy gradient algorithms that use future-conditional value functions as baselines or critics, and show that they are provably low variance.
arXiv Detail & Related papers (2020-11-18T18:41:44Z)
- Maximizing Information Gain in Partially Observable Environments via Prediction Reward [64.24528565312463]
This paper tackles the challenge of using belief-based rewards for a deep RL agent.
We derive the exact error between negative entropy and the expected prediction reward.
This insight provides theoretical motivation for several fields using prediction rewards.
arXiv Detail & Related papers (2020-05-11T08:13:49Z)
- Intrinsic Motivation for Encouraging Synergistic Behavior [55.10275467562764]
We study the role of intrinsic motivation as an exploration bias for reinforcement learning in sparse-reward synergistic tasks.
Our key idea is that a good guiding principle for intrinsic motivation in synergistic tasks is to take actions which affect the world in ways that would not be achieved if the agents were acting on their own.
arXiv Detail & Related papers (2020-02-12T19:34:51Z)
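
The mutual-information reward mentioned in the Mutual Information State Intrinsic Control entry above can be written in standard notation; the symbols used below (S^agent, S^surr) are an assumed notation for the quantities the summary describes, not necessarily the paper's own.

```latex
% One possible rendering of the intrinsic reward described above: the mutual
% information between the agent's own state and the state of its surroundings.
r^{\mathrm{int}} = I\!\left(S^{\mathrm{agent}};\, S^{\mathrm{surr}}\right)
  = \mathbb{E}_{p(s^{\mathrm{agent}},\, s^{\mathrm{surr}})}
    \left[\log \frac{p(s^{\mathrm{agent}},\, s^{\mathrm{surr}})}
                    {p(s^{\mathrm{agent}})\, p(s^{\mathrm{surr}})}\right]
```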
This list is automatically generated from the titles and abstracts of the papers on this site.