Deceptive Kernel Function on Observations of Discrete POMDP
- URL: http://arxiv.org/abs/2008.05585v1
- Date: Wed, 12 Aug 2020 21:59:42 GMT
- Title: Deceptive Kernel Function on Observations of Discrete POMDP
- Authors: Zhili Zhang and Quanyan Zhu
- Abstract summary: We introduce a deceptive kernel function (the kernel) applied to an agent's observations in a discrete POMDP.
We analyze how the agent's belief is misled by the falsified observations the kernel outputs, and we anticipate the likely threat to the agent's reward and potentially other aspects of its performance.
- Score: 34.32166929236478
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper studies deception applied to an agent in a partially
observable Markov decision process. We introduce a deceptive kernel function
(the kernel) applied to the agent's observations in a discrete POMDP. Based on
three characteristic algorithms used by the agent (value iteration, value
function approximation, and POMCP), we analyze how the agent's belief is misled
by the falsified observations the kernel outputs, and we anticipate the likely
threat to the agent's reward and potentially other aspects of its performance.
We validate these expectations and explore further detrimental effects of the
deception through experiments on two POMDP problems. The results show that a
kernel applied to the agent's observations can affect its belief and
substantially lower its resulting rewards; meanwhile, certain implementations
of the kernel can induce other abnormal behaviors in the agent.
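To make the mechanism concrete, here is a minimal Python sketch (not the authors' code) of the standard discrete POMDP belief update fed through a row-stochastic deceptive kernel over observations; the `deceive` helper, the array layout, and the toy matrices are illustrative assumptions.

```python
import numpy as np

def belief_update(belief, action, obs, T, O):
    """Exact discrete belief update: b'(s') ∝ O[a][s', o] * sum_s T[a][s, s'] b(s)."""
    predicted = belief @ T[action]              # predicted next-state distribution
    unnormalized = O[action][:, obs] * predicted
    return unnormalized / unnormalized.sum()

def deceive(true_obs, kernel, rng):
    """Sample a falsified observation o' ~ kernel[o, o'].
    `kernel` is a row-stochastic |O| x |O| matrix (illustrative interface)."""
    return rng.choice(kernel.shape[1], p=kernel[true_obs])

# Toy 2-state, 2-observation example with one action and a static state.
rng = np.random.default_rng(0)
T = np.array([np.eye(2)])                       # state never changes
O = np.array([[[0.85, 0.15], [0.15, 0.85]]])    # fairly informative observations
swap = np.array([[0.0, 1.0], [1.0, 0.0]])       # deceptive kernel: relabel o

belief = np.array([0.5, 0.5])
true_obs = 0  # suppose the true state is s = 0, so o = 0 is emitted w.p. 0.85
belief = belief_update(belief, 0, deceive(true_obs, swap, rng), T, O)
print(belief)  # [0.15, 0.85]: belief mass shifts toward the wrong state
```

Feeding the swapped observation into the update drives the belief toward the wrong state, which is the belief-misleading effect the abstract analyzes for value iteration, value function approximation, and POMCP agents.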
Related papers
- An Overview of Causal Inference using Kernel Embeddings [14.298666697532838]
Kernel embeddings have emerged as a powerful tool for representing probability measures in a variety of statistical inference problems.
Main challenges include identifying causal associations and estimating the average treatment effect from observational data.
arXiv Detail & Related papers (2024-10-30T07:23:34Z)
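As background for the entry above (standard constructions the survey builds on, not code from the paper itself): the empirical kernel mean embedding of a sample is mu_hat = (1/n) sum_i k(x_i, .), and the squared maximum mean discrepancy (MMD) compares two such embeddings, e.g. treated vs. control samples.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian RBF kernel matrix: K[i, j] = exp(-gamma * ||x_i - y_j||^2)."""
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

def mmd2(X, Y, gamma=1.0):
    """Squared MMD between the empirical kernel mean embeddings of X and Y:
    ||mu_X - mu_Y||^2 = mean K(X, X) - 2 mean K(X, Y) + mean K(Y, Y)."""
    return (rbf_kernel(X, X, gamma).mean()
            - 2.0 * rbf_kernel(X, Y, gamma).mean()
            + rbf_kernel(Y, Y, gamma).mean())
```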
- How to Exhibit More Predictable Behaviors [3.5248694676821484]
This paper studies predictability problems in which an agent must choose its strategy so as to optimize the predictions an external observer can make.
We account for uncertainty in the environment dynamics and in the observed agent's policy.
We propose action- and state-predictability performance criteria through reward functions built on the observer's belief about the agent's policy.
arXiv Detail & Related papers (2024-04-17T12:06:17Z)
- PAC: Assisted Value Factorisation with Counterfactual Predictions in Multi-Agent Reinforcement Learning [43.862956745961654]
Multi-agent reinforcement learning (MARL) has witnessed significant progress with the development of value function factorization methods.
In this paper, we show that in partially observable MARL problems, an agent's ordering over its own actions could impose concurrent constraints.
We propose PAC, a new framework leveraging information generated from Counterfactual Predictions of optimal joint action selection.
arXiv Detail & Related papers (2022-06-22T23:34:30Z)
- Proximal Reinforcement Learning: Efficient Off-Policy Evaluation in Partially Observed Markov Decision Processes [65.91730154730905]
In applications of offline reinforcement learning to observational data, such as in healthcare or education, a general concern is that observed actions might be affected by unobserved factors.
Here we tackle this by considering off-policy evaluation in a partially observed Markov decision process (POMDP).
We extend the framework of proximal causal inference to our POMDP setting, providing a variety of settings where identification is made possible.
arXiv Detail & Related papers (2021-10-28T17:46:14Z)
- Coalitional Bayesian Autoencoders -- Towards explainable unsupervised deep learning [78.60415450507706]
We show that explanations of a Bayesian autoencoder's (BAE's) predictions suffer from high correlation, resulting in misleading explanations.
To alleviate this, a "Coalitional BAE" is proposed, which is inspired by agent-based system theory.
Our experiments on publicly available condition monitoring datasets demonstrate the improved quality of explanations using the Coalitional BAE.
arXiv Detail & Related papers (2021-10-19T15:07:09Z)
- Nested Counterfactual Identification from Arbitrary Surrogate Experiments [95.48089725859298]
We study the identification of nested counterfactuals from an arbitrary combination of observations and experiments.
Specifically, we prove the counterfactual unnesting theorem (CUT), which allows one to map arbitrary nested counterfactuals to unnested ones.
arXiv Detail & Related papers (2021-07-07T12:51:04Z)
- Counterfactual Maximum Likelihood Estimation for Training Deep Networks [83.44219640437657]
Deep learning models are prone to learning spurious correlations that should not be learned as predictive clues.
We propose a causality-based training framework to reduce the spurious correlations caused by observable confounders.
We conduct experiments on two real-world tasks: Natural Language Inference (NLI) and Image Captioning.
arXiv Detail & Related papers (2021-06-07T17:47:16Z)
- Maximizing Information Gain in Partially Observable Environments via Prediction Reward [64.24528565312463]
This paper tackles the challenge of using belief-based rewards for a deep RL agent.
We derive the exact error between negative entropy and the expected prediction reward.
This insight provides theoretical motivation for several fields using prediction rewards.
arXiv Detail & Related papers (2020-05-11T08:13:49Z)
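For the entry above, a minimal sketch of the negative-entropy form of a belief-based reward (a standard choice this line of work relates to prediction rewards; assumed here for illustration, not the paper's exact construction):

```python
import numpy as np

def negative_entropy_reward(belief, eps=1e-12):
    """Belief-based reward: negative Shannon entropy of the belief state.
    Peaked (informative) beliefs score higher than diffuse ones."""
    b = np.clip(belief, eps, 1.0)
    return float(np.sum(b * np.log(b)))

# negative_entropy_reward([0.5, 0.5])   ~ -0.693  (uninformative belief)
# negative_entropy_reward([0.99, 0.01]) ~ -0.056  (confident belief)
```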
- Estimating Treatment Effects with Observed Confounders and Mediators [25.338901482522648]
Given a causal graph, the do-calculus can express treatment effects as functionals of the observational joint distribution that can be estimated empirically.
Sometimes the do-calculus identifies multiple valid formulae, prompting us to compare the statistical properties of the corresponding estimators.
In this paper, we investigate the over-identified scenario where both confounders and mediators are observed, rendering both estimators valid.
arXiv Detail & Related papers (2020-03-26T15:50:25Z)
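For context on the entry above, the two standard do-calculus functionals that arise when a confounder z or a mediator m is observed (textbook identities, not results specific to this paper) are the backdoor and frontdoor adjustments:

```latex
% Backdoor adjustment: adjust for an observed confounder z.
P(y \mid \mathrm{do}(x)) = \sum_{z} P(y \mid x, z)\, P(z)

% Frontdoor adjustment: route through an observed mediator m.
P(y \mid \mathrm{do}(x)) = \sum_{m} P(m \mid x) \sum_{x'} P(y \mid m, x')\, P(x')
```

The over-identified scenario the summary describes is the case where both formulae are valid, so the statistical properties of the corresponding estimators can be compared.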