Deceptive Kernel Function on Observations of Discrete POMDP
- URL: http://arxiv.org/abs/2008.05585v1
- Date: Wed, 12 Aug 2020 21:59:42 GMT
- Title: Deceptive Kernel Function on Observations of Discrete POMDP
- Authors: Zhili Zhang and Quanyan Zhu
- Abstract summary: We introduce a deceptive kernel function (the kernel) applied to an agent's observations in a discrete POMDP.
We analyze how the agent's belief is misled by the falsified observations that the kernel outputs and anticipate the likely threat to the agent's reward and potentially other aspects of its performance.
- Score: 34.32166929236478
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract:   This paper studies deception applied to an agent in a partially
observable Markov decision process (POMDP). We introduce a deceptive kernel
function (the kernel) applied to the agent's observations in a discrete POMDP.
Considering three characteristic algorithms the agent may use (value iteration,
value function approximation, and POMCP), we analyze how its belief is misled
by the falsified observations that the kernel outputs and anticipate the likely
threat to the agent's reward and potentially other aspects of its performance.
We validate our expectations and explore further detrimental effects of the
deception through experiments on two POMDP problems. The results show that the
kernel applied to the agent's observations can affect its belief and
substantially lower its resulting rewards; meanwhile, certain implementations
of the kernel can induce other abnormal behaviors in the agent.
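
The mechanism described in the abstract can be illustrated with a minimal sketch. The exact form of the kernel is not given in this summary, so the code below assumes it is a row-stochastic matrix `K` where `K[o][o2]` is the probability that a true observation `o` is replaced by `o2`. The two-state model, the matrices `T`, `O`, `K_flip`, and the function names are all hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import random


def belief_update(belief, action, obs, T, O):
    """Discrete Bayes filter: b'(s2) is proportional to
    O[action][s2][obs] * sum_s T[action][s][s2] * b(s)."""
    n = len(belief)
    unnormalized = [
        O[action][s2][obs]
        * sum(T[action][s][s2] * belief[s] for s in range(n))
        for s2 in range(n)
    ]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [u / total for u in unnormalized]


def apply_kernel(true_obs, K, rng):
    """Sample a falsified observation from row K[true_obs].
    Assumption: the kernel is a row-stochastic matrix over observations."""
    row = K[true_obs]
    return rng.choices(range(len(row)), weights=row)[0]


# Hypothetical two-state problem with a single "listen"-style action (index 0).
T = [[[1.0, 0.0], [0.0, 1.0]]]      # T[a][s][s2]: the state never changes
O = [[[0.85, 0.15], [0.15, 0.85]]]  # O[a][s2][o]: observation is 85% accurate
K_flip = [[0.0, 1.0], [1.0, 0.0]]   # illustrative kernel: always swap the observation

rng = random.Random(0)
true_observations = [0, 0, 0]       # observations consistent with state 0

honest = [0.5, 0.5]                 # belief when observations are untouched
deceived = [0.5, 0.5]               # belief when observations pass through the kernel
for o in true_observations:
    honest = belief_update(honest, 0, o, T, O)
    falsified = apply_kernel(o, K_flip, rng)
    deceived = belief_update(deceived, 0, falsified, T, O)

print("belief without deception:", honest)
print("belief with deception:   ", deceived)
```

With this always-swap kernel the two beliefs end up as mirror images: the honest agent is nearly certain of state 0, while the deceived agent is nearly certain of state 1. A less aggressive kernel (for example, one that swaps only some of the time) would shift the belief more gradually.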
 
      
Related papers
- Observer-Aware Probabilistic Planning Under Partial Observability [3.8506666685467343]
 Building on observer-aware Markov decision processes (OAMDPs), we propose a framework to handle partial observability problems.
This extension of OAMDPs to partial observability can not only handle more realistic problems, but also permits considering dynamic hidden variables of interest.
 arXiv  Detail & Related papers  (2025-02-14T21:41:04Z)
- On Multi-Agent Inverse Reinforcement Learning [8.284137254112848]
 We extend the Inverse Reinforcement Learning (IRL) framework to the multi-agent setting, assuming that the observed agents follow Nash Equilibrium (NE) policies.
We provide an explicit characterization of the feasible reward set and analyze how errors in estimating the transition dynamics and expert behavior impact the recovered rewards.
 arXiv  Detail & Related papers  (2024-11-22T16:31:36Z)
- An Overview of Causal Inference using Kernel Embeddings [14.298666697532838]
 Kernel embeddings have emerged as a powerful tool for representing probability measures in a variety of statistical inference problems.
Main challenges include identifying causal associations and estimating the average treatment effect from observational data.
 arXiv  Detail & Related papers  (2024-10-30T07:23:34Z)
- How to Exhibit More Predictable Behaviors [3.5248694676821484]
 This paper looks at predictability problems wherein an agent must choose its strategy in order to optimize predictions that an external observer could make.
We take into account uncertainties on the environment dynamics and on the observed agent's policy.
We propose action and state predictability performance criteria through reward functions built on the observer's belief about the agent policy.
 arXiv  Detail & Related papers  (2024-04-17T12:06:17Z)
- Proximal Reinforcement Learning: Efficient Off-Policy Evaluation in Partially Observed Markov Decision Processes [65.91730154730905]
 In applications of offline reinforcement learning to observational data, such as in healthcare or education, a general concern is that observed actions might be affected by unobserved factors.
Here we tackle this by considering off-policy evaluation in a partially observed Markov decision process (POMDP).
We extend the framework of proximal causal inference to our POMDP setting, providing a variety of settings where identification is made possible.
 arXiv  Detail & Related papers  (2021-10-28T17:46:14Z)
- Coalitional Bayesian Autoencoders -- Towards explainable unsupervised deep learning [78.60415450507706]
 We show that explanations of BAE's predictions suffer from high correlation resulting in misleading explanations.
To alleviate this, a "Coalitional BAE" is proposed, which is inspired by agent-based system theory.
Our experiments on publicly available condition monitoring datasets demonstrate the improved quality of explanations using the Coalitional BAE.
 arXiv  Detail & Related papers  (2021-10-19T15:07:09Z)
- Nested Counterfactual Identification from Arbitrary Surrogate Experiments [95.48089725859298]
 We study the identification of nested counterfactuals from an arbitrary combination of observations and experiments.
Specifically, we prove the counterfactual unnesting theorem (CUT), which allows one to map arbitrary nested counterfactuals to unnested ones.
 arXiv  Detail & Related papers  (2021-07-07T12:51:04Z)
- Counterfactual Maximum Likelihood Estimation for Training Deep Networks [83.44219640437657]
 Deep learning models are prone to learning spurious correlations that should not be learned as predictive clues.
We propose a causality-based training framework to reduce the spurious correlations caused by observable confounders.
We conduct experiments on two real-world tasks: Natural Language Inference (NLI) and Image Captioning.
 arXiv  Detail & Related papers  (2021-06-07T17:47:16Z)
- Maximizing Information Gain in Partially Observable Environments via Prediction Reward [64.24528565312463]
 This paper tackles the challenge of using belief-based rewards for a deep RL agent.
We derive the exact error between negative entropy and the expected prediction reward.
This insight provides theoretical motivation for several fields using prediction rewards.
 arXiv  Detail & Related papers  (2020-05-11T08:13:49Z)
- Estimating Treatment Effects with Observed Confounders and Mediators [25.338901482522648]
 Given a causal graph, the do-calculus can express treatment effects as functionals of the observational joint distribution that can be estimated empirically.
Sometimes the do-calculus identifies multiple valid formulae, prompting us to compare the statistical properties of the corresponding estimators.
In this paper, we investigate the over-identified scenario where both confounders and mediators are observed, rendering both estimators valid.
 arXiv  Detail & Related papers  (2020-03-26T15:50:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
       
     