Deceptive Kernel Function on Observations of Discrete POMDP
- URL: http://arxiv.org/abs/2008.05585v1
- Date: Wed, 12 Aug 2020 21:59:42 GMT
- Title: Deceptive Kernel Function on Observations of Discrete POMDP
- Authors: Zhili Zhang and Quanyan Zhu
- Abstract summary: We introduce a deceptive kernel function (the kernel) applied to an agent's observations in a discrete POMDP.
We analyze how the agent's belief is misled by the falsified observations the kernel outputs, and we anticipate the likely threat to the agent's reward and potentially other aspects of its performance.
- Score: 34.32166929236478
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper studies deception applied to an agent in a partially
observable Markov decision process. We introduce a deceptive kernel function
(the kernel) applied to the agent's observations in a discrete POMDP. Based on
three characteristic algorithms used by the agent (value iteration, value
function approximation, and POMCP), we analyze how the agent's belief is misled
by the falsified observations the kernel outputs, and we anticipate the likely
threat to the agent's reward and potentially other aspects of its performance.
We validate these expectations and explore further detrimental effects of the
deception through experiments on two POMDP problems. The results show that a
kernel applied to the agent's observations can affect its belief and
substantially lower its resulting rewards; meanwhile, certain implementations
of the kernel can induce other abnormal behaviors in the agent.
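To make the mechanism concrete, here is a minimal Python sketch (not the authors' code) of the standard discrete POMDP belief update fed through a row-stochastic deceptive kernel over observations; the `deceive` helper, the array layout, and the toy matrices are illustrative assumptions.

```python
import numpy as np

def belief_update(belief, action, obs, T, O):
    """Exact discrete belief update: b'(s') ∝ O[a][s', o] * sum_s T[a][s, s'] b(s)."""
    predicted = belief @ T[action]              # predicted next-state distribution
    unnormalized = O[action][:, obs] * predicted
    return unnormalized / unnormalized.sum()

def deceive(true_obs, kernel, rng):
    """Sample a falsified observation o' ~ kernel[o, o'].
    `kernel` is a row-stochastic |O| x |O| matrix (illustrative interface)."""
    return rng.choice(kernel.shape[1], p=kernel[true_obs])

# Toy 2-state, 2-observation example with one action and a static state.
rng = np.random.default_rng(0)
T = np.array([np.eye(2)])                       # state never changes
O = np.array([[[0.85, 0.15], [0.15, 0.85]]])    # fairly informative observations
swap = np.array([[0.0, 1.0], [1.0, 0.0]])       # deceptive kernel: relabel o

belief = np.array([0.5, 0.5])
true_obs = 0  # suppose the true state is s = 0, so o = 0 is emitted w.p. 0.85
belief = belief_update(belief, 0, deceive(true_obs, swap, rng), T, O)
print(belief)  # [0.15, 0.85]: belief mass shifts toward the wrong state
```

Feeding the swapped observation into the update drives the belief toward the wrong state, which is the belief-misleading effect the abstract analyzes for value iteration, value function approximation, and POMCP agents.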
Related papers
- An Overview of Causal Inference using Kernel Embeddings [14.298666697532838]
Kernel embeddings have emerged as a powerful tool for representing probability measures in a variety of statistical inference problems.
Main challenges include identifying causal associations and estimating the average treatment effect from observational data.
arXiv Detail & Related papers (2024-10-30T07:23:34Z)
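As background for the entry above (standard constructions the survey builds on, not code from the paper itself): the empirical kernel mean embedding of a sample is mu_hat = (1/n) sum_i k(x_i, .), and the squared maximum mean discrepancy (MMD) compares two such embeddings, e.g. treated vs. control samples.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian RBF kernel matrix: K[i, j] = exp(-gamma * ||x_i - y_j||^2)."""
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

def mmd2(X, Y, gamma=1.0):
    """Squared MMD between the empirical kernel mean embeddings of X and Y:
    ||mu_X - mu_Y||^2 = mean K(X, X) - 2 mean K(X, Y) + mean K(Y, Y)."""
    return (rbf_kernel(X, X, gamma).mean()
            - 2.0 * rbf_kernel(X, Y, gamma).mean()
            + rbf_kernel(Y, Y, gamma).mean())
```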
- How to Exhibit More Predictable Behaviors [3.5248694676821484]
This paper studies predictability problems in which an agent must choose its strategy so as to optimize the predictions an external observer can make.
We account for uncertainty in the environment dynamics and in the observed agent's policy.
We propose action- and state-predictability performance criteria through reward functions built on the observer's belief about the agent's policy.
arXiv Detail & Related papers (2024-04-17T12:06:17Z)
- PAC: Assisted Value Factorisation with Counterfactual Predictions in Multi-Agent Reinforcement Learning [43.862956745961654]
Multi-agent reinforcement learning (MARL) has witnessed significant progress with the development of value function factorization methods.
In this paper, we show that in partially observable MARL problems, an agent's ordering over its own actions could impose concurrent constraints.
We propose PAC, a new framework leveraging information generated from Counterfactual Predictions of optimal joint action selection.
arXiv Detail & Related papers (2022-06-22T23:34:30Z)
- Proximal Reinforcement Learning: Efficient Off-Policy Evaluation in Partially Observed Markov Decision Processes [65.91730154730905]
In applications of offline reinforcement learning to observational data, such as in healthcare or education, a general concern is that observed actions might be affected by unobserved factors.
Here we tackle this by considering off-policy evaluation in a partially observed Markov decision process (POMDP).
We extend the framework of proximal causal inference to our POMDP setting, providing a variety of settings where identification is made possible.
arXiv Detail & Related papers (2021-10-28T17:46:14Z)
- Coalitional Bayesian Autoencoders -- Towards explainable unsupervised deep learning [78.60415450507706]
We show that explanations of a Bayesian autoencoder's (BAE's) predictions suffer from high correlation, resulting in misleading explanations.
To alleviate this, a "Coalitional BAE" is proposed, which is inspired by agent-based system theory.
Our experiments on publicly available condition monitoring datasets demonstrate the improved quality of explanations using the Coalitional BAE.
arXiv Detail & Related papers (2021-10-19T15:07:09Z)
- Nested Counterfactual Identification from Arbitrary Surrogate Experiments [95.48089725859298]
We study the identification of nested counterfactuals from an arbitrary combination of observations and experiments.
Specifically, we prove the counterfactual unnesting theorem (CUT), which allows one to map arbitrary nested counterfactuals to unnested ones.
arXiv Detail & Related papers (2021-07-07T12:51:04Z)
- Counterfactual Maximum Likelihood Estimation for Training Deep Networks [83.44219640437657]
Deep learning models are prone to learning spurious correlations that should not be learned as predictive clues.
We propose a causality-based training framework to reduce the spurious correlations caused by observable confounders.
We conduct experiments on two real-world tasks: Natural Language Inference (NLI) and Image Captioning.
arXiv Detail & Related papers (2021-06-07T17:47:16Z)
- Maximizing Information Gain in Partially Observable Environments via Prediction Reward [64.24528565312463]
This paper tackles the challenge of using belief-based rewards for a deep RL agent.
We derive the exact error between negative entropy and the expected prediction reward.
This insight provides theoretical motivation for several fields using prediction rewards.
arXiv Detail & Related papers (2020-05-11T08:13:49Z)
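For the entry above, a minimal sketch of the negative-entropy form of a belief-based reward (a standard choice this line of work relates to prediction rewards; assumed here for illustration, not the paper's exact construction):

```python
import numpy as np

def negative_entropy_reward(belief, eps=1e-12):
    """Belief-based reward: negative Shannon entropy of the belief state.
    Peaked (informative) beliefs score higher than diffuse ones."""
    b = np.clip(belief, eps, 1.0)
    return float(np.sum(b * np.log(b)))

# negative_entropy_reward([0.5, 0.5])   ~ -0.693  (uninformative belief)
# negative_entropy_reward([0.99, 0.01]) ~ -0.056  (confident belief)
```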
- Estimating Treatment Effects with Observed Confounders and Mediators [25.338901482522648]
Given a causal graph, the do-calculus can express treatment effects as functionals of the observational joint distribution that can be estimated empirically.
Sometimes the do-calculus identifies multiple valid formulae, prompting us to compare the statistical properties of the corresponding estimators.
In this paper, we investigate the over-identified scenario where both confounders and mediators are observed, rendering both estimators valid.
arXiv Detail & Related papers (2020-03-26T15:50:25Z)
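For context on the entry above, the two standard do-calculus functionals that arise when a confounder z or a mediator m is observed (textbook identities, not results specific to this paper) are the backdoor and frontdoor adjustments:

```latex
% Backdoor adjustment: adjust for an observed confounder z.
P(y \mid \mathrm{do}(x)) = \sum_{z} P(y \mid x, z)\, P(z)

% Frontdoor adjustment: route through an observed mediator m.
P(y \mid \mathrm{do}(x)) = \sum_{m} P(m \mid x) \sum_{x'} P(y \mid m, x')\, P(x')
```

The over-identified scenario the summary describes is the case where both formulae are valid, so the statistical properties of the corresponding estimators can be compared.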