Off-Policy Evaluation for Sequential Persuasion Process with Unobserved Confounding
- URL: http://arxiv.org/abs/2504.01211v1
- Date: Tue, 01 Apr 2025 21:50:32 GMT
- Title: Off-Policy Evaluation for Sequential Persuasion Process with Unobserved Confounding
- Authors: Nishanth Venkatesh S., Heeseung Bang, Andreas A. Malikopoulos
- Abstract summary: Real-world scenarios often involve hidden variables that impact the receiver's belief formation and decision-making. We conceptualize this as a sequential decision-making problem, where the sender and receiver interact over multiple rounds. By reformulating this scenario as a Partially Observable Markov Decision Process (POMDP), we capture the sender's incomplete information regarding both the dynamics of the receiver's beliefs and the unobserved confounder.
- Score: 2.7282382992043885
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we expand the Bayesian persuasion framework to account for unobserved confounding variables in sender-receiver interactions. While traditional models assume that belief updates follow Bayesian principles, real-world scenarios often involve hidden variables that impact the receiver's belief formation and decision-making. We conceptualize this as a sequential decision-making problem, where the sender and receiver interact over multiple rounds. In each round, the sender communicates with the receiver, who also interacts with the environment. Crucially, the receiver's belief update is affected by an unobserved confounding variable. By reformulating this scenario as a Partially Observable Markov Decision Process (POMDP), we capture the sender's incomplete information regarding both the dynamics of the receiver's beliefs and the unobserved confounder. We prove that finding an optimal observation-based policy in this POMDP is equivalent to solving for an optimal signaling strategy in the original persuasion framework. Furthermore, we demonstrate how this reformulation facilitates the application of proximal learning for off-policy evaluation in the persuasion process. This advancement enables the sender to evaluate alternative signaling strategies using only observational data from a behavioral policy, thus eliminating the necessity for costly new experiments.
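To make the setup concrete, below is a minimal sketch of the round-by-round interaction the abstract describes. The binary state space, the uniform behavioral signaling policy, and the tempered-likelihood model of how the confounder bends the receiver's update are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

# Minimal sketch of one sender-receiver episode, assuming a binary state,
# a uniform behavioral signaling policy, and a "tempered likelihood" model
# of how the unobserved confounder distorts the receiver's Bayesian update.
# These modeling choices are illustrative, not the paper's formulation.

rng = np.random.default_rng(0)

N_STATES, N_SIGNALS, T = 2, 2, 5

# Receiver's likelihood model P(signal | state).
likelihood = np.array([[0.8, 0.2],
                       [0.3, 0.7]])

def receiver_update(belief, signal, confounder):
    """Bayes-like update whose likelihood is tempered by the confounder."""
    distorted = likelihood[:, signal] ** confounder
    posterior = belief * distorted
    return posterior / posterior.sum()

belief = np.full(N_STATES, 1.0 / N_STATES)  # receiver starts uninformed
for t in range(T):
    confounder = rng.uniform(0.5, 1.5)      # hidden from the sender
    signal = rng.integers(N_SIGNALS)        # logged behavioral policy
    belief = receiver_update(belief, signal, confounder)
    action = int(np.argmax(belief))         # receiver acts on its belief
    print(f"round {t}: signal={signal} belief={belief.round(3)} action={action}")
```

Viewed from the sender's side, the pair (receiver belief, confounder) is the hidden state of the POMDP reformulation, and the signaling rule is the observation-based policy whose value off-policy evaluation must estimate from logged rounds like these.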
Related papers
- iEBAKER: Improved Remote Sensing Image-Text Retrieval Framework via Eliminate Before Align and Keyword Explicit Reasoning [80.44805667907612]
iEBAKER is an innovative strategy to filter weakly correlated sample pairs.
We introduce an alternative Sort After Reversed Retrieval (SAR) strategy.
We incorporate a Keyword Explicit Reasoning (KER) module to facilitate the beneficial impact of subtle key concept distinctions.
arXiv Detail & Related papers (2025-04-08T03:40:19Z)
- Causal Influence in Federated Edge Inference [34.487472866247586]
In this paper, we consider a setting where heterogeneous agents connected over a network perform inference using unlabeled streaming data.
To overcome the uncertainty, agents cooperate by exchanging their local inferences with, and through, a fusion center.
Various scenarios reflecting different agent participation patterns and fusion center policies are investigated.
arXiv Detail & Related papers (2024-05-02T13:06:50Z)
- Randomized Confidence Bounds for Stochastic Partial Monitoring [8.649322557020666]
The partial monitoring (PM) framework provides a theoretical formulation of sequential learning problems with incomplete feedback.
In contextual PM, the outcomes depend on some side information that is observable by the agent before selecting the action on each round.
We introduce a new class of PM strategies based on the randomization of deterministic confidence bounds.
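One plausible reading of "randomization of deterministic confidence bounds" is to rescale the usual exploration width by a fresh random draw each round. The sketch below applies that idea to an ordinary stochastic bandit with invented arm means, a simpler feedback model than general partial monitoring, purely for illustration.

```python
import numpy as np

# Sketch: randomize an otherwise deterministic UCB-style confidence bound.

rng = np.random.default_rng(1)
means = np.array([0.3, 0.5, 0.6])            # true (unknown) arm means
counts = np.ones(len(means))                 # one forced pull per arm
sums = rng.binomial(1, means).astype(float)

for t in range(1, 2001):
    width = np.sqrt(2 * np.log(t + 1) / counts)  # deterministic width
    z = np.abs(rng.normal(size=len(means)))      # random rescaling
    index = sums / counts + z * width            # randomized confidence bound
    a = int(np.argmax(index))
    sums[a] += rng.binomial(1, means[a])
    counts[a] += 1

print("pull counts:", counts.astype(int))        # the 0.6 arm should dominate
```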
arXiv Detail & Related papers (2024-02-07T16:18:59Z)
- Markov Persuasion Processes: Learning to Persuade from Scratch [37.92189925462977]
In Bayesian persuasion, an informed sender strategically discloses information to a receiver so as to persuade them to undertake desirable actions.
We design a learning algorithm for the sender that works with partial feedback.
We prove that its regret with respect to an optimal information-disclosure policy grows sublinearly in the number of episodes.
arXiv Detail & Related papers (2024-02-05T15:09:41Z)
- Explaining by Imitating: Understanding Decisions by Interpretable Policy Learning [72.80902932543474]
Understanding human behavior from observed data is critical for transparency and accountability in decision-making.
Consider real-world settings such as healthcare, in which modeling a decision-maker's policy is challenging.
We propose a data-driven representation of decision-making behavior that is transparent by design, accommodates partial observability, and operates completely offline.
arXiv Detail & Related papers (2023-10-28T13:06:14Z)
- Online Decision Mediation [72.80902932543474]
Consider learning a decision support assistant to serve as an intermediary between (oracle) expert behavior and (imperfect) human behavior.
In clinical diagnosis, fully autonomous machine behavior is often beyond ethical affordances.
arXiv Detail & Related papers (2023-10-28T05:59:43Z)
- Debiasing Recommendation by Learning Identifiable Latent Confounders [49.16119112336605]
Confounding bias arises due to the presence of unmeasured variables that can affect both a user's exposure and feedback.
Existing methods either (1) make untenable assumptions about these unmeasured variables or (2) directly infer latent confounders from users' exposure.
We propose a novel method, the identifiable deconfounder (iDCF), which leverages a set of proxy variables to resolve the aforementioned non-identification issue.
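As a toy illustration of the bias iDCF targets, the sketch below lets one hidden user trait drive both exposure and feedback: the naive exposed-item average overstates appeal, while adjusting for the trait recovers the truth. The adjustment here uses oracle access to the confounder purely for illustration; iDCF's point is to recover it from proxy variables instead.

```python
import numpy as np

# Toy confounded recommendation data: a hidden user trait u drives both
# whether an item is shown (exposure) and whether it is liked (feedback).

rng = np.random.default_rng(2)
n = 100_000
u = rng.binomial(1, 0.5, n)                                # hidden trait
exposed = rng.binomial(1, np.where(u == 1, 0.8, 0.2))      # u drives exposure
feedback = rng.binomial(1, np.where(u == 1, 0.7, 0.3))     # u drives feedback

naive = feedback[exposed == 1].mean()                      # confounded (~0.62)
pu = np.array([(u == 0).mean(), (u == 1).mean()])
adjusted = sum(pu[v] * feedback[(exposed == 1) & (u == v)].mean()
               for v in (0, 1))                            # deconfounded (~0.50)
print(f"naive: {naive:.3f}  adjusted: {adjusted:.3f}")
```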
arXiv Detail & Related papers (2023-02-10T05:10:26Z)
- Policy Evaluation in Decentralized POMDPs with Belief Sharing [39.550233049869036]
We consider a cooperative policy evaluation task in which agents are not assumed to observe the environment state directly.
We propose a fully decentralized belief forming strategy that relies on individual updates and on localized interactions over a communication network.
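One common way to realize such decentralized belief forming is to alternate a local Bayesian update on each agent's private observation with log-linear (geometric) pooling over the communication network; the sketch below follows that recipe on a ring of four agents. The network, likelihoods, and pooling rule are illustrative assumptions, not the cited paper's exact strategy.

```python
import numpy as np

# Sketch of decentralized belief sharing on a ring network.

rng = np.random.default_rng(3)
N_AGENTS, N_STATES, TRUE_STATE = 4, 2, 1

# Doubly stochastic combination matrix: each agent mixes with its neighbors.
A = np.zeros((N_AGENTS, N_AGENTS))
for i in range(N_AGENTS):
    A[i, [i, (i - 1) % N_AGENTS, (i + 1) % N_AGENTS]] = 1 / 3

lik = np.array([[0.6, 0.4],          # P(obs | state), weakly informative
                [0.4, 0.6]])
beliefs = np.full((N_AGENTS, N_STATES), 0.5)

for _ in range(50):
    obs = rng.binomial(1, lik[TRUE_STATE, 1], N_AGENTS)   # private signals
    local = beliefs * lik[:, obs].T                       # local Bayes step
    local /= local.sum(axis=1, keepdims=True)
    beliefs = np.exp(A @ np.log(local))                   # log-linear pooling
    beliefs /= beliefs.sum(axis=1, keepdims=True)

print(beliefs.round(3))  # every agent's mass should sit on the true state
```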
arXiv Detail & Related papers (2023-02-08T15:54:15Z)
- Inverse Online Learning: Understanding Non-Stationary and Reactionary Policies [79.60322329952453]
We show how to develop interpretable representations of how agents make decisions.
By understanding the decision-making processes underlying a set of observed trajectories, we cast the policy inference problem as the inverse of an online learning problem.
We introduce a practical algorithm for retrospectively estimating such perceived effects, alongside the process through which agents update them.
Through application to the analysis of UNOS organ donation acceptance decisions, we demonstrate that our approach can bring valuable insights into the factors that govern decision processes and how they change over time.
arXiv Detail & Related papers (2022-03-14T17:40:42Z)
- Proximal Reinforcement Learning: Efficient Off-Policy Evaluation in Partially Observed Markov Decision Processes [65.91730154730905]
In applications of offline reinforcement learning to observational data, such as in healthcare or education, a general concern is that observed actions might be affected by unobserved factors.
Here we tackle this by considering off-policy evaluation in a partially observed Markov decision process (POMDP).
We extend the framework of proximal causal inference to our POMDP setting, providing a variety of settings where identification is made possible.
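In the fully discrete case, the proximal identification strategy reduces to a few lines of linear algebra: with two proxies Z and W of the hidden confounder U, an outcome "bridge" over W is solved from quantities estimable in logged data, and averaging it against P(W) recovers the interventional value. The tables below are invented for illustration; the cited paper develops the general POMDP version.

```python
import numpy as np

# Discrete toy of proximal identification: U is hidden, Z and W are two
# proxies of U, and the logged action A depends on U (confounding).

pU   = np.array([0.4, 0.6])              # P(U)
pW_U = np.array([[0.9, 0.1],             # P(W | U), rows indexed by U
                 [0.2, 0.8]])
pZ_U = np.array([[0.7, 0.3],             # P(Z | U)
                 [0.1, 0.9]])
pi_b = np.array([[0.8, 0.2],             # behavioral policy P(A | U)
                 [0.3, 0.7]])
EY_UA = np.array([[1.0, 0.0],            # E[Y | U, A], rows U, cols A
                  [0.2, 0.9]])

a = 0                                     # target action to evaluate

# In real use the next three quantities are estimated from logged
# (Z, A, W, Y) tuples; here we compute them from the ground truth.
joint_uz = pU[:, None] * pZ_U * pi_b[:, a][:, None]  # P(U, Z, A=a)
pU_given_z = joint_uz / joint_uz.sum(axis=0)         # P(U | Z, A=a)
M = pU_given_z.T @ pW_U                              # P(W | Z, A=a)
q = pU_given_z.T @ EY_UA[:, a]                       # E[Y | Z, A=a]

h = np.linalg.solve(M, q)                # outcome bridge over the proxy W
pW = pU @ pW_U                           # marginal P(W)

print("proximal estimate:", pW @ h)                        # 0.52
print("ground truth E[Y(a)]:", pU @ EY_UA[:, a])           # 0.52
print("naive confounded E[Y | A=a]:",
      (pU * pi_b[:, a]) @ EY_UA[:, a] / (pU @ pi_b[:, a])) # ~0.71
```

A sequential version of this bridge-function idea is what the headline paper applies to evaluate alternative signaling strategies from observational data alone.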
arXiv Detail & Related papers (2021-10-28T17:46:14Z)
- Active recursive Bayesian inference using Rényi information measures [11.1748531496641]
We propose an active Bayesian inference framework with unified inference and query selection steps.
We analytically demonstrate that the proposed approach outperforms conventional criteria such as mutual information.
We present empirical and experimental performance evaluations on two applications: restaurant recommendation and brain-computer interface (BCI) typing systems.
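One way to read "unified inference and query selection" is that the next query maximizes the expected Rényi divergence between the updated belief and the current one; letting the order approach 1 recovers the familiar mutual-information criterion. The sketch below implements that selection rule on an invented discrete problem; it is an illustration, not the paper's algorithm.

```python
import numpy as np

# Sketch of active query selection by expected Rényi divergence between
# the updated and current belief, on an invented hypothesis space.

def renyi(p, q, alpha=2.0):
    """Rényi divergence D_alpha(p || q) for discrete distributions."""
    return np.log(np.sum(p**alpha * q**(1.0 - alpha))) / (alpha - 1.0)

rng = np.random.default_rng(4)
N_HYP, N_QUERIES, TRUE_HYP = 4, 6, 2
lik = rng.uniform(0.1, 0.9, size=(N_HYP, N_QUERIES))  # P(ans=1 | hyp, query)
belief = np.full(N_HYP, 1.0 / N_HYP)

def posterior(belief, query, ans):
    l = lik[:, query] if ans else 1.0 - lik[:, query]
    b = belief * l
    return b / b.sum()

for step in range(10):
    scores = []
    for qi in range(N_QUERIES):
        p1 = belief @ lik[:, qi]                 # predictive P(answer=1)
        gain = (p1 * renyi(posterior(belief, qi, 1), belief)
                + (1 - p1) * renyi(posterior(belief, qi, 0), belief))
        scores.append(gain)
    qi = int(np.argmax(scores))                  # most informative query
    ans = rng.random() < lik[TRUE_HYP, qi]       # simulate a noisy answer
    belief = posterior(belief, qi, ans)

print("final belief:", belief.round(3))          # mass should favor hyp 2
```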
arXiv Detail & Related papers (2020-04-07T05:52:58Z)