Related papers: Scalable Policy Maximization Under Network Interference

Scalable Policy Maximization Under Network Interference

URL: http://arxiv.org/abs/2505.18118v1
Date: Fri, 23 May 2025 17:19:12 GMT
Title: Scalable Policy Maximization Under Network Interference
Authors: Aidan Gleich, Eric Laber, Alexander Volfovsky,
Abstract summary: We study optimal-policy learning under interference on a dynamic network.<n>Under common assumptions on the structure of interference, rewards become linear.<n>We develop a scalable Thompson sampling algorithm that maximizes policy impact when a new $n$-node network is observed each round.
Score: 46.16641537379657
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Many interventions, such as vaccines in clinical trials or coupons in online marketplaces, must be assigned sequentially without full knowledge of their effects. Multi-armed bandit algorithms have proven successful in such settings. However, standard independence assumptions fail when the treatment status of one individual impacts the outcomes of others, a phenomenon known as interference. We study optimal-policy learning under interference on a dynamic network. Existing approaches to this problem require repeated observations of the same fixed network and struggle to scale in sample size beyond as few as fifteen connected units -- both limit applications. We show that under common assumptions on the structure of interference, rewards become linear. This enables us to develop a scalable Thompson sampling algorithm that maximizes policy impact when a new $n$-node network is observed each round. We prove a Bayesian regret bound that is sublinear in $n$ and the number of rounds. Simulation experiments show that our algorithm learns quickly and outperforms existing methods. The results close a key scalability gap between causal inference methods for interference and practical bandit algorithms, enabling policy optimization in large-scale networked systems.

Related papers

Bayesian Estimation of Causal Effects Using Proxies of a Latent Interference Network [0.39462888523270856]
Network interference occurs when treatments assigned to some units affect the outcomes of others.<n>Traditional approaches often assume that the observed network correctly specifies the interference structure.<n>We introduce a framework for estimating causal effects when only proxy networks are available.
arXiv Detail & Related papers (2025-05-13T09:46:30Z)
Robust Representation Consistency Model via Contrastive Denoising [83.47584074390842]
randomized smoothing provides theoretical guarantees for certifying robustness against adversarial perturbations.<n> diffusion models have been successfully employed for randomized smoothing to purify noise-perturbed samples.<n>We reformulate the generative modeling task along the diffusion trajectories in pixel space as a discriminative task in the latent space.
arXiv Detail & Related papers (2025-01-22T18:52:06Z)
Multi-Armed Bandits with Network Interference [5.708360037632367]
We study a multi-armed bandit (MAB) problem where a learner sequentially assigns one of possible $mathcalA$ actions (discounts) to $N$ units (goods) over $T$ rounds to minimize regret. Unlike traditional MAB problems, the reward of each unit depends on the treatments assigned to other units, i.e., there is interference across the underlying network of units. We propose simple, linear regression-based algorithms to minimize regret.
arXiv Detail & Related papers (2024-05-28T22:01:50Z)
Understanding the performance gap between online and offline alignment algorithms [63.137832242488926]
We show that offline algorithms train policy to become good at pairwise classification, while online algorithms are good at generations. This hints at a unique interplay between discriminative and generative capabilities, which is greatly impacted by the sampling process. Our study sheds light on the pivotal role of on-policy sampling in AI alignment, and hints at certain fundamental challenges of offline alignment algorithms.
arXiv Detail & Related papers (2024-05-14T09:12:30Z)
Interacting Particle Systems on Networks: joint inference of the network and the interaction kernel [8.535430501710712]
We infer the weight matrix of the network and systems which determine the rules of the interactions between agents. We use two algorithms: one is on a new algorithm named operator regression with alternating least squares of data. Both algorithms are scalable conditions guaranteeing identifiability and well-posedness.
arXiv Detail & Related papers (2024-02-13T12:29:38Z)
Optimization Guarantees of Unfolded ISTA and ADMM Networks With Smooth Soft-Thresholding [57.71603937699949]
We study optimization guarantees, i.e., achieving near-zero training loss with the increase in the number of learning epochs. We show that the threshold on the number of training samples increases with the increase in the network width.
arXiv Detail & Related papers (2023-09-12T13:03:47Z)
A Fast Algorithm for Moderating Critical Nodes via Edge Removal [19.130541561303293]
We study the problem of removing $k$ edges from a network to minimize the information centrality of a target node. We propose three approximation greedy algorithms using novel techniques such as random walk-based Schur complement approximation and fast sum estimation. To complement our theoretical analysis, we conduct a comprehensive set of experiments on synthetic and real networks with over one million nodes.
arXiv Detail & Related papers (2023-09-09T13:54:34Z)
Exploring Adversarial Attacks on Neural Networks: An Explainable Approach [18.063187159491182]
We analyze the response characteristics of the VGG-16 model when the input images are mixed with adversarial noise and statistically similar Gaussian random noise. Our work could provide valuable insights into developing more reliable Deep Neural Network (DNN) models.
arXiv Detail & Related papers (2023-03-08T07:59:44Z)
Large-Scale Sequential Learning for Recommender and Engineering Systems [91.3755431537592]
In this thesis, we focus on the design of an automatic algorithms that provide personalized ranking by adapting to the current conditions. For the former, we propose novel algorithm called SAROS that take into account both kinds of feedback for learning over the sequence of interactions. The proposed idea of taking into account the neighbour lines shows statistically significant results in comparison with the initial approach for faults detection in power grid.
arXiv Detail & Related papers (2022-05-13T21:09:41Z)
Online Adversarial Attacks [57.448101834579624]
We formalize the online adversarial attack problem, emphasizing two key elements found in real-world use-cases. We first rigorously analyze a deterministic variant of the online threat model. We then propose algoname, a simple yet practical algorithm yielding a provably better competitive ratio for $k=2$ over the current best single threshold algorithm.
arXiv Detail & Related papers (2021-03-02T20:36:04Z)

This list is automatically generated from the titles and abstracts of the papers in this site.