Multi-Agent Online Optimization with Delays: Asynchronicity, Adaptivity,
and Optimism
- URL: http://arxiv.org/abs/2012.11579v1
- Date: Mon, 21 Dec 2020 18:55:55 GMT
- Title: Multi-Agent Online Optimization with Delays: Asynchronicity, Adaptivity,
and Optimism
- Authors: Yu-Guan Hsieh, Franck Iutzeler, Jérôme Malick, Panayotis Mertikopoulos
- Abstract summary: We study multi-agent online learning problems in the presence of delays and asynchronicities.
We derive adaptive learning strategies with optimal regret bounds, at both the agent and network levels.
- Score: 33.116006446428756
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Online learning has been successfully applied to many problems in which data
are revealed over time. In this paper, we provide a general framework for
studying multi-agent online learning problems in the presence of delays and
asynchronicities. Specifically, we propose and analyze a class of adaptive dual
averaging schemes in which agents only need to accumulate gradient feedback
received from the whole system, without requiring any between-agent
coordination. In the single-agent case, the adaptivity of the proposed method
allows us to extend a range of existing results to problems with potentially
unbounded delays between playing an action and receiving the corresponding
feedback. In the multi-agent case, the situation is significantly more
complicated because agents may not have access to a global clock to use as a
reference point; to overcome this, we focus on the information that is
available for producing each prediction rather than the actual delay associated
with each feedback. This allows us to derive adaptive learning strategies with
optimal regret bounds, at both the agent and network levels. Finally, we also
analyze an "optimistic" variant of the proposed algorithm which is capable of
exploiting the predictability of problems with a slower variation and leads to
improved regret bounds.
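The mechanism described in the abstract, where a learner aggregates whatever gradient feedback has reached it so far and scales its dual-averaging step by an AdaGrad-style quantity computed from the received gradients only, can be illustrated compactly. The snippet below is a minimal single-agent sketch under simplifying assumptions (Euclidean losses over a ball of a given radius, and a hypothetical `grad_oracle`/`delays` interface); it is not the authors' exact method and omits the multi-agent and optimistic variants analyzed in the paper.

```python
import numpy as np

def project_ball(x, radius):
    """Euclidean projection onto the ball of the given radius."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else (radius / norm) * x

def delayed_adaptive_dual_averaging(grad_oracle, delays, T, dim, radius=1.0):
    """Single-agent dual averaging with delayed gradient feedback (sketch).

    grad_oracle(t, x): returns a (sub)gradient of the round-t loss at x.
    delays[t]:         number of rounds after which round-t feedback arrives.
    """
    G = np.zeros(dim)   # running sum of all gradients received so far
    S = 0.0             # running sum of squared norms of received gradients
    inbox = {}          # arrival round -> list of pending gradients
    x = np.zeros(dim)
    actions = []

    for t in range(T):
        # Play the current dual-averaging iterate and schedule its feedback.
        actions.append(x.copy())
        g = grad_oracle(t, x)
        inbox.setdefault(t + int(delays[t]), []).append(g)

        # Absorb whatever feedback happens to arrive at the end of round t.
        for g_recv in inbox.pop(t, []):
            G += g_recv
            S += float(np.dot(g_recv, g_recv))

        # Adaptive step size driven only by *received* feedback, followed by
        # the usual "lazy" projection of the aggregated gradient.
        eta = radius / np.sqrt(1.0 + S)
        x = project_ball(-eta * G, radius)

    return actions
```

A toy usage, again purely illustrative: quadratic losses f_t(x) = ||x - theta_t||^2 with random (possibly large) delays.

```python
rng = np.random.default_rng(0)
theta = rng.normal(size=(100, 5))
oracle = lambda t, x: 2.0 * (x - theta[t])
delays = rng.integers(0, 10, size=100)
xs = delayed_adaptive_dual_averaging(oracle, delays, T=100, dim=5, radius=2.0)
```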
Related papers
- Active Learning for Fair and Stable Online Allocations [6.23798328186465]
We consider feedback from a select subset of agents at each epoch of the online resource allocation process.
Our algorithms provide regret bounds that are sub-linear in number of time-periods for various measures.
We show that efficient decision-making does not require extensive feedback and produces efficient outcomes for a variety of problem classes.
arXiv Detail & Related papers (2024-06-20T23:23:23Z)
- Online Decision Mediation [72.80902932543474]
Consider learning a decision support assistant to serve as an intermediary between (oracle) expert behavior and (imperfect) human behavior.
In clinical diagnosis, fully-autonomous machine behavior is often beyond ethical affordances.
arXiv Detail & Related papers (2023-10-28T05:59:43Z)
- Federated Multi-Level Optimization over Decentralized Networks [55.776919718214224]
We study the problem of distributed multi-level optimization over a network, where agents can only communicate with their immediate neighbors.
We propose a novel gossip-based distributed multi-level optimization algorithm that enables networked agents to solve optimization problems at different levels in a single timescale.
Our algorithm achieves optimal sample complexity, scaling linearly with the network size, and demonstrates state-of-the-art performance on various applications.
arXiv Detail & Related papers (2023-10-10T00:21:10Z)
- Communication-Efficient Zeroth-Order Distributed Online Optimization:
Algorithm, Theory, and Applications [9.045332526072828]
This paper focuses on a multi-agent zeroth-order online optimization problem in a federated learning setting for target tracking.
The errors of the proposed solution are further analyzed in two relevant applications.
arXiv Detail & Related papers (2023-06-09T03:51:45Z)
- On the Complexity of Multi-Agent Decision Making: From Learning in Games
to Partial Monitoring [105.13668993076801]
A central problem in the theory of multi-agent reinforcement learning (MARL) is to understand what structural conditions and algorithmic principles lead to sample-efficient learning guarantees.
We study this question in a general framework for interactive decision making with multiple agents.
We show that characterizing the statistical complexity for multi-agent decision making is equivalent to characterizing the statistical complexity of single-agent decision making.
arXiv Detail & Related papers (2023-05-01T06:46:22Z)
- Multi-surrogate Assisted Efficient Global Optimization for Discrete
Problems [0.9127162004615265]
This paper investigates the possible benefit of a concurrent utilization of multiple simulation-based surrogate models to solve discrete problems.
Our findings indicate that SAMA-DiEGO can rapidly converge to better solutions on a majority of the test problems.
arXiv Detail & Related papers (2022-12-13T09:10:08Z)
- Sequential Bayesian Optimization for Adaptive Informative Path Planning
with Multimodal Sensing [34.86734745942814]
We consider the problem of an agent equipped with multiple sensors, each with different sensing accuracy and energy costs.
The agent's goal is to explore the environment and gather information subject to its resource constraints in unknown, partially observable environments.
We formulate the AIPPMS problem as a belief Markov decision process with Gaussian process beliefs and solve it using a sequential Bayesian optimization approach with online planning.
arXiv Detail & Related papers (2022-09-16T00:50:36Z)
- Online Nonsubmodular Minimization with Delayed Costs: From Full
Information to Bandit Feedback [98.7678704343537]
We focus on a class of nonsubmodular functions with special structure, and prove regret guarantees for several variants of the online and approximate online bandit gradient descent algorithms.
We derive bounds for the agent's regret in the full information and bandit feedback setting, even if the delay between choosing a decision and receiving the incurred cost is unbounded.
arXiv Detail & Related papers (2022-05-15T08:27:12Z)
- Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge
Computing: A Contextual-Bandit Approach [81.5261621619557]
We propose an adaptive anomaly detection scheme with hierarchical edge computing (HEC).
We first construct multiple anomaly detection DNN models with increasing complexity, and associate each of them to a corresponding HEC layer.
Then, we design an adaptive model selection scheme that is formulated as a contextual-bandit problem and solved by using a reinforcement learning policy network.
arXiv Detail & Related papers (2021-08-09T08:45:47Z)
- Delay-Aware Multi-Agent Reinforcement Learning for Cooperative and
Competitive Environments [23.301322095357808]
Action and observation delays exist prevalently in the real-world cyber-physical systems.
This paper proposes a novel framework to deal with delays as well as the non-stationary training issue of multi-agent tasks.
Experiments are conducted in multi-agent particle environments including cooperative communication, cooperative navigation, and competitive experiments.
arXiv Detail & Related papers (2020-05-11T21:21:50Z)
- Dynamic Federated Learning [57.14673504239551]
Federated learning has emerged as an umbrella term for centralized coordination strategies in multi-agent environments.
We consider a federated learning model where at every iteration, a random subset of available agents perform local updates based on their data.
Under a non-stationary random walk model on the true minimizer for the aggregate optimization problem, we establish that the performance of the architecture is determined by three factors, namely, the data variability at each agent, the model variability across all agents, and a tracking term that is inversely proportional to the learning rate of the algorithm.
arXiv Detail & Related papers (2020-02-20T15:00:54Z)
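The last entry above describes a setting where, at every iteration, a random subset of available agents performs local updates that a server then aggregates. The snippet below is a generic partial-participation federated-averaging round illustrating that setting; the names (`federated_round`, `participation_prob`) and the least-squares local losses are assumptions made for the sketch, not the paper's algorithm or analysis.

```python
import numpy as np

def federated_round(w, agent_data, participation_prob, local_steps, lr, rng):
    """One round of federated averaging with random agent participation (sketch).

    Each agent is sampled independently with probability `participation_prob`;
    sampled agents run a few local gradient steps on a least-squares loss over
    their own (A_k, b_k) data, and the server averages the returned models.
    """
    updates = []
    for A_k, b_k in agent_data:
        if rng.random() > participation_prob:
            continue                                    # agent unavailable this round
        w_k = w.copy()
        for _ in range(local_steps):
            grad = A_k.T @ (A_k @ w_k - b_k) / len(b_k)  # local least-squares gradient
            w_k -= lr * grad
        updates.append(w_k)
    # If no agent participated, the global model simply stays put this round.
    return w if not updates else np.mean(updates, axis=0)

# Illustrative usage with synthetic per-agent data.
rng = np.random.default_rng(1)
agents = [(rng.normal(size=(20, 3)), rng.normal(size=20)) for _ in range(10)]
w = np.zeros(3)
for _ in range(200):
    w = federated_round(w, agents, participation_prob=0.3, local_steps=5, lr=0.1, rng=rng)
```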
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information above (including the summaries) and is not responsible for any consequences of its use.