Near-Optimal Online Learning for Multi-Agent Submodular Coordination: Tight Approximation and Communication Efficiency
- URL: http://arxiv.org/abs/2502.05028v1
- Date: Fri, 07 Feb 2025 15:57:56 GMT
- Title: Near-Optimal Online Learning for Multi-Agent Submodular Coordination: Tight Approximation and Communication Efficiency
- Authors: Qixin Zhang, Zongqi Wan, Yu Yang, Li Shen, Dacheng Tao
- Abstract summary: We present a $\textbf{MA-OSMA}$ algorithm to transform the discrete submodular maximization problem into a continuous optimization problem.
We also introduce a projection-free $\textbf{MA-OSEA}$ algorithm, which effectively utilizes the KL divergence by mixing in a uniform distribution.
Our algorithms significantly improve upon the $(\frac{1}{1+c})$-approximation provided by the state-of-the-art OSG algorithm.
- Score: 52.60557300927007
- Abstract: Coordinating multiple agents to collaboratively maximize submodular functions in unpredictable environments is a critical task with numerous applications in machine learning, robot planning and control. The existing approaches, such as the OSG algorithm, are often hindered by their poor approximation guarantees and the rigid requirement for a fully connected communication graph. To address these challenges, we first present a $\textbf{MA-OSMA}$ algorithm, which employs the multi-linear extension to transform the discrete submodular maximization problem into a continuous optimization problem, thereby allowing us to reduce the strict dependence on a complete graph through consensus techniques. Moreover, $\textbf{MA-OSMA}$ leverages a novel surrogate gradient to avoid sub-optimal stationary points. To eliminate the computationally intensive projection operations in $\textbf{MA-OSMA}$, we also introduce a projection-free $\textbf{MA-OSEA}$ algorithm, which effectively utilizes the KL divergence by mixing in a uniform distribution. Theoretically, we confirm that both algorithms achieve a regret bound of $\widetilde{O}(\sqrt{\frac{C_{T}T}{1-\beta}})$ against a $(\frac{1-e^{-c}}{c})$-approximation to the best comparator in hindsight, where $C_{T}$ is the deviation of the maximizer sequence, $\beta$ is the spectral gap of the network, and $c$ is the joint curvature of the submodular objectives. This result significantly improves upon the $(\frac{1}{1+c})$-approximation provided by the state-of-the-art OSG algorithm. Finally, we demonstrate the effectiveness of our proposed algorithms through simulation-based multi-target tracking.
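To make the continuous relaxation concrete, here is a minimal sketch of the multilinear extension $F(x) = \mathbb{E}_{S \sim x}[f(S)]$ and its coordinate-wise gradient, estimated by Monte Carlo sampling. The coverage objective, sample counts, and function names are illustrative assumptions; this is not the MA-OSMA update itself, which additionally uses consensus steps and a surrogate gradient.

```python
import random

rng = random.Random(0)

def multilinear_extension(f, x, num_samples=2000):
    """Monte Carlo estimate of F(x) = E_{S ~ x}[f(S)], where each
    element i enters the random set S independently with prob. x[i]."""
    total = 0.0
    for _ in range(num_samples):
        S = [i for i, p in enumerate(x) if rng.random() < p]
        total += f(S)
    return total / num_samples

def partial_derivative(f, x, i, num_samples=2000):
    """Since F is linear in each coordinate, dF/dx_i equals
    F(x with x_i = 1) - F(x with x_i = 0)."""
    hi = x[:i] + [1.0] + x[i + 1:]
    lo = x[:i] + [0.0] + x[i + 1:]
    return (multilinear_extension(f, hi, num_samples)
            - multilinear_extension(f, lo, num_samples))

# Hypothetical submodular objective: number of targets covered by S.
coverage = {0: {"a", "b"}, 1: {"b", "c"}, 2: {"c", "d"}}
def f(S):
    covered = set()
    for i in S:
        covered |= coverage[i]
    return len(covered)

x = [0.5, 0.5, 0.5]                   # a fractional point in [0,1]^3
print(multilinear_extension(f, x))    # expected coverage under x
print(partial_derivative(f, x, i=0))  # marginal value of raising x_0
```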
Related papers
- Optimistically Optimistic Exploration for Provably Efficient Infinite-Horizon Reinforcement and Imitation Learning [13.429541377715296]
We propose the first computationally efficient algorithm achieving near-optimal regret guarantees in infinite-horizon discounted linear Markov decision processes.
We show that, combined with a regularized approximate dynamic-programming scheme, the resulting algorithm achieves a regret of order $\tilde{\mathcal{O}}(\sqrt{d^3 (1-\gamma)^{-7/2} T})$, where $T$ is the total number of sample transitions, $\gamma \in (0,1)$ is the discount factor, and $d$ is the feature dimensionality.
arXiv Detail & Related papers (2025-02-19T17:32:35Z) - Distributed Difference of Convex Optimization [1.2661010067882734]
A class of distributed optimization problems involving $n$ agents is presented, where the local objective function at every agent $i$ is given by the difference of two convex functions $f_i$ and $g_i$.
The DDC-Consensus algorithm is developed to solve the nonconvex regularized distributed least-squares problem.
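For background, the classic way to exploit this difference-of-convex (DC) structure is the convex-concave procedure: linearize the subtracted convex part at the current iterate and solve the resulting convex surrogate. The one-dimensional instance below is a hypothetical illustration, not the DDC-Consensus algorithm.

```python
def dca(f_argmin_linear, g_subgrad, x0, iters=50):
    """Difference-of-convex algorithm for min_x f(x) - g(x):
    replace g by its linearization at x_k and solve the convex
    surrogate min_x f(x) - s * x, where s is a subgradient of g."""
    x = x0
    for _ in range(iters):
        s = g_subgrad(x)
        x = f_argmin_linear(s)
    return x

# Toy instance h(x) = x^2 - 2|x|, whose minima sit at x = +/-1.
f_argmin_linear = lambda s: s / 2.0            # argmin_x x^2 - s*x
g_subgrad = lambda x: 2.0 if x >= 0 else -2.0  # subgradient of 2|x|
print(dca(f_argmin_linear, g_subgrad, x0=0.3))  # -> 1.0
```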
arXiv Detail & Related papers (2024-07-23T14:41:32Z) - Theoretical Study of Conflict-Avoidant Multi-Objective Reinforcement Learning [21.288881065839007]
We develop a novel dynamic-weighting multi-task actor-critic algorithm (MTAC) with two options of sub-procedures, named CA and FC.
MTAC-CA aims to find a conflict-avoidant (CA) update direction that maximizes the minimum value improvement among tasks, while MTAC-FC targets a much faster convergence rate.
Our experiments on MT10 demonstrate the improved performance of our algorithms over existing MTRL methods with fixed preference.
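As a toy picture of a conflict-avoidant direction: with two task gradients, the direction maximizing the minimum first-order improvement is the min-norm point in their convex hull, which has a closed form. This is the generic two-task computation (as in MGDA), not MTAC-CA itself.

```python
def ca_direction(g1, g2):
    """Min-norm point in conv{g1, g2}; following it gives both tasks
    the same, maximal worst-case first-order improvement."""
    dot = lambda a, b: sum(ai * bi for ai, bi in zip(a, b))
    diff = [a - b for a, b in zip(g1, g2)]
    denom = dot(diff, diff)
    gamma = 0.5 if denom == 0.0 else min(1.0, max(0.0, dot(diff, g1) / denom))
    return [(1 - gamma) * a + gamma * b for a, b in zip(g1, g2)]

# Two conflicting task gradients; naive averaging would favor task 1.
g1, g2 = [2.0, 0.0], [0.0, 1.0]
d = ca_direction(g1, g2)
print(d)  # -> [0.4, 0.8]; note d.g1 == d.g2 == 0.8
```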
arXiv Detail & Related papers (2024-05-25T05:57:46Z) - Distributed Extra-gradient with Optimal Complexity and Communication
Guarantees [60.571030754252824]
We consider monotone variational inequality (VI) problems in multi-GPU settings where multiple processors/workers/clients have access to local dual vectors.
Extra-gradient, the de facto algorithm for monotone VI problems, was not designed to be communication-efficient.
We propose a quantized generalized extra-gradient (Q-GenX), which is an unbiased and adaptive compression method tailored to solve VIs.
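For reference, the uncompressed extra-gradient template looks as follows: take a look-ahead step with the operator at the current point, then update the current point using the operator at the look-ahead point. The bilinear example and step size are illustrative; the quantization that makes Q-GenX communication-efficient is omitted.

```python
def extragradient(F, z0, eta=0.2, iters=500):
    """Extra-gradient for a monotone operator F:
    z_half = z - eta * F(z), then z <- z - eta * F(z_half)."""
    z = z0
    for _ in range(iters):
        z_half = [zi - eta * gi for zi, gi in zip(z, F(z))]
        z = [zi - eta * gi for zi, gi in zip(z, F(z_half))]
    return z

# Bilinear saddle point min_x max_y x*y: the operator is F(x, y) = (y, -x)
# and the unique solution is (0, 0); plain gradient steps would spiral out.
F = lambda z: (z[1], -z[0])
print(extragradient(F, [1.0, 1.0]))  # -> approximately [0.0, 0.0]
```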
arXiv Detail & Related papers (2023-08-17T21:15:04Z) - Provably Efficient Reinforcement Learning via Surprise Bound [66.15308700413814]
We propose a provably efficient reinforcement learning algorithm (both computationally and statistically) with general value function approximations.
Our algorithm achieves reasonable regret bounds when applied to both the linear setting and the sparse high-dimensional linear setting.
arXiv Detail & Related papers (2023-02-22T20:21:25Z) - Differentially-Private Hierarchical Clustering with Provable
Approximation Guarantees [79.59010418610625]
We study differentially private approximation algorithms for hierarchical clustering.
We show strong lower bounds for the problem: any $\epsilon$-DP algorithm must exhibit $O(|V|^2/\epsilon)$-additive error for an input dataset.
We propose a private $1+o(1)$ approximation algorithm which also recovers the blocks exactly.
arXiv Detail & Related papers (2023-01-31T19:14:30Z) - Communication-Efficient Adam-Type Algorithms for Distributed Data Mining [93.50424502011626]
We propose a class of novel distributed Adam-type algorithms (i.e., SketchedAMSGrad) utilizing sketching.
Our new algorithm achieves a fast convergence rate of $O(\frac{1}{\sqrt{nT}} + \frac{1}{(k/d)^2 T})$ with a communication cost of $O(k \log(d))$ at each iteration.
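As a rough sense of how compressed communication works, the snippet below keeps only the $k$ largest-magnitude gradient coordinates, so each worker transmits $O(k \log d)$ bits of index-value pairs. This is a generic top-$k$ compressor for illustration, not the sketching used in SketchedAMSGrad.

```python
def topk_compress(grad, k):
    """Keep the k largest-magnitude coordinates; the (index, value)
    pairs are all a worker needs to transmit."""
    idx = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)[:k]
    return idx, [grad[i] for i in idx]

def topk_decompress(idx, vals, d):
    """Rebuild a length-d vector with zeros everywhere else."""
    g = [0.0] * d
    for i, v in zip(idx, vals):
        g[i] = v
    return g

grad = [0.02, -1.5, 0.3, 0.0, 2.1, -0.1]
idx, vals = topk_compress(grad, k=2)
print(topk_decompress(idx, vals, d=len(grad)))
# -> [0.0, -1.5, 0.0, 0.0, 2.1, 0.0]: only the two largest survive
```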
arXiv Detail & Related papers (2022-10-14T01:42:05Z) - Communication-Efficient Decentralized Online Continuous DR-Submodular
Maximization [11.889570525184801]
We present two decentralized online algorithms for the monotone continuous DR-submodular maximization problem.
The first one, One-shot Decentralized Meta-Frank-Wolfe (Mono-DMFW), achieves a $(1-1/e)$-regret bound of $O(T^{4/5})$.
Next, inspired by the non-oblivious boosting function (Zhang et al., 2022), we propose the Decentralized Online Boosting Gradient Ascent (DOBGA) algorithm.
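The Frank-Wolfe pattern behind such projection-free methods replaces projection with a linear maximization over the feasible set and moves a small step along the returned vertex. Below is a minimal centralized continuous-greedy sketch for a monotone DR-submodular objective under a hypothetical cardinality-style budget; the objective and constants are assumptions, not those of Mono-DMFW or DOBGA.

```python
def continuous_greedy(grad_F, n, budget, steps=100):
    """Frank-Wolfe / continuous greedy over
    P = {v in [0,1]^n : sum(v) <= budget}: the linear oracle puts mass
    on the `budget` coordinates with the largest gradient, and x moves
    along that vertex with step 1/steps (no projection needed)."""
    x = [0.0] * n
    for _ in range(steps):
        g = grad_F(x)
        top = sorted(range(n), key=lambda i: g[i], reverse=True)[:budget]
        v = [1.0 if i in top else 0.0 for i in range(n)]
        x = [xi + vi / steps for xi, vi in zip(x, v)]
    return x

# Separable DR-submodular toy objective F(x) = sum_i w_i * log(1 + x_i).
w = [3.0, 2.0, 1.0, 0.5]
grad_F = lambda x: [wi / (1.0 + xi) for wi, xi in zip(w, x)]
print(continuous_greedy(grad_F, n=4, budget=2))
# -> mass concentrates on the coordinates with the largest weights
```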
arXiv Detail & Related papers (2022-08-18T07:32:28Z) - Minimax Optimization with Smooth Algorithmic Adversaries [59.47122537182611]
We propose a new algorithm for the min-player against smooth algorithms deployed by an adversary.
Our algorithm is guaranteed to make monotonic progress (thus having no limit cycles) and to converge within an appropriate number of gradient ascent steps.
arXiv Detail & Related papers (2021-06-02T22:03:36Z) - Joint Optimization of Multi-Objective Reinforcement Learning with Policy
Gradient Based Algorithm [34.77250498401055]
We formulate the problem of maximizing a non-linear concave function of multiple long-term objectives.
A policy-gradient based model-free algorithm is proposed for the problem.
The proposed algorithm is shown to converge to within $\epsilon$ of the global optimum.
arXiv Detail & Related papers (2021-05-28T22:20:54Z)