Related papers: Distributed Primal-Dual Algorithms: Unification, Connections, and Insights

Distributed Primal-Dual Algorithms: Unification, Connections, and Insights

URL: http://arxiv.org/abs/2502.00470v1
Date: Sat, 01 Feb 2025 15:56:11 GMT
Title: Distributed Primal-Dual Algorithms: Unification, Connections, and Insights
Authors: Runxiong Wu, Dong Liu, Xueqin Wang, Andi Wang,
Abstract summary: We study primal-dual algorithms for general empirical risk minimization problems in distributed settings.<n>We show that both classes of algorithms can be transformed into a unified update form that involves only primal and dual variables.
Score: 5.87788855821817
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We study primal-dual algorithms for general empirical risk minimization problems in distributed settings, focusing on two prominent classes of algorithms. The first class is the communication-efficient distributed dual coordinate ascent (CoCoA), derived from the coordinate ascent method for solving the dual problem. The second class is the alternating direction method of multipliers (ADMM), including consensus ADMM, linearized ADMM, and proximal ADMM. We demonstrate that both classes of algorithms can be transformed into a unified update form that involves only primal and dual variables. This discovery reveals key connections between the two classes of algorithms: CoCoA can be interpreted as a special case of proximal ADMM for solving the dual problem, while consensus ADMM is closely related to a proximal ADMM algorithm. This discovery provides the insight that by adjusting the augmented Lagrangian parameter, we can easily enable the ADMM variants to outperform the CoCoA variants. We further explore linearized versions of ADMM and analyze the effects of tuning parameters on these ADMM variants in the distributed setting. Our theoretical findings are supported by extensive simulation studies and real-world data analysis.

Related papers

Provably Efficient Information-Directed Sampling Algorithms for Multi-Agent Reinforcement Learning [50.92957910121088]
This work designs and analyzes a novel set of algorithms for multi-agent reinforcement learning (MARL) based on the principle of information-directed sampling (IDS) For episodic two-player zero-sum MGs, we present three sample-efficient algorithms for learning Nash equilibrium. We extend Reg-MAIDS to multi-player general-sum MGs and prove that it can learn either the Nash equilibrium or coarse correlated equilibrium in a sample efficient manner.
arXiv Detail & Related papers (2024-04-30T06:48:56Z)
Lighter, Better, Faster Multi-Source Domain Adaptation with Gaussian Mixture Models and Optimal Transport [5.492296610282042]
We tackle Multi-Source Domain Adaptation (MSDA), a task in transfer learning where one adapts multiple heterogeneous, labeled source probability measures towards a different, unlabeled target measure. We propose a novel framework for MSDA, based on Optimal Transport (OT) and Gaussian Mixture Models (GMMs) We empirically evaluate our proposed methods on four benchmarks in image classification and fault diagnosis, showing that we improve over the prior art while being faster and involving fewer parameters.
arXiv Detail & Related papers (2024-04-16T03:31:28Z)
Meta-Learning Adversarial Bandit Algorithms [55.72892209124227]
We study online meta-learning with bandit feedback. We learn to tune online mirror descent generalization (OMD) with self-concordant barrier regularizers.
arXiv Detail & Related papers (2023-07-05T13:52:10Z)
Algorithm-Dependent Bounds for Representation Learning of Multi-Source Domain Adaptation [7.6249291891777915]
We use information-theoretic tools to derive a novel analysis of Multi-source Domain Adaptation (MDA) from the representation learning perspective. We propose a novel deep MDA algorithm, implicitly addressing the target shift through joint alignment. The proposed algorithm has comparable performance to the state-of-the-art on target-shifted MDA benchmark with improved memory efficiency.
arXiv Detail & Related papers (2023-04-04T18:32:20Z)
FIXED: Frustratingly Easy Domain Generalization with Mixup [53.782029033068675]
Domain generalization (DG) aims to learn a generalizable model from multiple training domains such that it can perform well on unseen target domains. A popular strategy is to augment training data to benefit generalization through methods such as Mixupcitezhang 2018mixup. We propose a simple yet effective enhancement for Mixup-based DG, namely domain-invariant Feature mIXup (FIX) Our approach significantly outperforms nine state-of-the-art related methods, beating the best performing baseline by 6.5% on average in terms of test accuracy.
arXiv Detail & Related papers (2022-11-07T09:38:34Z)
ADPS: Asymmetric Distillation Post-Segmentation for Image Anomaly Detection [75.68023968735523]
Knowledge Distillation-based Anomaly Detection (KDAD) methods rely on the teacher-student paradigm to detect and segment anomalous regions. We propose an innovative approach called Asymmetric Distillation Post-Segmentation (ADPS) Our ADPS employs an asymmetric distillation paradigm that takes distinct forms of the same image as the input of the teacher-student networks. We show that ADPS significantly improves Average Precision (AP) metric by 9% and 20% on the MVTec AD and KolektorSDD2 datasets.
arXiv Detail & Related papers (2022-10-19T12:04:47Z)
A Distributed Algorithm for Measure-valued Optimization with Additive Objective [1.0965065178451106]
We propose a distributed nonvalued algorithm for solving measure-parametric optimization problems with additive objectives. The proposed algorithm comprises a two-layer alternating direction multipliers (ADMM) The overall algorithm realizes operator splitting gradient for flows in the manifold of probability measures.
arXiv Detail & Related papers (2022-02-17T23:09:41Z)
A Convergent ADMM Framework for Efficient Neural Network Training [17.764095204676973]
Alternating Direction Method of Multipliers (ADMM) has achieved tremendous success in many classification and regression applications. We propose a novel framework to solve a general neural network training problem via ADMM (dlADMM) to address these challenges simultaneously. Experiments on seven benchmark datasets demonstrate the convergence, efficiency, and effectiveness of our proposed dlADMM algorithm.
arXiv Detail & Related papers (2021-12-22T01:55:24Z)
Adaptive Stochastic ADMM for Decentralized Reinforcement Learning in Edge Industrial IoT [106.83952081124195]
Reinforcement learning (RL) has been widely investigated and shown to be a promising solution for decision-making and optimal control processes. We propose an adaptive ADMM (asI-ADMM) algorithm and apply it to decentralized RL with edge-computing-empowered IIoT networks. Experiment results show that our proposed algorithms outperform the state of the art in terms of communication costs and scalability, and can well adapt to complex IoT environments.
arXiv Detail & Related papers (2021-06-30T16:49:07Z)
A Framework of Inertial Alternating Direction Method of Multipliers for Non-Convex Non-Smooth Optimization [17.553531291690025]
We propose an algorithmic framework dubbed alternating methods of multipliers (iADMM) for solving a class of non nonsmooth multiblock composite problems. Our framework employs the general-major surrogateization (MM) principle to update each block of variables to unify the convergence analysis of previous ADMM schemes.
arXiv Detail & Related papers (2021-02-10T13:55:28Z)
Differentially Private ADMM Algorithms for Machine Learning [38.648113004535155]
We study efficient differentially private alternating direction methods of multipliers (ADMM) via gradient perturbation. We propose the first differentially private ADMM (DP-ADMM) algorithm with performance guarantee of $(epsilon,delta)$-differential privacy.
arXiv Detail & Related papers (2020-10-31T01:37:24Z)
Coded Stochastic ADMM for Decentralized Consensus Optimization with Edge Computing [113.52575069030192]
Big data, including applications with high security requirements, are often collected and stored on multiple heterogeneous devices, such as mobile devices, drones and vehicles. Due to the limitations of communication costs and security requirements, it is of paramount importance to extract information in a decentralized manner instead of aggregating data to a fusion center. We consider the problem of learning model parameters in a multi-agent system with data locally processed via distributed edge nodes. A class of mini-batch alternating direction method of multipliers (ADMM) algorithms is explored to develop the distributed learning model.
arXiv Detail & Related papers (2020-10-02T10:41:59Z)

This list is automatically generated from the titles and abstracts of the papers in this site.