Related papers: A GPU-Accelerated Bi-linear ADMM Algorithm for Distributed Sparse Machine Learning

A GPU-Accelerated Bi-linear ADMM Algorithm for Distributed Sparse Machine Learning

URL: http://arxiv.org/abs/2405.16267v2
Date: Wed, 26 Jun 2024 07:27:56 GMT
Title: A GPU-Accelerated Bi-linear ADMM Algorithm for Distributed Sparse Machine Learning
Authors: Alireza Olama, Andreas Lundell, Jan Kronqvist, Elham Ahmadi, Eduardo Camponogara,
Abstract summary: Bi-cADMM is aimed at solving large-scale regularized Sparse Machine Learning problems defined over a network of computational nodes. Bi-cADMM is implemented within an open-source Python package called Parallel Sparse Fitting Toolbox.
Score: 4.258375398293221
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper introduces the Bi-linear consensus Alternating Direction Method of Multipliers (Bi-cADMM), aimed at solving large-scale regularized Sparse Machine Learning (SML) problems defined over a network of computational nodes. Mathematically, these are stated as minimization problems with convex local loss functions over a global decision vector, subject to an explicit $\ell_0$ norm constraint to enforce the desired sparsity. The considered SML problem generalizes different sparse regression and classification models, such as sparse linear and logistic regression, sparse softmax regression, and sparse support vector machines. Bi-cADMM leverages a bi-linear consensus reformulation of the original non-convex SML problem and a hierarchical decomposition strategy that divides the problem into smaller sub-problems amenable to parallel computing. In Bi-cADMM, this decomposition strategy is based on a two-phase approach. Initially, it performs a sample decomposition of the data and distributes local datasets across computational nodes. Subsequently, a delayed feature decomposition of the data is conducted on Graphics Processing Units (GPUs) available to each node. This methodology allows Bi-cADMM to undertake computationally intensive data-centric computations on GPUs, while CPUs handle more cost-effective computations. The proposed algorithm is implemented within an open-source Python package called Parallel Sparse Fitting Toolbox (PsFiT), which is publicly available. Finally, computational experiments demonstrate the efficiency and scalability of our algorithm through numerical benchmarks across various SML problems featuring distributed datasets.

Related papers

Low-rank computation of the posterior mean in Multi-Output Gaussian Processes [0.6445605125467574]
We consider the case of multi-output Gaussian processes (MOGP) and present low-rank approaches for efficiently computing the posterior mean of a MOGP.
arXiv Detail & Related papers (2025-04-30T11:19:58Z)
Efficient Distributed Learning over Decentralized Networks with Convoluted Support Vector Machine [2.722434989508884]
This paper addresses the problem of efficiently classifying high-dimensional data over decentralized networks. We develop an efficient generalized alternating direction method of multipliers (ADMM) algorithm for solving penalized SVM.
arXiv Detail & Related papers (2025-03-10T17:31:26Z)
Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
Single-Index Models are high-dimensional regression problems with planted structure. We show that computationally efficient algorithms, both within the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) framework, necessarily require $Omega(dkstar/2)$ samples.
arXiv Detail & Related papers (2024-03-08T18:50:19Z)
Distributed Linear Regression with Compositional Covariates [5.085889377571319]
We focus on the distributed sparse penalized linear log-contrast model in massive compositional data. Two distributed optimization techniques are proposed for solving the two different constrained convex optimization problems. In the decentralized topology, we introduce a distributed coordinate-wise descent algorithm for obtaining a communication-efficient regularized estimation.
arXiv Detail & Related papers (2023-10-21T11:09:37Z)
An Efficient Algorithm for Clustered Multi-Task Compressive Sensing [60.70532293880842]
Clustered multi-task compressive sensing is a hierarchical model that solves multiple compressive sensing tasks. The existing inference algorithm for this model is computationally expensive and does not scale well in high dimensions. We propose a new algorithm that substantially accelerates model inference by avoiding the need to explicitly compute these covariance matrices.
arXiv Detail & Related papers (2023-09-30T15:57:14Z)
Sublinear Time Algorithms for Several Geometric Optimization (With Outliers) Problems In Machine Learning [8.794896028549323]
We revisit the Minimum Enclosing Ball (MEB) problem in Euclidean space $mathbbRd$. We introduce the notion of stability for MEB, which is natural and easy to understand. Under the stability assumption, we present two sampling algorithms for computing radius-approximate MEB.
arXiv Detail & Related papers (2023-01-07T15:03:45Z)
High-Dimensional Sparse Bayesian Learning without Covariance Matrices [66.60078365202867]
We introduce a new inference scheme that avoids explicit construction of the covariance matrix. Our approach couples a little-known diagonal estimation result from numerical linear algebra with the conjugate gradient algorithm. On several simulations, our method scales better than existing approaches in computation time and memory.
arXiv Detail & Related papers (2022-02-25T16:35:26Z)
Unfolding Projection-free SDP Relaxation of Binary Graph Classifier via GDPA Linearization [59.87663954467815]
Algorithm unfolding creates an interpretable and parsimonious neural network architecture by implementing each iteration of a model-based algorithm as a neural layer. In this paper, leveraging a recent linear algebraic theorem called Gershgorin disc perfect alignment (GDPA), we unroll a projection-free algorithm for semi-definite programming relaxation (SDR) of a binary graph. Experimental results show that our unrolled network outperformed pure model-based graph classifiers, and achieved comparable performance to pure data-driven networks but using far fewer parameters.
arXiv Detail & Related papers (2021-09-10T07:01:15Z)
Piecewise linear regression and classification [0.20305676256390928]
This paper proposes a method for solving multivariate regression and classification problems using piecewise linear predictors. A Python implementation of the algorithm described in this paper is available at http://cse.lab.imtlucca.it/bemporad/parc.
arXiv Detail & Related papers (2021-03-10T17:07:57Z)
Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method, by combining reconstruction error with $l_2,p$-norm regularization. We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
arXiv Detail & Related papers (2020-12-29T04:08:38Z)
Coded Stochastic ADMM for Decentralized Consensus Optimization with Edge Computing [113.52575069030192]
Big data, including applications with high security requirements, are often collected and stored on multiple heterogeneous devices, such as mobile devices, drones and vehicles. Due to the limitations of communication costs and security requirements, it is of paramount importance to extract information in a decentralized manner instead of aggregating data to a fusion center. We consider the problem of learning model parameters in a multi-agent system with data locally processed via distributed edge nodes. A class of mini-batch alternating direction method of multipliers (ADMM) algorithms is explored to develop the distributed learning model.
arXiv Detail & Related papers (2020-10-02T10:41:59Z)
Communication-efficient distributed eigenspace estimation [31.69089186688224]
We develop a communication-efficient distributed algorithm for computing the leading invariant subspace of a data matrix. Our algorithm uses a novel alignment scheme that minimizes the Procrustean distance between local solutions and a reference solution. We show that our algorithm achieves a similar error rate to that of a centralized estimator.
arXiv Detail & Related papers (2020-09-05T02:11:22Z)

This list is automatically generated from the titles and abstracts of the papers in this site.