CORE: Common Random Reconstruction for Distributed Optimization with
Provable Low Communication Complexity
- URL: http://arxiv.org/abs/2309.13307v1
- Date: Sat, 23 Sep 2023 08:45:27 GMT
- Title: CORE: Common Random Reconstruction for Distributed Optimization with
Provable Low Communication Complexity
- Authors: Pengyun Yue, Hanzhen Zhao, Cong Fang, Di He, Liwei Wang, Zhouchen Lin,
Song-chun Zhu
- Abstract summary: Communication complexity has become a major bottleneck for speeding up training and scaling up the number of machines.
We propose Common randOm REconstruction (CORE), a technique that compresses the information transmitted between machines.
- Score: 110.50364486645852
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With distributed machine learning being a prominent technique for large-scale
machine learning tasks, communication complexity has become a major bottleneck
for speeding up training and scaling up the number of machines. In this paper, we
propose a new technique named Common randOm REconstruction (CORE), which can be
used to compress the information transmitted between machines in order to
reduce communication complexity without requiring other strict conditions. Specifically,
our technique CORE projects the vector-valued information to a low-dimensional
one through common random vectors and reconstructs the information with the
same random noises after communication. We apply CORE to two distributed tasks,
namely convex optimization on linear models and generic non-convex
optimization, and design new distributed algorithms that achieve provably
lower communication complexities. For example, we show that for linear models a
CORE-based algorithm can encode the gradient vector with $\mathcal{O}(1)$ bits
(against $\mathcal{O}(d)$) without worsening the convergence rate, improving on
the existing results.
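For intuition, the following is a minimal Python sketch of the common-random-reconstruction idea described in the abstract, assuming standard Gaussian random vectors generated from a shared PRNG seed. The function names, dimensions, and unbiased scaling are illustrative assumptions; the paper's $\mathcal{O}(1)$-bit encoding involves additional machinery not shown here.

```python
import numpy as np

def core_compress(grad, num_proj, seed):
    """Project a d-dimensional gradient onto num_proj common random vectors."""
    rng = np.random.default_rng(seed)                  # shared seed => the same
    proj = rng.standard_normal((num_proj, grad.size))  # matrix on every machine
    return proj @ grad                                 # only num_proj scalars sent

def core_reconstruct(coords, dim, seed):
    """Rebuild an unbiased estimate using the same random vectors."""
    rng = np.random.default_rng(seed)
    proj = rng.standard_normal((coords.size, dim))
    # For standard Gaussians, E[proj.T @ proj / m] = I, so E[grad_hat] = grad.
    return proj.T @ coords / coords.size

dim, num_proj, seed = 10_000, 64, 0
grad = np.random.default_rng(1).standard_normal(dim)  # a worker's local gradient
message = core_compress(grad, num_proj, seed)         # 64 floats instead of 10,000
grad_hat = core_reconstruct(message, dim, seed)       # receiver regenerates the noise
```

Because sender and receiver derive the projection from common randomness, only the low-dimensional coordinates cross the network; the random vectors themselves are never transmitted.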
Related papers
- Optimizing the Optimal Weighted Average: Efficient Distributed Sparse Classification [50.406127962933915]
ACOWA allows an extra round of communication to achieve noticeably better approximation quality with minor runtime increases.
Results show that ACOWA obtains solutions that are more faithful to the empirical risk minimizer and attain substantially higher accuracy than other distributed algorithms.
arXiv Detail & Related papers (2024-06-03T19:43:06Z)
- An Efficient Algorithm for Clustered Multi-Task Compressive Sensing [60.70532293880842]
Clustered multi-task compressive sensing is a hierarchical model that solves multiple compressive sensing tasks.
The existing inference algorithm for this model is computationally expensive and does not scale well in high dimensions.
We propose a new algorithm that substantially accelerates model inference by avoiding the explicit computation of the covariance matrices involved.
arXiv Detail & Related papers (2023-09-30T15:57:14Z)
- Communication-Efficient Adam-Type Algorithms for Distributed Data Mining [93.50424502011626]
We propose a class of novel distributed Adam-type algorithms (i.e., SketchedAMSGrad) utilizing sketching; a minimal sketch of the general compression idea appears after this list.
Our new algorithm achieves a fast convergence rate of $O(\frac{1}{\sqrt{nT}} + \frac{1}{(k/d)^2 T})$ with a communication cost of $O(k \log(d))$ at each iteration.
arXiv Detail & Related papers (2022-10-14T01:42:05Z)
- Decentralized Gossip-Based Stochastic Bilevel Optimization over Communication Networks [42.76623191830371]
We propose a gossip-based distributed bilevel optimization algorithm.
Agents can solve both the networked inner problem and the outer problem in a single timescale.
Our algorithm achieves state-of-the-art efficiency and test accuracy.
arXiv Detail & Related papers (2022-06-22T06:38:54Z)
- Combinatorial optimization for low bit-width neural networks [23.466606660363016]
Low bit-width neural networks have been extensively explored for deployment on edge devices to reduce computational resources.
Existing approaches have focused on gradient-based optimization in a two-stage train-and-compress setting.
We show that combining greedy coordinate descent with this combinatorial approach can attain competitive accuracy on binary classification tasks.
arXiv Detail & Related papers (2022-06-04T15:02:36Z)
- Smoothness Matrices Beat Smoothness Constants: Better Communication Compression Techniques for Distributed Optimization [10.592277756185046]
Large-scale distributed optimization has become the default tool for training supervised machine learning models.
We propose a novel communication sparsification strategy that can take full advantage of the smoothness matrices associated with local losses.
arXiv Detail & Related papers (2021-02-14T20:55:02Z)
- Recovery of Linear Components: Reduced Complexity Autoencoder Designs [0.951828574518325]
We present an approach called Recovery of Linear Components (RLC), which serves as a middle ground between linear and non-linear dimensionality reduction techniques.
With the aid of synthetic and real-world case studies, we show that the RLC, when compared with an autoencoder of similar complexity, achieves higher accuracy, similar robustness to overfitting, and faster training times.
arXiv Detail & Related papers (2020-12-14T14:08:20Z)
- A Robust Matching Pursuit Algorithm Using Information Theoretic Learning [37.968665739578185]
A new OMP algorithm is developed based on information theoretic learning (ITL).
The experimental results on both simulated and real-world data consistently demonstrate the superiority of the proposed OMP algorithm in data recovery, image reconstruction, and classification.
arXiv Detail & Related papers (2020-05-10T01:36:00Z)
- Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks [50.42141893913188]
We study distributed stochastic AUC maximization for large-scale data with a deep neural network as the predictive model.
Our method requires many fewer communication rounds in theory.
Our experiments on several datasets demonstrate the effectiveness of our method and confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)
- Channel Assignment in Uplink Wireless Communication using Machine Learning Approach [54.012791474906514]
This letter investigates a channel assignment problem in uplink wireless communication systems.
Our goal is to maximize the sum rate of all users subject to integer channel assignment constraints.
Due to the high computational complexity of the problem, machine learning approaches are employed to obtain computationally efficient solutions.
arXiv Detail & Related papers (2020-01-12T15:54:20Z)
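As noted in the SketchedAMSGrad entry above, sketching is another way to compress gradients before communication. Below is a minimal count-sketch-style compressor in Python; count sketch is one standard instantiation of gradient sketching, and the hash construction, sizes, and function names are illustrative assumptions rather than that paper's exact operator.

```python
import numpy as np

def sketch(grad, k, seed):
    """Hash each coordinate into one of k buckets with a random sign."""
    rng = np.random.default_rng(seed)
    bucket = rng.integers(0, k, size=grad.size)    # shared hashes via shared seed
    sign = rng.choice([-1.0, 1.0], size=grad.size)
    out = np.zeros(k)
    np.add.at(out, bucket, sign * grad)            # signed sums per bucket
    return out                                     # k values transmitted, not d

def unsketch(sk, dim, seed):
    """Unbiased estimate: each coordinate reads back its signed bucket sum."""
    rng = np.random.default_rng(seed)
    bucket = rng.integers(0, sk.size, size=dim)    # same draws as in sketch()
    sign = rng.choice([-1.0, 1.0], size=dim)
    return sign * sk[bucket]                       # collision terms are zero-mean

dim, k, seed = 10_000, 256, 0
grad = np.random.default_rng(1).standard_normal(dim)
grad_hat = unsketch(sketch(grad, k, seed), dim, seed)
```

As with the CORE sketch earlier, the scheme relies on common randomness: both ends regenerate the same hashes and signs from the seed, so only the k bucket sums need to be communicated.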
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.