Distributed Kernel Ridge Regression with Communications
- URL: http://arxiv.org/abs/2003.12210v1
- Date: Fri, 27 Mar 2020 02:42:43 GMT
- Title: Distributed Kernel Ridge Regression with Communications
- Authors: Shao-Bo Lin, Di Wang, Ding-Xuan Zhou
- Abstract summary: This paper focuses on generalization performance analysis for distributed algorithms in the framework of learning theory.
We derive optimal learning rates in expectation and provide theoretically optimal ranges of the number of local processors.
We propose a communication strategy to improve the learning performance of DKRR and demonstrate the power of communications in DKRR.
- Score: 27.754994709225425
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper focuses on generalization performance analysis for distributed
algorithms in the framework of learning theory. Taking distributed kernel ridge
regression (DKRR) as an example, we derive optimal learning
rates in expectation and provide theoretically optimal ranges for the number
of local processors. Due to the gap between theory and experiments, we also
deduce optimal learning rates for DKRR in probability to essentially reflect
the generalization performance and limitations of DKRR. Furthermore, we propose
a communication strategy to improve the learning performance of DKRR and
demonstrate the power of communications in DKRR via both theoretical
assessments and numerical experiments.
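To make the divide-and-conquer scheme concrete, here is a minimal sketch of baseline DKRR: the data are split across m local processors, each solves kernel ridge regression on its own subset, and the global estimator averages the local predictions. The Gaussian kernel, the toy data, and the regularization value are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

def gaussian_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between row-sample sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def local_krr(X, y, lam):
    """Solve KRR on one local subset: alpha = (K + lam*n*I)^{-1} y."""
    n = len(X)
    K = gaussian_kernel(X, X)
    alpha = np.linalg.solve(K + lam * n * np.eye(n), y)
    return X, alpha

def dkrr_fit(X, y, m, lam):
    """Divide-and-conquer DKRR: split the sample over m local
    processors and solve KRR on each subset independently."""
    parts = zip(np.array_split(X, m), np.array_split(y, m))
    return [local_krr(Xj, yj, lam) for Xj, yj in parts]

def dkrr_predict(models, Xtest):
    """Global estimator: average of the m local predictions."""
    preds = [gaussian_kernel(Xtest, Xj) @ aj for Xj, aj in models]
    return np.mean(preds, axis=0)

# toy usage
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (600, 1))
y = np.sin(3 * X[:, 0]) + 0.1 * rng.standard_normal(600)
models = dkrr_fit(X, y, m=6, lam=1e-3)
yhat = dkrr_predict(models, X[:10])
```

The communication strategy analyzed in the paper goes beyond this one-shot averaging, letting local machines exchange further information across communication rounds to refine the synthesized estimator.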
Related papers
- Stability-based Generalization Analysis of Randomized Coordinate Descent for Pairwise Learning [30.557503207329965]
We consider the generalization of randomized coordinate descent (RCD) for pairwise learning.
We measure the on-average argument stability for both convex and strongly convex objective functions.
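As a toy illustration of the setting, a sketch of randomized coordinate descent on a pairwise least-squares loss follows; the objective, step size, and naive O(n^2) gradient are illustrative assumptions, not the paper's analysis setting.

```python
import numpy as np

def pairwise_loss_grad_coord(w, X, y, k):
    """Partial derivative wrt coordinate k of the pairwise loss
    mean_{i<j} (w^T (x_i - x_j) - (y_i - y_j))^2 (naive O(n^2) sum)."""
    g, cnt, n = 0.0, 0, len(X)
    for i in range(n):
        for j in range(i + 1, n):
            d = X[i] - X[j]
            r = w @ d - (y[i] - y[j])
            g += 2.0 * r * d[k]
            cnt += 1
    return g / cnt

def rcd_pairwise(X, y, steps=500, eta=0.1, seed=0):
    """Randomized coordinate descent: one random coordinate per step."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        k = rng.integers(X.shape[1])
        w[k] -= eta * pairwise_loss_grad_coord(w, X, y, k)
    return w
```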
arXiv Detail & Related papers (2025-03-03T13:39:06Z)
- An In-depth Investigation of Sparse Rate Reduction in Transformer-like Models [32.04194224236952]
We propose an information-theoretic objective function called Sparse Rate Reduction (SRR).
We show that SRR correlates positively with generalization and outperforms other baseline measures, such as path-norm and sharpness-based ones.
We show that generalization can be improved using SRR as regularization on benchmark image classification datasets.
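For a rough sense of the measure, here is a sketch built from the standard lossy coding rate R(Z) = 1/2 logdet(I + d/(n eps^2) Z Z^T) with an l1 sparsity term; the paper's exact SRR objective (including group weighting) may differ, so treat the formula and constants as assumptions.

```python
import numpy as np

def coding_rate(Z, eps=0.5):
    """Lossy coding rate R(Z) = 1/2 logdet(I + d/(n eps^2) Z Z^T),
    with Z of shape (d, n): d-dim features for n tokens/samples."""
    d, n = Z.shape
    M = np.eye(d) + (d / (n * eps ** 2)) * (Z @ Z.T)
    return 0.5 * np.linalg.slogdet(M)[1]

def sparse_rate_reduction(Z, groups, lam=0.1, eps=0.5):
    """Rate reduction (whole-set rate minus per-group rates) plus an
    l1 sparsity penalty -- a simplified proxy for the SRR measure."""
    whole = coding_rate(Z, eps)
    per_group = sum(coding_rate(Z[:, g], eps) for g in groups)
    return whole - per_group - lam * np.abs(Z).sum()
```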
arXiv Detail & Related papers (2024-11-26T07:44:57Z)
- Lepskii Principle for Distributed Kernel Ridge Regression [16.389581549801253]
We propose a Lepskii principle to equip distributed kernel ridge regression (DKRR) with adaptive parameter selection, yielding Lep-AdaDKRR.
We deduce optimal learning rates for Lep-AdaDKRR and theoretically show that Lep-AdaDKRR succeeds in adapting to the regularity of regression functions.
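For intuition, a generic Lepskii-type balancing rule over a parameter grid is sketched below; the band constant and noise proxy are illustrative assumptions rather than the paper's Lep-AdaDKRR construction.

```python
import numpy as np

def lepskii_select(estimators, sigmas, C=4.0):
    """Generic Lepskii balancing rule (a sketch, not the paper's exact
    construction). The grid is ordered so that bias decreases and the
    noise proxy sigmas[j] increases with j; return the smallest j that
    agrees with every noisier estimate up to its noise level:
        ||f_j - f_k|| <= C * sigmas[k]  for all k > j."""
    m = len(estimators)
    for j in range(m):
        if all(np.linalg.norm(estimators[j] - estimators[k]) <= C * sigmas[k]
               for k in range(j + 1, m)):
            return j
    return m - 1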
arXiv Detail & Related papers (2024-09-08T12:12:18Z)
- On the Generalization Capability of Temporal Graph Learning Algorithms: Theoretical Insights and a Simpler Method [59.52204415829695]
Temporal Graph Learning (TGL) has become a prevalent technique across diverse real-world applications.
This paper investigates the generalization ability of different TGL algorithms.
We propose a simplified TGL network, which enjoys a small generalization error, improved overall performance, and lower model complexity.
arXiv Detail & Related papers (2024-02-26T08:22:22Z)
- Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-Constraint [56.74058752955209]
This paper studies the alignment process of generative models with Reinforcement Learning from Human Feedback (RLHF).
We first identify the primary challenge of existing popular methods, such as offline PPO and offline DPO, as a lack of strategic exploration of the environment.
We propose efficient algorithms with finite-sample theoretical guarantees.
arXiv Detail & Related papers (2023-12-18T18:58:42Z)
- Provable Reward-Agnostic Preference-Based Reinforcement Learning [61.39541986848391]
Preference-based Reinforcement Learning (PbRL) is a paradigm in which an RL agent learns to optimize a task using pair-wise preference-based feedback over trajectories.
We propose a theoretical reward-agnostic PbRL framework where exploratory trajectories that enable accurate learning of hidden reward functions are acquired.
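Pairwise preference feedback of this kind is commonly modeled with a Bradley-Terry link, where the probability of preferring one trajectory grows with the gap in (hidden) reward. A minimal sketch under that common modeling assumption, with illustrative linear reward features:

```python
import numpy as np

def pref_prob(theta, phi1, phi2):
    """Bradley-Terry preference probability: P(traj1 > traj2) =
    sigmoid(r(traj1) - r(traj2)), with linear rewards r = theta . phi."""
    return 1.0 / (1.0 + np.exp(-(theta @ phi1 - theta @ phi2)))

def fit_reward(prefs, dim, steps=2000, lr=0.1, seed=0):
    """Gradient ascent on the preference log-likelihood.
    `prefs` is a list of (phi_winner, phi_loser) trajectory features."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(size=dim) * 0.01
    for _ in range(steps):
        grad = np.zeros(dim)
        for phi_w, phi_l in prefs:
            p = pref_prob(theta, phi_w, phi_l)
            grad += (1.0 - p) * (phi_w - phi_l)  # d/dtheta of log p
        theta += lr * grad / len(prefs)
    return theta
```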
arXiv Detail & Related papers (2023-05-29T15:00:09Z)
- Scalable Optimal Margin Distribution Machine [50.281535710689795]
The Optimal margin Distribution Machine (ODM) is a newly proposed statistical learning framework rooted in the novel margin theory.
This paper proposes a scalable ODM that achieves a nearly tenfold speedup over the original ODM training method.
arXiv Detail & Related papers (2023-05-08T16:34:04Z)
- Less is More: Rethinking Few-Shot Learning and Recurrent Neural Nets [2.824895388993495]
We provide theoretical guarantees for reliable learning under the information-theoretic asymptotic equipartition property (AEP).
We then focus on a highly efficient recurrent neural net (RNN) framework and propose a reduced-entropy algorithm for few-shot learning.
Our experimental results demonstrate significant potential for improving learning models' sample efficiency, generalization, and time complexity.
arXiv Detail & Related papers (2022-09-28T17:33:11Z)
- A Rationale-Centric Framework for Human-in-the-loop Machine Learning [12.793695970529138]
We present a novel rationale-centric framework with humans in the loop: Rationales-centric Double-robustness Learning (RDL).
RDL exploits rationales (i.e. phrases that cause the prediction), human interventions and semi-factual augmentations to decouple spurious associations and bias models towards generally applicable underlying distributions.
arXiv Detail & Related papers (2022-03-24T08:12:57Z)
- Counterfactual Maximum Likelihood Estimation for Training Deep Networks [83.44219640437657]
Deep learning models are prone to learning spurious correlations that should not be learned as predictive clues.
We propose a causality-based training framework to reduce the spurious correlations caused by observable confounders.
We conduct experiments on two real-world tasks: Natural Language Inference (NLI) and Image Captioning.
arXiv Detail & Related papers (2021-06-07T17:47:16Z)
- Millimeter Wave Communications with an Intelligent Reflector: Performance Optimization and Distributional Reinforcement Learning [119.97450366894718]
A novel framework is proposed to optimize the downlink multi-user communication of a millimeter wave base station assisted by an intelligent reflector (IR).
A channel estimation approach is developed to measure the channel state information (CSI) in real time.
A distributional reinforcement learning (DRL) approach is proposed to learn the optimal IR reflection and maximize the expectation of downlink capacity.
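As a flavor of what "distributional" means here, the sketch below maintains a categorical distribution over capacity values for each candidate reflection configuration, bandit-style; the action set, value support, and update rule are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

class CategoricalBandit:
    """Distributional value learning for discrete IR configurations:
    each action keeps a categorical distribution over a fixed support
    of capacity values (a simplified C51-style update)."""
    def __init__(self, n_actions, vmin=0.0, vmax=10.0, n_atoms=51, lr=0.05):
        self.support = np.linspace(vmin, vmax, n_atoms)
        self.probs = np.full((n_actions, n_atoms), 1.0 / n_atoms)
        self.lr = lr

    def act(self):
        # greedily pick the configuration with highest expected capacity
        return int(np.argmax(self.probs @ self.support))

    def update(self, action, capacity):
        """Move the action's distribution toward the observed capacity,
        split between the two nearest atoms (linear projection)."""
        c = float(np.clip(capacity, self.support[0], self.support[-1]))
        u = int(np.searchsorted(self.support, c))
        target = np.zeros_like(self.support)
        if u == 0 or self.support[u] == c:
            target[u] = 1.0
        else:
            l = u - 1
            gap = self.support[u] - self.support[l]
            target[l] = (self.support[u] - c) / gap
            target[u] = (c - self.support[l]) / gap
        self.probs[action] = (1 - self.lr) * self.probs[action] + self.lr * target
```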
arXiv Detail & Related papers (2020-02-24T22:18:54Z)
- COKE: Communication-Censored Decentralized Kernel Learning [30.795725108364724]
Multiple interconnected agents aim to learn an optimal decision function defined over a reproducing kernel Hilbert space by jointly minimizing a global objective function.
As a non-parametric approach, kernel learning faces a major challenge in distributed implementation.
We develop a communication-censored kernel learning (COKE) algorithm that reduces the communication load of the underlying decentralized kernel learning algorithm (DKLA) by preventing an agent from transmitting at every iteration unless its local updates are deemed informative.
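A minimal sketch of the censoring rule described above: an agent transmits only when its local state has moved more than a decaying threshold since its last transmission (the threshold schedule is an illustrative assumption).

```python
import numpy as np

def censored_broadcast(theta_new, theta_last_sent, t, tau0=1.0, rho=0.96):
    """Communication-censoring rule: transmit only if the local state
    moved more than a decaying threshold since the last transmission."""
    threshold = tau0 * rho ** t
    if np.linalg.norm(theta_new - theta_last_sent) >= threshold:
        return True, theta_new          # transmit; neighbors see theta_new
    return False, theta_last_sent       # censored; neighbors keep old copy
```

The intended effect is that early iterations communicate freely while later, small updates are censored, trading a little per-iteration progress for a large reduction in total communication.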
arXiv Detail & Related papers (2020-01-28T01:05:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.