Incentivizing Permissionless Distributed Learning of LLMs
- URL: http://arxiv.org/abs/2505.21684v1
- Date: Tue, 27 May 2025 19:11:22 GMT
- Title: Incentivizing Permissionless Distributed Learning of LLMs
- Authors: Joel Lidin, Amir Sarfi, Evangelos Pappas, Samuel Dare, Eugene Belilovsky, Jacob Steeves
- Abstract summary: \textit{Gauntlet} can be applied to any synchronous distributed training scheme that relies on aggregating updates or pseudo-gradients. We utilize an OpenSkill rating system to track competitiveness of pseudo-gradient scores across time. Our live 1.2B run, which has paid out real-valued tokens to participants based on the value of their contributions, demonstrates the utility of our incentive system.
- Score: 7.36110927499488
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We describe an incentive system for distributed deep learning of foundational models where peers are rewarded for contributions. The incentive system, \textit{Gauntlet}, has been deployed on the bittensor blockchain and used to train a 1.2B LLM with completely permissionless contributions of pseudo-gradients: no control over the users that can register or their hardware. \textit{Gauntlet} can be applied to any synchronous distributed training scheme that relies on aggregating updates or pseudo-gradients. We rely on a two-stage mechanism for fast filtering of peer uptime, reliability, and synchronization, combined with the core component that estimates the loss before and after individual pseudo-gradient contributions. We utilized an OpenSkill rating system to track competitiveness of pseudo-gradient scores across time. Finally, we introduce a novel mechanism to ensure peers on the network perform unique computations. Our live 1.2B run, which has paid out real-valued tokens to participants based on the value of their contributions, yielded a competitive (on a per-iteration basis) 1.2B model that demonstrates the utility of our incentive system.
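The core of the scoring mechanism, estimating the training loss before and after applying an individual peer's pseudo-gradient, can be sketched compactly. The sketch below is an illustrative reconstruction, not the deployed implementation: the function signature, the per-parameter tensor layout of `pseudo_grad`, and the fixed learning rate are all assumptions.

```python
import torch

def score_pseudo_gradient(model, pseudo_grad, batch, loss_fn, lr=1.0):
    """Score one peer's contribution as the loss improvement it produces.

    `pseudo_grad` is assumed to be a list of tensors, one per model
    parameter (an illustrative layout; the actual wire format may differ).
    """
    inputs, targets = batch
    with torch.no_grad():
        loss_before = loss_fn(model(inputs), targets).item()
        # Tentatively apply the peer's pseudo-gradient as a descent step.
        for p, g in zip(model.parameters(), pseudo_grad):
            p.sub_(lr * g)
        loss_after = loss_fn(model(inputs), targets).item()
        # Roll back so every peer is scored from the same starting point.
        for p, g in zip(model.parameters(), pseudo_grad):
            p.add_(lr * g)
    # Positive score: the contribution reduced the loss.
    return loss_before - loss_after
```

Scores of this kind can then feed a rating system (the paper tracks them with OpenSkill) and ultimately drive the token payouts described above.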
Related papers
- Blockchain-based Framework for Scalable and Incentivized Federated Learning [0.820828081284034]
Federated Learning (FL) enables collaborative model training without sharing raw data, preserving privacy while harnessing distributed datasets.
Traditional FL systems often rely on centralized aggregating mechanisms, introducing trust issues, single points of failure, and limited mechanisms for incentivizing meaningful client contributions.
This paper presents a blockchain-based FL framework that addresses these limitations by integrating smart contracts and a novel hybrid incentive mechanism.
arXiv Detail & Related papers (2025-02-20T00:38:35Z)
- Proof-of-Collaborative-Learning: A Multi-winner Federated Learning Consensus Algorithm [2.5203968759841158]
We propose Proof-of-Collaborative-Learning (PoCL), a multi-winner, federated-learning-validated consensus mechanism.
PoCL redirects the power of blockchains to train federated learning models.
We present a novel evaluation mechanism to ensure the efficiency of the locally trained models of miners.
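A minimal sketch of the multi-winner idea, evaluating each miner's locally trained model and rewarding the top performers, is given below; the interfaces, the evaluation function, and the winner count `k` are illustrative assumptions rather than the paper's exact mechanism.

```python
def select_winners(miner_models, eval_fn, k=3):
    """Rank miners by an evaluation score and pick the top-k winners.

    `miner_models` maps miner id -> locally trained model; `eval_fn`
    returns a scalar such as held-out validation accuracy. Both are
    hypothetical placeholders for the paper's evaluation mechanism.
    """
    scores = {miner_id: eval_fn(model) for miner_id, model in miner_models.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]  # these miners share the block reward
```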
arXiv Detail & Related papers (2024-07-17T21:14:05Z)
- Dense Reward for Free in Reinforcement Learning from Human Feedback [64.92448888346125]
We leverage the fact that the reward model contains more information than just its scalar output.
We use the attention weights that the reward model places over the completion's tokens to redistribute the reward along the whole completion.
Empirically, we show that it stabilises training, accelerates the rate of learning, and, in practical cases, may lead to better local optima.
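The redistribution step itself reduces to a weighted split of the scalar reward; in the sketch below, how the attention weights are extracted from the reward model is left open as an assumption.

```python
import torch

def redistribute_reward(scalar_reward, attention_weights):
    """Spread a sequence-level scalar reward over tokens.

    `attention_weights` is a 1-D tensor of nonnegative per-token weights
    (obtaining them from the reward model is outside this sketch).
    """
    weights = attention_weights / attention_weights.sum()
    # Per-token dense rewards that sum back to the original scalar reward.
    return scalar_reward * weights
```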
arXiv Detail & Related papers (2024-02-01T17:10:35Z)
- Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious to collect in practice.
We introduce a novel noisy correspondence learning framework, namely \textbf{S}elf-\textbf{R}einforcing \textbf{E}rrors \textbf{M}itigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z)
- Unified Classification and Rejection: A One-versus-All Framework [47.58109235690227]
We build a unified framework for open set classifiers that covers both closed-set classification and OOD rejection.
By decomposing the $K$-class problem into $K$ one-versus-all (OVA) binary classification tasks, we show that combining the scores of OVA classifiers can give $(K+1)$-class posterior probabilities.
Experiments on popular OSR and OOD detection datasets demonstrate that the proposed framework, using a single multi-class classifier, yields competitive performance.
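One standard way to combine $K$ OVA sigmoid scores into $(K+1)$-class posteriors, with the extra class acting as "reject", assumes the binary tasks are independent; the combination rule below is a plausible instance, not necessarily the paper's exact formula.

```python
import numpy as np

def ova_to_posteriors(s):
    """Combine K one-vs-all sigmoid scores into K+1 posterior probabilities.

    Under an independence assumption, class k gets unnormalized mass
    s_k * prod_{j != k} (1 - s_j), and the (K+1)-th "reject" class gets
    prod_j (1 - s_j); normalizing yields a proper distribution.
    """
    s = np.asarray(s, dtype=float)
    one_minus = 1.0 - s
    q = np.array([s[k] * np.prod(np.delete(one_minus, k)) for k in range(len(s))])
    q = np.append(q, np.prod(one_minus))  # mass for the reject class
    return q / max(q.sum(), 1e-12)  # guard against an all-zero corner case
```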
arXiv Detail & Related papers (2023-11-22T12:47:12Z)
- Fair yet Asymptotically Equal Collaborative Learning [32.588043205577435]
In collaborative learning with streaming data, nodes jointly and continuously learn a machine learning (ML) model by sharing the latest model updates computed from their latest streaming data.
This paper explores an incentive design that guarantees fairness so that nodes receive rewards commensurate to their contributions.
We empirically demonstrate in two settings with real-world streaming data, that our proposed approach outperforms existing baselines in fairness and learning performance while remaining competitive in preserving equality.
arXiv Detail & Related papers (2023-06-09T08:57:14Z)
- Distributional Reinforcement Learning with Dual Expectile-Quantile Regression [51.87411935256015]
The quantile regression approach to distributional RL provides a flexible and effective way of learning arbitrary return distributions.
We show that distributional estimation guarantees vanish, and we empirically observe that the estimated distribution rapidly collapses to its mean.
Motivated by the efficiency of $L^2$-based learning, we propose to jointly learn expectiles and quantiles of the return distribution in a way that allows efficient learning.
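The two regression losses being combined are standard: the pinball loss for quantiles and the asymmetric squared loss for expectiles. How the paper couples the two estimators is not specified in this summary, so only the individual losses are sketched below.

```python
import torch

def quantile_loss(u, tau):
    """Pinball loss at quantile level tau, where u = target - prediction."""
    return torch.where(u >= 0, tau * u, (tau - 1.0) * u)

def expectile_loss(u, tau):
    """Asymmetric squared (expectile) loss at level tau, u = target - prediction."""
    weight = torch.where(u >= 0, torch.full_like(u, tau), torch.full_like(u, 1.0 - tau))
    return weight * u.pow(2)
```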
arXiv Detail & Related papers (2023-05-26T12:30:05Z)
- Distributional Reward Estimation for Effective Multi-Agent Deep Reinforcement Learning [19.788336796981685]
We propose a novel Distributional Reward Estimation framework for effective Multi-Agent Reinforcement Learning (DRE-MARL).
Our main idea is to design multi-action-branch reward estimation and policy-weighted reward aggregation for stabilized training.
The superiority of DRE-MARL is demonstrated using benchmark multi-agent scenarios, compared with SOTA baselines in terms of both effectiveness and robustness.
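The policy-weighted aggregation step can be written in one line: per-action-branch reward estimates are averaged under the agent's current policy. Shapes and names below are illustrative assumptions, not the paper's interfaces.

```python
import torch

def policy_weighted_reward(reward_branches, policy_probs):
    """Aggregate per-action-branch reward estimates with policy weights.

    `reward_branches[..., a]` is the estimated reward for action a, and
    `policy_probs` is the agent's action distribution over the same axis;
    the expectation under the policy gives the aggregated reward signal.
    """
    return (policy_probs * reward_branches).sum(dim=-1)
```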
arXiv Detail & Related papers (2022-10-14T08:31:45Z)
- Simultaneous Double Q-learning with Conservative Advantage Learning for Actor-Critic Methods [133.85604983925282]
We propose Simultaneous Double Q-learning with Conservative Advantage Learning (SDQ-CAL).
Our algorithm realizes less biased value estimation and achieves state-of-the-art performance in a range of continuous control benchmark tasks.
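The summary names two ingredients: double Q-learning to reduce estimation bias, and a conservative advantage-learning correction. The sketch below shows a generic advantage-learning Bellman target (in the spirit of action-gap-increasing operators), explicitly not the paper's exact SDQ-CAL update.

```python
import torch

def advantage_learning_target(q, q_next, reward, action, gamma=0.99, alpha=0.1):
    """Generic advantage-learning target: the TD target minus a penalty on
    the gap between the greedy action value and the taken action's value.

    `q` and `q_next` are 1-D tensors of Q-values for the current and next
    state; `alpha` controls how conservatively non-greedy actions are
    devalued. (Illustrative only; not the paper's exact update rule.)
    """
    td_target = reward + gamma * q_next.max()
    action_gap = q.max() - q[action]
    return td_target - alpha * action_gap
```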
arXiv Detail & Related papers (2022-05-08T09:17:16Z)
- FreeTickets: Accurate, Robust and Efficient Deep Ensemble by Training with Dynamic Sparsity [74.58777701536668]
We introduce the FreeTickets concept, which can boost the performance of sparse convolutional neural networks over their dense network equivalents by a large margin.
We propose two novel efficient ensemble methods with dynamic sparsity, which, in one shot, yield many diverse and accurate tickets "for free" during the sparse training process.
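A minimal sketch of the ensembling idea, collecting the sparse subnetworks ("tickets") that dynamic sparse training produces along the way and averaging their predictions, is shown below; the snapshot schedule and interfaces are assumptions.

```python
import torch

def ensemble_predict(ticket_snapshots, x):
    """Average the softmax predictions of subnetwork snapshots collected
    during a single dynamic sparse training run.

    `ticket_snapshots` is a list of trained (sparse) models; collecting
    them at fixed intervals is an illustrative choice.
    """
    probs = [torch.softmax(model(x), dim=-1) for model in ticket_snapshots]
    return torch.stack(probs).mean(dim=0)
```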
arXiv Detail & Related papers (2021-06-28T10:48:20Z)
- Reward-Based 1-bit Compressed Federated Distillation on Blockchain [14.365210947456209]
The recent advent of various forms of Federated Knowledge Distillation (FD) paves the way for a new generation of robust and communication-efficient Federated Learning (FL).
We introduce a novel decentralized federated learning framework where heavily compressed 1-bit soft-labels are aggregated on a smart contract.
In a context where workers' contributions are now easily comparable, we modify the Peer Truth Serum for Crowdsourcing mechanism (PTSC) for FD to reward honest participation.
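One plausible realization of 1-bit soft-label compression is to threshold each class probability at the uniform baseline and tally workers' bits by majority vote on-chain; the paper's exact quantizer and aggregation rule may differ from this sketch.

```python
import numpy as np

def one_bit_compress(soft_labels):
    """Compress soft labels to one bit per class: 1 where the predicted
    probability exceeds the uniform baseline 1/K, else 0 (an assumed
    quantizer, not necessarily the paper's).
    """
    num_classes = soft_labels.shape[-1]
    return (soft_labels > 1.0 / num_classes).astype(np.uint8)

def aggregate_bits(worker_bits):
    """Majority-vote aggregation over workers' 1-bit labels, as a smart
    contract might tally them (axis 0 indexes workers)."""
    return (worker_bits.mean(axis=0) > 0.5).astype(np.uint8)
```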
arXiv Detail & Related papers (2021-06-27T15:51:04Z)
- Training Generative Adversarial Networks in One Stage [58.983325666852856]
We introduce a general training scheme that enables training GANs efficiently in only one stage.
We show that the proposed method is readily applicable to other adversarial-training scenarios, such as data-free knowledge distillation.
arXiv Detail & Related papers (2021-02-28T09:03:39Z)
- 2CP: Decentralized Protocols to Transparently Evaluate Contributivity in Blockchain Federated Learning Environments [9.885896204530878]
We introduce 2CP, a framework comprising two novel protocols for Federated Learning.
The Crowdsource Protocol allows an actor to bring a model forward for training and to use their own data to evaluate the contributions made to it.
The Consortium Protocol gives trainers the same guarantee even when no party owns the initial model and no dataset is available.
arXiv Detail & Related papers (2020-11-15T12:59:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.