Decentralized AI: Permissionless LLM Inference on POKT Network
- URL: http://arxiv.org/abs/2405.20450v1
- Date: Thu, 30 May 2024 19:50:07 GMT
- Title: Decentralized AI: Permissionless LLM Inference on POKT Network
- Authors: Daniel Olshansky, Ramiro Rodriguez Colmeiro, Bowen Li,
- Abstract summary: POKT Network's decentralized Remote Procedure Call infrastructure has surpassed 740 billion requests since launching on MainNet in 2020.
This litepaper illustrates how the network's open-source and permissionless design aligns incentives among model researchers, hardware operators, API providers and users.
- Score: 8.68822221491139
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: POKT Network's decentralized Remote Procedure Call (RPC) infrastructure, surpassing 740 billion requests since launching on MainNet in 2020, is well-positioned to extend into providing AI inference services with minimal design or implementation modifications. This litepaper illustrates how the network's open-source and permissionless design aligns incentives among model researchers, hardware operators, API providers and users whom we term model Sources, Suppliers, Gateways and Applications respectively. Through its Relay Mining algorithm, POKT creates a transparent marketplace where costs and earnings directly reflect cryptographically verified usage. This decentralized framework offers large model AI researchers a new avenue to disseminate their work and generate revenue without the complexities of maintaining infrastructure or building end-user products. Supply scales naturally with demand, as evidenced in recent years and the protocol's free market dynamics. POKT Gateways facilitate network growth, evolution, adoption, and quality by acting as application-facing load balancers, providing value-added features without managing LLM nodes directly. This vertically decoupled network, battle tested over several years, is set up to accelerate the adoption, operation, innovation and financialization of open-source models. It is the first mature permissionless network whose quality of service competes with centralized entities set up to provide application grade inference.
Related papers
- Intelligent Mobile AI-Generated Content Services via Interactive Prompt Engineering and Dynamic Service Provisioning [55.641299901038316]
AI-generated content can organize collaborative Mobile AIGC Service Providers (MASPs) at network edges to provide ubiquitous and customized content for resource-constrained users.
Such a paradigm faces two significant challenges: 1) raw prompts often lead to poor generation quality due to users' lack of experience with specific AIGC models, and 2) static service provisioning fails to efficiently utilize computational and communication resources.
We develop an interactive prompt engineering mechanism that leverages a Large Language Model (LLM) to generate customized prompt corpora and employs Inverse Reinforcement Learning (IRL) for policy imitation.
arXiv Detail & Related papers (2025-02-17T03:05:20Z) - NNsight and NDIF: Democratizing Access to Open-Weight Foundation Model Internals [58.83169560132308]
We introduce NNsight and NDIF, technologies that work in tandem to enable scientific study of very large neural networks.
NNsight is an open-source system that extends PyTorch to introduce deferred remote execution.
NDIF is a scalable inference service that executes NNsight requests, allowing users to share GPU resources and pretrained models.
arXiv Detail & Related papers (2024-07-18T17:59:01Z) - A Learning-based Incentive Mechanism for Mobile AIGC Service in Decentralized Internet of Vehicles [49.86094523878003]
We propose a decentralized incentive mechanism for mobile AIGC service allocation.
We employ multi-agent deep reinforcement learning to find the balance between the supply of AIGC services on RSUs and user demand for services within the IoV context.
arXiv Detail & Related papers (2024-03-29T12:46:07Z) - MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT [87.4910758026772]
"Bigger the better" has been the predominant trend in recent Large Language Models (LLMs) development.
This paper explores the "less is more" paradigm by addressing the challenge of designing accurate yet efficient Small Language Models (SLMs) for resource constrained devices.
arXiv Detail & Related papers (2024-02-26T18:59:03Z) - Elastic Entangled Pair and Qubit Resource Management in Quantum Cloud
Computing [73.7522199491117]
Quantum cloud computing (QCC) offers a promising approach to efficiently provide quantum computing resources.
The fluctuations in user demand and quantum circuit requirements are challenging for efficient resource provisioning.
We propose a resource allocation model to provision quantum computing and networking resources.
arXiv Detail & Related papers (2023-07-25T00:38:46Z) - Relay Mining: Incentivizing Full Non-Validating Nodes Servicing All RPC Types [0.0]
Relay Mining estimates and proves the volume of Remote Procedure Calls (RPCs) made from a client to a server.
We leverage digital signatures, commit-and-reveal schemes, and Sparse Merkle Sum Tries (SMSTs) to prove the amount of work done.
A native cryptocurrency on a distributed ledger is used to rate limit applications and disincentivize over-usage.
arXiv Detail & Related papers (2023-05-18T03:23:41Z) - Deep Recurrent Learning Through Long Short Term Memory and TOPSIS [0.0]
Cloud computing's cheap, easy and quick management promise pushes business-owners for a transition from monolithic to a data-center/cloud based ERP.
Since cloud-ERP development involves a cyclic process, namely planning, implementing, testing and upgrading, its adoption is realized as a deep recurrent neural network problem.
Our theoretical model is validated over a reference model by articulating key players, services, architecture, functionalities.
arXiv Detail & Related papers (2022-12-30T10:35:25Z) - Evaluation of a blockchain-enabled resource management mechanism for
NGNs [0.0]
This paper examines the use of blockchain technology for resource management and negotiation among Network Providers (NPs)
The implementation of the resource management mechanism is described in a Smart Contract (SC) and the testbeds use the Raft and the IBFT consensus mechanisms respectively.
arXiv Detail & Related papers (2022-11-01T13:40:26Z) - KAIROS: Building Cost-Efficient Machine Learning Inference Systems with
Heterogeneous Cloud Resources [10.462798429064277]
KAIROS is a novel runtime framework that maximizes the query throughput while meeting target and a cost budget.
Our evaluation using industry-grade deep learning (DL) models shows that KAIROS yields up to 2X the throughput of an optimal homogeneous solution.
arXiv Detail & Related papers (2022-10-12T03:06:51Z) - AI in 6G: Energy-Efficient Distributed Machine Learning for Multilayer
Heterogeneous Networks [7.318997639507269]
We propose a novel layer-based HetNet architecture which distributes tasks associated with different machine learning approaches across network layers and entities.
Such a HetNet boasts multiple access schemes as well as device-to-device (D2D) communications to enhance energy efficiency.
arXiv Detail & Related papers (2022-06-04T22:03:19Z) - Adaptive Stochastic ADMM for Decentralized Reinforcement Learning in
Edge Industrial IoT [106.83952081124195]
Reinforcement learning (RL) has been widely investigated and shown to be a promising solution for decision-making and optimal control processes.
We propose an adaptive ADMM (asI-ADMM) algorithm and apply it to decentralized RL with edge-computing-empowered IIoT networks.
Experiment results show that our proposed algorithms outperform the state of the art in terms of communication costs and scalability, and can well adapt to complex IoT environments.
arXiv Detail & Related papers (2021-06-30T16:49:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.