Related papers: Decentralized AI: Permissionless LLM Inference on POKT Network

URL: http://arxiv.org/abs/2405.20450v1
Date: Thu, 30 May 2024 19:50:07 GMT
Title: Decentralized AI: Permissionless LLM Inference on POKT Network
Authors: Daniel Olshansky, Ramiro Rodriguez Colmeiro, Bowen Li
Abstract summary: POKT Network's decentralized Remote Procedure Call infrastructure has surpassed 740 billion requests since launching on MainNet in 2020. This litepaper illustrates how the network's open-source and permissionless design aligns incentives among model researchers, hardware operators, API providers and users.
Score: 8.68822221491139
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: POKT Network's decentralized Remote Procedure Call (RPC) infrastructure, surpassing 740 billion requests since launching on MainNet in 2020, is well-positioned to extend into providing AI inference services with minimal design or implementation modifications. This litepaper illustrates how the network's open-source and permissionless design aligns incentives among model researchers, hardware operators, API providers and users, whom we term model Sources, Suppliers, Gateways and Applications, respectively. Through its Relay Mining algorithm, POKT creates a transparent marketplace where costs and earnings directly reflect cryptographically verified usage. This decentralized framework offers large model AI researchers a new avenue to disseminate their work and generate revenue without the complexities of maintaining infrastructure or building end-user products. Supply scales naturally with demand, as evidenced in recent years and the protocol's free market dynamics. POKT Gateways facilitate network growth, evolution, adoption, and quality by acting as application-facing load balancers, providing value-added features without managing LLM nodes directly. This vertically decoupled network, battle-tested over several years, is set up to accelerate the adoption, operation, innovation and financialization of open-source models. It is the first mature permissionless network whose quality of service competes with centralized entities set up to provide application-grade inference.

Related papers

AI/ML Life Cycle Management for Interoperable AI Native RAN [50.61227317567369]
Artificial intelligence (AI) and machine learning (ML) models are rapidly permeating the 5G Radio Access Network (RAN). These developments lay the foundation for AI-native transceivers as a key enabler for 6G.
arXiv Detail & Related papers (2025-07-24T16:04:59Z)
Symbiotic Agents: A Novel Paradigm for Trustworthy AGI-driven Networks [2.5782420501870296]
Large Language Model (LLM)-based autonomous agents are expected to play a vital role in the evolution of 6G networks. We introduce a novel agentic paradigm that combines LLMs with real-time optimization algorithms towards Trustworthy AI. We propose an end-to-end architecture for AGI networks and evaluate it on a 5G testbed capturing channel fluctuations from moving vehicles.
arXiv Detail & Related papers (2025-07-23T17:01:23Z)
GenTorrent: Scaling Large Language Model Serving with An Overlay Network [35.05892538683356]
We propose GenTorrent, an LLM serving overlay that harnesses computing resources from decentralized contributors. We identify four key research problems inherent to enabling such a decentralized infrastructure. We believe this work pioneers a new direction for democratizing and scaling future AI serving capabilities.
arXiv Detail & Related papers (2025-04-27T01:08:25Z)
Large-Scale AI in Telecom: Charting the Roadmap for Innovation, Scalability, and Enhanced Digital Experiences [212.5544743797899]
Large Telecom Models (LTMs) are tailored AI models designed to address the complex challenges faced by modern telecom networks. The paper covers a wide range of topics, from the architecture and deployment strategies of LTMs to their applications in network management, resource allocation, and optimization.
arXiv Detail & Related papers (2025-03-06T07:53:24Z)
Intelligent Mobile AI-Generated Content Services via Interactive Prompt Engineering and Dynamic Service Provisioning [55.641299901038316]
AI-generated content can organize collaborative Mobile AIGC Service Providers (MASPs) at network edges to provide ubiquitous and customized content for resource-constrained users. Such a paradigm faces two significant challenges: 1) raw prompts often lead to poor generation quality due to users' lack of experience with specific AIGC models, and 2) static service provisioning fails to efficiently utilize computational and communication resources. We develop an interactive prompt engineering mechanism that leverages a Large Language Model (LLM) to generate customized prompt corpora and employs Inverse Reinforcement Learning (IRL) for policy imitation.
arXiv Detail & Related papers (2025-02-17T03:05:20Z)
A Learning-based Incentive Mechanism for Mobile AIGC Service in Decentralized Internet of Vehicles [49.86094523878003]
We propose a decentralized incentive mechanism for mobile AIGC service allocation. We employ multi-agent deep reinforcement learning to find the balance between the supply of AIGC services on RSUs and user demand for services within the IoV context.
arXiv Detail & Related papers (2024-03-29T12:46:07Z)
MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT [87.4910758026772]
"Bigger the better" has been the predominant trend in recent Large Language Models (LLMs) development. This paper explores the "less is more" paradigm by addressing the challenge of designing accurate yet efficient Small Language Models (SLMs) for resource constrained devices.
arXiv Detail & Related papers (2024-02-26T18:59:03Z)
Elastic Entangled Pair and Qubit Resource Management in Quantum Cloud Computing [73.7522199491117]
Quantum cloud computing (QCC) offers a promising approach to efficiently provide quantum computing resources. The fluctuations in user demand and quantum circuit requirements are challenging for efficient resource provisioning. We propose a resource allocation model to provision quantum computing and networking resources.
arXiv Detail & Related papers (2023-07-25T00:38:46Z)
Relay Mining: Incentivizing Full Non-Validating Nodes Servicing All RPC Types [0.0]
Relay Mining estimates and proves the volume of Remote Procedure Calls (RPCs) made from a client to a server. We leverage digital signatures, commit-and-reveal schemes, and Sparse Merkle Sum Tries (SMSTs) to prove the amount of work done. A native cryptocurrency on a distributed ledger is used to rate limit applications and disincentivize over-usage.
arXiv Detail & Related papers (2023-05-18T03:23:41Z)
Deep Recurrent Learning Through Long Short Term Memory and TOPSIS [0.0]
Cloud computing's cheap, easy and quick management promise pushes business-owners for a transition from monolithic to a data-center/cloud based ERP. Since cloud-ERP development involves a cyclic process, namely planning, implementing, testing and upgrading, its adoption is realized as a deep recurrent neural network problem. Our theoretical model is validated over a reference model by articulating key players, services, architecture, functionalities.
arXiv Detail & Related papers (2022-12-30T10:35:25Z)
Evaluation of a blockchain-enabled resource management mechanism for NGNs [0.0]
This paper examines the use of blockchain technology for resource management and negotiation among Network Providers (NPs). The implementation of the resource management mechanism is described in a Smart Contract (SC), and the testbeds use the Raft and the IBFT consensus mechanisms, respectively.
arXiv Detail & Related papers (2022-11-01T13:40:26Z)
KAIROS: Building Cost-Efficient Machine Learning Inference Systems with Heterogeneous Cloud Resources [10.462798429064277]
KAIROS is a novel runtime framework that maximizes the query throughput while meeting target and a cost budget. Our evaluation using industry-grade deep learning (DL) models shows that KAIROS yields up to 2X the throughput of an optimal homogeneous solution.
arXiv Detail & Related papers (2022-10-12T03:06:51Z)
AI in 6G: Energy-Efficient Distributed Machine Learning for Multilayer Heterogeneous Networks [7.318997639507269]
We propose a novel layer-based HetNet architecture which distributes tasks associated with different machine learning approaches across network layers and entities. Such a HetNet boasts multiple access schemes as well as device-to-device (D2D) communications to enhance energy efficiency.
arXiv Detail & Related papers (2022-06-04T22:03:19Z)
Adaptive Stochastic ADMM for Decentralized Reinforcement Learning in Edge Industrial IoT [106.83952081124195]
Reinforcement learning (RL) has been widely investigated and shown to be a promising solution for decision-making and optimal control processes. We propose an adaptive ADMM (asI-ADMM) algorithm and apply it to decentralized RL with edge-computing-empowered IIoT networks. Experiment results show that our proposed algorithms outperform the state of the art in terms of communication costs and scalability, and can well adapt to complex IoT environments.
arXiv Detail & Related papers (2021-06-30T16:49:07Z)
Regulation conform DLT-operable payment adapter based on trustless - justified trust combined generalized state channels [77.34726150561087]
Economy of Things (EoT) will be based on software agents running on peer-to-peer trustless networks. We give an overview of current solutions that differ in their fundamental values and technological possibilities. We propose to combine the strengths of the crypto based, decentralized trustless elements with established and well regulated means of payment.
arXiv Detail & Related papers (2020-07-03T10:45:55Z)
Demand-Side Scheduling Based on Multi-Agent Deep Actor-Critic Learning for Smart Grids [56.35173057183362]
We consider the problem of demand-side energy management, where each household is equipped with a smart meter that is able to schedule home appliances online. The goal is to minimize the overall cost under a real-time pricing scheme. We propose the formulation of a smart grid environment as a Markov game.
arXiv Detail & Related papers (2020-05-05T07:32:40Z)

This list is automatically generated from the titles and abstracts of the papers on this site.