Dynamic Pricing for On-Demand DNN Inference in the Edge-AI Market
- URL: http://arxiv.org/abs/2503.04521v1
- Date: Thu, 06 Mar 2025 15:08:31 GMT
- Title: Dynamic Pricing for On-Demand DNN Inference in the Edge-AI Market
- Authors: Songyuan Li, Jia Hu, Geyong Min, Haojun Huang, Jiwei Huang
- Abstract summary: Auction-based Edge Inference Pricing Mechanism (AERIA) for revenue optimization. We investigate the multi-exit device-edge synergistic inference scheme for on-demand DNN inference acceleration. Our AERIA mechanism significantly outperforms several state-of-the-art approaches in revenue, demonstrating the efficacy of AERIA for on-demand inference in the Edge-AI market.
- Score: 25.367459316428242
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The convergence of edge computing and AI gives rise to Edge-AI, which enables the deployment of real-time AI applications and services at the network edge. One of the fundamental research issues in Edge-AI is edge inference acceleration, which aims to realize low-latency high-accuracy DNN inference services by leveraging the fine-grained offloading of partitioned inference tasks from end devices to edge servers. However, existing research has yet to adopt a practical Edge-AI market perspective, which would systematically explore the personalized inference needs of AI users (e.g., inference accuracy, latency, and task complexity), the revenue incentives for AI service providers that offer edge inference services, and multi-stakeholder governance within a market-oriented context. To bridge this gap, we propose an Auction-based Edge Inference Pricing Mechanism (AERIA) for revenue maximization to tackle the multi-dimensional optimization problem of DNN model partition, edge inference pricing, and resource allocation. We investigate the multi-exit device-edge synergistic inference scheme for on-demand DNN inference acceleration, and analyse the auction dynamics amongst the AI service providers, AI users and edge infrastructure provider. Owing to the strategic mechanism design via randomized consensus estimate and cost sharing techniques, the Edge-AI market attains several desirable properties, including competitiveness in revenue maximization, incentive compatibility, and envy-freeness, which are crucial to maintain the effectiveness, truthfulness, and fairness of our auction outcomes. The extensive simulation experiments based on four representative DNN inference workloads demonstrate that our AERIA mechanism significantly outperforms several state-of-the-art approaches in revenue maximization, demonstrating the efficacy of AERIA for on-demand DNN inference in the Edge-AI market.
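The abstract names cost sharing as one ingredient of the auction design but does not describe it. As a rough illustration only, the Python sketch below shows the classic equal cost-sharing step used in auctions for goods in unlimited supply: given a revenue target, serve the largest group of bidders who can each afford an equal share of that target. The function name `cost_share`, the bid values, and the target are hypothetical, and this is not claimed to be AERIA's actual procedure.

```python
from typing import Dict, List


def cost_share(bids: List[float], target: float) -> Dict[int, float]:
    """Equal cost sharing (illustrative assumption, not the paper's algorithm).

    Finds the largest k such that the k highest bidders can each pay
    target / k, and returns {bidder_index: price} for those winners.
    Returns an empty dict when no group of bidders can cover the target.
    """
    # Bidder indices sorted from highest bid to lowest bid.
    order = sorted(range(len(bids)), key=lambda i: bids[i], reverse=True)

    # Try the largest group first and shrink until the k-th highest
    # bidder can afford the equal share target / k.
    for k in range(len(order), 0, -1):
        share = target / k
        if bids[order[k - 1]] >= share:
            return {i: share for i in order[:k]}
    return {}


if __name__ == "__main__":
    # Hypothetical bids from four users and a revenue target of 12.
    print(cost_share([10.0, 6.0, 3.0, 1.0], 12.0))  # {0: 6.0, 1: 6.0}
```

In this step each winner's price depends only on the target and the group size, never on the winner's own bid, which is the usual reason it is paired with truthfulness arguments.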
Related papers
- Joint Resource Optimization, Computation Offloading and Resource Slicing for Multi-Edge Traffic-Cognitive Networks [0.0]
This paper investigates a multi-agent system where both the platform and ESs are self-interested entities. We propose a novel Stackelberg game-based framework to model interactions between stakeholders and solve the optimization problem. We further design a decentralized solution leveraging neural network optimization and a privacy-preserving information exchange protocol.
arXiv Detail & Related papers (2024-11-26T11:51:10Z) - Multi-Agent RL-Based Industrial AIGC Service Offloading over Wireless Edge Networks [19.518346220904732]
We propose a generative model-driven industrial AIGC collaborative edge learning framework.
This framework aims to facilitate efficient few-shot learning by leveraging realistic sample synthesis and edge-based optimization capabilities.
arXiv Detail & Related papers (2024-05-05T15:31:47Z) - A Learning-based Incentive Mechanism for Mobile AIGC Service in Decentralized Internet of Vehicles [49.86094523878003]
We propose a decentralized incentive mechanism for mobile AIGC service allocation.
We employ multi-agent deep reinforcement learning to find the balance between the supply of AIGC services on RSUs and user demand for services within the IoV context.
arXiv Detail & Related papers (2024-03-29T12:46:07Z) - Offloading and Quality Control for AI Generated Content Services in 6G Mobile Edge Computing Networks [18.723955271182007]
This paper proposes a joint optimization algorithm for offloading decisions, computation time, and diffusion steps of the diffusion models in the reverse diffusion stage.
Experimental results conclusively demonstrate that the proposed algorithm achieves superior joint optimization performance compared to the baselines.
arXiv Detail & Related papers (2023-12-11T08:36:27Z) - Green Edge AI: A Contemporary Survey [46.11332733210337]
The transformative power of AI is derived from the utilization of deep neural networks (DNNs).
Deep learning (DL) is increasingly being transitioned to wireless edge networks in proximity to end-user devices (EUDs).
Despite its potential, edge AI faces substantial challenges, mostly due to the dichotomy between the resource limitations of wireless edge networks and the resource-intensive nature of DL.
arXiv Detail & Related papers (2023-12-01T04:04:37Z) - Edge AI Inference in Heterogeneous Constrained Computing: Feasibility
and Opportunities [9.156192191794567]
The proliferation of AI inference accelerators showcases innovation but also underscores challenges.
This paper outlines the requirements and components of a framework that accommodates hardware diversity.
Next, we assess the impact of device heterogeneity on AI inference performance, identifying strategies to optimize outcomes without compromising service quality.
arXiv Detail & Related papers (2023-10-27T16:46:59Z) - LAMBO: Large AI Model Empowered Edge Intelligence [71.56135386994119]
Next-generation edge intelligence is anticipated to benefit various applications via offloading techniques.
Traditional offloading architectures face several issues, including heterogeneous constraints, partial perception, uncertain generalization, and lack of tractability.
We propose a Large AI Model-Based Offloading (LAMBO) framework with over one billion parameters for solving these problems.
arXiv Detail & Related papers (2023-08-29T07:25:42Z) - Semantic Information Marketing in The Metaverse: A Learning-Based Contract Theory Framework [68.8725783112254]
We address the problem of designing incentive mechanisms for a virtual service provider (VSP) to hire sensing IoT devices to sell their sensing data.
Due to the limited bandwidth, we propose to use semantic extraction algorithms to reduce the data delivered by the sensing IoT devices.
We propose a novel iterative contract design and use a new variant of multi-agent reinforcement learning (MARL) to solve the modelled multi-dimensional contract problem.
arXiv Detail & Related papers (2023-02-22T15:52:37Z) - Online Learning under Budget and ROI Constraints via Weak Adaptivity [57.097119428915796]
Existing primal-dual algorithms for constrained online learning problems rely on two fundamental assumptions.
We show how such assumptions can be circumvented by endowing standard primal-dual templates with weakly adaptive regret minimizers.
We prove the first best-of-both-worlds no-regret guarantees which hold in the absence of the two aforementioned assumptions.
arXiv Detail & Related papers (2023-02-02T16:30:33Z) - Enabling AI-Generated Content (AIGC) Services in Wireless Edge Networks [68.00382171900975]
In wireless edge networks, the transmission of incorrectly generated content may unnecessarily consume network resources.
We present the AIGC-as-a-service concept and discuss the challenges in deploying A at the edge networks.
We propose a deep reinforcement learning-enabled algorithm for optimal ASP selection.
arXiv Detail & Related papers (2023-01-09T09:30:23Z) - Edge Computing for Semantic Communication Enabled Metaverse: An Incentive Mechanism Design [72.27143788103245]
SemCom and edge computing are disruptive solutions to address emerging requirements of huge data communication, bandwidth efficiency and low latency data processing in the Metaverse.
Deep learning (DL)-based auction has recently been proposed as an incentive mechanism that maximizes the revenue while holding important economic properties.
We present the design of the DL-based auction for edge resource allocation in the SemCom-enabled Metaverse.
arXiv Detail & Related papers (2022-12-13T10:29:41Z) - GNN at the Edge: Cost-Efficient Graph Neural Network Processing over Distributed Edge Servers [24.109721494781592]
Graph Neural Networks (GNNs) are still under exploration, presenting a stark disparity to their broad adoption at the edge.
This paper studies the cost optimization for distributed GNN processing over a multi-tier heterogeneous edge network.
We show that our approach achieves superior performance over de facto baselines, with more than 95.8% cost reduction and fast convergence.
arXiv Detail & Related papers (2022-10-31T13:03:16Z) - Communication-Computation Trade-Off in Resource-Constrained Edge Inference [5.635540684037595]
This article presents effective methods for edge inference at resource-constrained devices.
It focuses on device-edge co-inference, assisted by an edge computing server.
A three-step framework is proposed for effective inference.
arXiv Detail & Related papers (2020-06-03T11:00:32Z) - Incentive Mechanism Design for Resource Sharing in Collaborative Edge Learning [106.51930957941433]
In 5G and Beyond networks, Artificial Intelligence applications are expected to be increasingly ubiquitous.
This necessitates a paradigm shift from the current cloud-centric model training approach to the Edge Computing based collaborative learning scheme known as edge learning.
arXiv Detail & Related papers (2020-05-31T12:45:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.