ECCENTRIC: Edge-Cloud Collaboration Framework for Distributed Inference Using Knowledge Adaptation
- URL: http://arxiv.org/abs/2511.11719v1
- Date: Wed, 12 Nov 2025 22:43:28 GMT
- Title: ECCENTRIC: Edge-Cloud Collaboration Framework for Distributed Inference Using Knowledge Adaptation
- Authors: Mohammad Mahdi Kamani, Zhongwei Cheng, Lin Chen
- Abstract summary: Cloud inference systems achieve the best performance, but their computation and communication costs increase dramatically as the number of edge devices relying on them grows. We propose a novel framework, dubbed Eccentric, that learns models with different levels of trade-off between these conflicting objectives.
- Score: 7.659994546640296
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The massive growth in edge AI has made machine learning models ubiquitous across domains. Despite the computation and communication efficiency of these systems, the limited computation resources of edge devices make reliance on more computationally rich cloud systems inevitable in most cases. Cloud inference systems achieve the best performance, but their computation and communication costs increase dramatically as the number of edge devices relying on them grows. Hence, there is a trade-off between the computation, communication, and performance of these systems. In this paper, we propose a novel framework, dubbed Eccentric, that learns models with different levels of trade-off between these conflicting objectives. Based on an adaptation of knowledge from the edge model to the cloud one, this framework reduces the computation and communication costs of the system during inference while achieving the best performance possible. The Eccentric framework can be viewed as a new form of compression suited to edge-cloud inference systems, reducing both computation and communication costs. Empirical studies on classification and object detection tasks corroborate the efficacy of this framework.
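The abstract describes an edge-cloud inference system in which the edge model handles what it can and defers harder inputs to the cloud. The following is a minimal, hypothetical sketch of such a collaboration using a confidence-based offloading rule; the toy models, the threshold, and the offloading criterion are illustrative assumptions, not the actual Eccentric knowledge-adaptation algorithm.

```python
# Hypothetical sketch of edge-cloud collaborative inference: a cheap edge
# model resolves confident inputs locally; low-confidence inputs are
# offloaded to a costlier cloud model. The models below are toy stand-ins.
from dataclasses import dataclass
from typing import Callable, List, Tuple

Prediction = Tuple[str, float]  # (label, confidence in [0, 1])

@dataclass
class EdgeCloudCascade:
    edge_model: Callable[[List[float]], Prediction]
    cloud_model: Callable[[List[float]], Prediction]
    confidence_threshold: float = 0.8
    handled: int = 0    # inputs resolved on the edge
    offloaded: int = 0  # inputs sent to the cloud

    def infer(self, x: List[float]) -> Prediction:
        label, conf = self.edge_model(x)
        if conf >= self.confidence_threshold:
            self.handled += 1
            return label, conf       # cheap path: stay on the edge
        self.offloaded += 1
        return self.cloud_model(x)   # fall back to the cloud model

# Toy models: classify by the sign of the mean; the edge model is
# "unsure" near zero, which triggers an offload to the cloud.
def edge_model(x: List[float]) -> Prediction:
    m = sum(x) / len(x)
    return ("pos" if m >= 0 else "neg", min(1.0, abs(m)))

def cloud_model(x: List[float]) -> Prediction:
    m = sum(x) / len(x)
    return ("pos" if m >= 0 else "neg", 0.99)

cascade = EdgeCloudCascade(edge_model, cloud_model, confidence_threshold=0.5)
print(cascade.infer([2.0, 2.0]))   # edge is confident -> ('pos', 1.0)
print(cascade.infer([0.1, -0.2]))  # low confidence -> cloud -> ('neg', 0.99)
print(cascade.handled, cascade.offloaded)  # 1 1
```

The communication saving in this sketch comes from never contacting the cloud for high-confidence inputs; Eccentric's knowledge adaptation would additionally shape what the edge sends to the cloud, which is beyond this illustration.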
Related papers
- Efficient Machine Unlearning via Influence Approximation [75.31015485113993]
Influence-based unlearning has emerged as a prominent approach to estimating the impact of individual training samples on model parameters without retraining. This paper establishes a theoretical link between memorizing (incremental learning) and forgetting (unlearning), and introduces the Influence Approximation Unlearning algorithm for efficient machine unlearning from the incremental perspective.
arXiv Detail & Related papers (2025-07-31T05:34:27Z) - Performance Measurements in the AI-Centric Computing Continuum Systems [5.815300670677979]
We review commonly used metrics in Distributed Computing Continuum (DCC) and Internet of Things environments. We also discuss emerging performance dimensions that address evolving computing needs, such as sustainability, energy efficiency, and system observability.
arXiv Detail & Related papers (2025-06-28T13:46:07Z) - Edge-First Language Model Inference: Models, Metrics, and Tradeoffs [0.7980273012483663]
This work examines the interplay between edge and cloud deployments, starting from detailed benchmarking of SLM capabilities on single edge devices. We identify scenarios where edge inference offers comparable performance at lower cost, and others where cloud fallback becomes essential due to limits in scalability or model capacity. Rather than proposing a one-size-fits-all solution, we present platform-level comparisons and design insights for building efficient, adaptive LM inference systems.
arXiv Detail & Related papers (2025-05-22T10:43:00Z) - Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z) - Center-Sensitive Kernel Optimization for Efficient On-Device Incremental Learning [88.78080749909665]
Current on-device training methods focus on efficient training without considering catastrophic forgetting. This paper proposes a simple but effective edge-friendly incremental learning framework. Our method achieves an average accuracy boost of 38.08% with even less memory and approximate computation.
arXiv Detail & Related papers (2024-06-13T05:49:29Z) - Scalable Federated Unlearning via Isolated and Coded Sharding [76.12847512410767]
Federated unlearning has emerged as a promising paradigm to erase the client-level data effect.
This paper proposes a scalable federated unlearning framework based on isolated sharding and coded computing.
arXiv Detail & Related papers (2024-01-29T08:41:45Z) - Towards a Better Theoretical Understanding of Independent Subnetwork Training [56.24689348875711]
We take a closer theoretical look at Independent Subnetwork Training (IST).
IST is a recently proposed and highly effective technique for solving the aforementioned problems.
We identify fundamental differences between IST and alternative approaches, such as distributed methods with compressed communication.
arXiv Detail & Related papers (2023-06-28T18:14:22Z) - Fault-Tolerant Collaborative Inference through the Edge-PRUNE Framework [4.984601297028258]
Collaborative inference is a vehicle for distributing computation load, reducing latency, and addressing privacy preservation in communications.
This paper presents the Edge-PRUNE distributed computing framework, built on a formally defined model of computation, which provides a flexible infrastructure for fault tolerant collaborative inference.
arXiv Detail & Related papers (2022-06-16T13:16:53Z) - Communication-Computation Trade-Off in Resource-Constrained Edge Inference [5.635540684037595]
This article presents effective methods for edge inference at resource-constrained devices.
It focuses on device-edge co-inference, assisted by an edge computing server.
A three-step framework is proposed for the effective inference.
arXiv Detail & Related papers (2020-06-03T11:00:32Z) - Incentive Mechanism Design for Resource Sharing in Collaborative Edge Learning [106.51930957941433]
In 5G and Beyond networks, Artificial Intelligence applications are expected to be increasingly ubiquitous.
This necessitates a paradigm shift from the current cloud-centric model training approach to the Edge Computing based collaborative learning scheme known as edge learning.
arXiv Detail & Related papers (2020-05-31T12:45:06Z) - AutoScale: Optimizing Energy Efficiency of End-to-End Edge Inference under Stochastic Variance [11.093360539563657]
This paper proposes AutoScale to enable accurate, energy-efficient deep learning inference at the edge. AutoScale is an adaptive, lightweight execution scaling engine built upon a custom-designed reinforcement learning algorithm.
arXiv Detail & Related papers (2020-05-06T00:30:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.