ACE: Towards Application-Centric Edge-Cloud Collaborative Intelligence
- URL: http://arxiv.org/abs/2203.13061v1
- Date: Thu, 24 Mar 2022 13:12:33 GMT
- Title: ACE: Towards Application-Centric Edge-Cloud Collaborative Intelligence
- Authors: Luhui Wang, Cong Zhao, Shusen Yang, Xinyu Yang, Julie McCann
- Abstract summary: Intelligent applications based on machine learning are impacting many parts of our lives.
Current implementations running in the Cloud are unable to satisfy all these constraints.
The Edge-Cloud Collaborative Intelligence paradigm has become a popular approach to address such issues.
- Score: 14.379967483688834
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Intelligent applications based on machine learning are impacting many parts
of our lives. They are required to operate under rigorous practical constraints
in terms of service latency, network bandwidth overheads, and also privacy. Yet
current implementations running in the Cloud are unable to satisfy all these
constraints. The Edge-Cloud Collaborative Intelligence (ECCI) paradigm has
become a popular approach to address these issues, and a rapidly growing
number of applications are being developed and deployed. However, these
prototypical implementations are developer-dependent and scenario-specific,
and cannot be efficiently applied at scale or to general ECCI scenarios in
practice, due to the lack of support for infrastructure management,
edge-cloud collaborative services, complex intelligence workloads, and
efficient performance optimization. In this article, we systematically
design and construct the first unified platform, ACE, that handles
ever-increasing edge and cloud resources, user-transparent services, and
proliferating intelligence workloads with increasing scale and complexity, to
facilitate cost-efficient and high-performing ECCI application development and
deployment. For verification, we explicitly present the construction process of
an ACE-based intelligent video query application, and demonstrate how to
achieve customizable performance optimization efficiently. Based on our
initial experience, we discuss both the limitations and the vision of ACE,
shedding light on promising directions for the emerging ECCI ecosystem.
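To make the edge-cloud collaboration pattern behind an ECCI video query application concrete, here is a minimal, hypothetical sketch: a cheap edge-side filter decides which frames are worth sending to an expensive cloud-side detector. All names (`Frame`, `edge_filter`, `cloud_detect`) and the motion-score heuristic are illustrative assumptions, not ACE's actual API or pipeline.

```python
# Hypothetical edge-cloud video query loop: filter frames at the edge,
# offload only promising ones to the cloud. Not ACE's actual implementation.
from dataclasses import dataclass


@dataclass
class Frame:
    frame_id: int
    motion_score: float  # cheap edge-side feature, e.g. from frame differencing


def edge_filter(frame: Frame, threshold: float = 0.3) -> bool:
    """Edge-side gate: only frames with enough motion are worth querying."""
    return frame.motion_score >= threshold


def cloud_detect(frame: Frame) -> list[str]:
    """Placeholder for an expensive cloud-side detector (e.g. a DNN)."""
    return ["object"] if frame.motion_score > 0.5 else []


def run_query(frames: list[Frame], threshold: float = 0.3) -> dict[int, list[str]]:
    """Send only edge-filtered frames to the cloud, saving bandwidth and latency."""
    results = {}
    for f in frames:
        if edge_filter(f, threshold):              # cheap decision at the edge
            results[f.frame_id] = cloud_detect(f)  # costly call to the cloud
    return results


frames = [Frame(0, 0.1), Frame(1, 0.6), Frame(2, 0.4)]
print(run_query(frames))  # only frames 1 and 2 reach the cloud
```

The `threshold` knob is one example of the customizable performance trade-off the abstract mentions: raising it cuts bandwidth and cloud cost at the risk of missing events.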
Related papers
- Radon: a Programming Model and Platform for Computing Continuum Systems [41.94295877935867]
Radon is a flexible programming model and platform designed for the edge-to-cloud continuum.
The Radon runtime, based on WebAssembly (WASM), enables language- and deployment-independent execution.
We present a prototype implementation of Radon and evaluate its effectiveness through a distributed key-value store case study.
arXiv Detail & Related papers (2025-03-19T13:38:25Z)
- A Comprehensive Experimentation Framework for Energy-Efficient Design of Cloud-Native Applications [0.0]
We present a framework that enables developers to measure energy efficiency across all relevant layers of a cloud-based application.
Our framework integrates a suite of service quality and sustainability metrics, providing compatibility with a broad range of cloud-based applications.
arXiv Detail & Related papers (2025-03-11T17:34:37Z)
- Intelligent Mobile AI-Generated Content Services via Interactive Prompt Engineering and Dynamic Service Provisioning [55.641299901038316]
Collaborative Mobile AIGC Service Providers (MASPs) at network edges can provide ubiquitous and customized AI-generated content for resource-constrained users.
Such a paradigm faces two significant challenges: 1) raw prompts often lead to poor generation quality due to users' lack of experience with specific AIGC models, and 2) static service provisioning fails to efficiently utilize computational and communication resources.
We develop an interactive prompt engineering mechanism that leverages a Large Language Model (LLM) to generate customized prompt corpora and employs Inverse Reinforcement Learning (IRL) for policy imitation.
arXiv Detail & Related papers (2025-02-17T03:05:20Z)
- A Hybrid Swarm Intelligence Approach for Optimizing Multimodal Large Language Models Deployment in Edge-Cloud-based Federated Learning Environments [10.72166883797356]
Federated Learning (FL), Multimodal Large Language Models (MLLMs), and edge-cloud computing together enable distributed and real-time data processing.
We propose a novel hybrid framework wherein MLLMs are deployed on edge devices equipped with sufficient resources and battery life, while the majority of training occurs in the cloud.
Our experimental results show that the proposed method significantly improves system performance, achieving an accuracy of 92%, reducing communication cost by 30%, and enhancing client participation.
arXiv Detail & Related papers (2025-02-04T03:03:24Z)
- Self-Organizing Interaction Spaces: A Framework for Engineering Pervasive Applications in Mobile and Distributed Environments [0.0]
This paper introduces Self-Organizing Interaction Spaces (SOIS), a novel framework for engineering pervasive applications.
SOIS leverages the dynamic and heterogeneous nature of mobile nodes, allowing them to form adaptive organizational structures.
Results highlight its potential to enhance efficiency and reduce reliance on traditional cloud models.
arXiv Detail & Related papers (2025-02-03T08:11:30Z)
- Transforming the Hybrid Cloud for Emerging AI Workloads [81.15269563290326]
This white paper envisions transforming hybrid cloud systems to meet the growing complexity of AI workloads.
The proposed framework addresses critical challenges in energy efficiency, performance, and cost-effectiveness.
This joint initiative aims to establish hybrid clouds as secure, efficient, and sustainable platforms.
arXiv Detail & Related papers (2024-11-20T11:57:43Z)
- Optimizing Airline Reservation Systems with Edge-Enabled Microservices: A Framework for Real-Time Data Processing and Enhanced User Responsiveness [1.03590082373586]
This paper outlines a conceptual framework for the implementation of edge computing in the airline industry.
Edge computing allows activities such as seat inventory checks, booking, and even confirmation to be performed nearer to the user, lessening overall response time and improving system performance.
The framework aims to deliver low latency, high throughput, and an improved user experience.
arXiv Detail & Related papers (2024-11-19T16:58:15Z)
- Large Language Model as a Catalyst: A Paradigm Shift in Base Station Siting Optimization [62.16747639440893]
Large language models (LLMs) and their associated technologies continue to advance, particularly in the realms of prompt engineering and agent engineering.
Our proposed framework incorporates retrieval-augmented generation (RAG) to enhance the system's ability to acquire domain-specific knowledge and generate solutions.
arXiv Detail & Related papers (2024-08-07T08:43:32Z)
- Inference Optimization of Foundation Models on AI Accelerators [68.24450520773688]
Powerful foundation models, including large language models (LLMs), with Transformer architectures have ushered in a new era of Generative AI.
As the number of model parameters reaches hundreds of billions, their deployment incurs prohibitive inference costs and high latency in real-world scenarios.
This tutorial offers a comprehensive discussion on complementary inference optimization techniques using AI accelerators.
arXiv Detail & Related papers (2024-07-12T09:24:34Z)
- LAECIPS: Large Vision Model Assisted Adaptive Edge-Cloud Collaboration for IoT-based Perception System [24.84622024011103]
Edge-cloud collaboration with large-small model co-inference offers a promising approach to achieving high inference accuracy and low latency.
Existing edge-cloud collaboration methods are tightly coupled with the model architecture and cannot adapt to the dynamic data drifts in heterogeneous IoT environments.
In LAECIPS, both the large vision model on the cloud and the lightweight model on the edge are plug-and-play. We design an edge-cloud collaboration strategy based on hard input mining, optimized for both high accuracy and low latency.
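The hard-input-mining strategy above can be sketched with a simple confidence rule: answer easy inputs with the lightweight edge model and escalate low-confidence ("hard") inputs to the large cloud model. This is a toy illustration of the general strategy under that assumption, not LAECIPS's actual algorithm or API; the models and the confidence heuristic are made up.

```python
# Toy confidence-based edge-cloud co-inference: keep easy inputs local,
# mine hard (low-confidence) inputs and send them to the large cloud model.

def edge_model(x: float) -> tuple[str, float]:
    """Stand-in lightweight edge model: returns (label, confidence)."""
    conf = min(abs(x), 1.0)  # toy confidence: far from the boundary = easy
    return ("pos" if x > 0 else "neg", conf)


def cloud_model(x: float) -> str:
    """Stand-in large cloud model, assumed more accurate but slower."""
    return "pos" if x >= 0 else "neg"


def co_infer(x: float, conf_threshold: float = 0.5) -> tuple[str, str]:
    """Return (label, where-it-was-computed) for one input."""
    label, conf = edge_model(x)
    if conf >= conf_threshold:
        return label, "edge"        # confident: answer locally, low latency
    return cloud_model(x), "cloud"  # hard input: escalate to the large model


print(co_infer(0.9))  # confident, answered at the edge
print(co_infer(0.1))  # low confidence, offloaded to the cloud
```

The threshold trades accuracy against latency and bandwidth, which is the same accuracy/latency tension the entry above optimizes for.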
arXiv Detail & Related papers (2024-04-16T12:12:06Z)
- Machine Learning Insides OptVerse AI Solver: Design Principles and Applications [74.67495900436728]
We present a comprehensive study on the integration of machine learning (ML) techniques into Huawei Cloud's OptVerse AI solver.
We showcase our methods for generating complex SAT and MILP instances utilizing generative models that mirror the multifaceted structures of real-world problems.
We detail the incorporation of state-of-the-art parameter tuning algorithms which markedly elevate solver performance.
arXiv Detail & Related papers (2024-01-11T15:02:15Z)
- VEDLIoT -- Next generation accelerated AIoT systems and applications [4.964750143168832]
The VEDLIoT project aims to develop energy-efficient Deep Learning methodologies for distributed Artificial Intelligence of Things (AIoT) applications.
We propose a holistic approach that focuses on optimizing algorithms while addressing safety and security challenges inherent to AIoT systems.
arXiv Detail & Related papers (2023-05-09T12:35:00Z)
- MCDS: AI Augmented Workflow Scheduling in Mobile Edge Cloud Computing Systems [12.215537834860699]
Recently proposed scheduling methods leverage the low response times of edge computing platforms to optimize application Quality of Service (QoS)
We propose MCDS: Monte Carlo Learning using Deep Surrogate Models to efficiently schedule workflow applications in mobile edge-cloud computing systems.
arXiv Detail & Related papers (2021-12-14T10:00:01Z)
- Auto-Split: A General Framework of Collaborative Edge-Cloud AI [49.750972428032355]
This paper describes the techniques and engineering practice behind Auto-Split, an edge-cloud collaborative prototype of Huawei Cloud.
To the best of our knowledge, there is no existing industry product that provides the capability of Deep Neural Network (DNN) splitting.
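DNN splitting as described above can be pictured with a tiny sequential network: the first layers run on the edge device, only the (often smaller) intermediate activation crosses the network, and the remaining layers finish inference in the cloud. The layers, weights, and split point below are invented for illustration and have no relation to Auto-Split's actual models or split-selection algorithm.

```python
# Toy DNN splitting: run LAYERS[:split_at] on the edge, ship the intermediate
# activation, and run LAYERS[split_at:] in the cloud. Weights are made up.

def relu(v):
    return [max(0.0, x) for x in v]


def dense(v, weights):
    """Tiny fully connected layer: `weights` is a list of rows, one per output."""
    return [sum(w * x for w, x in zip(row, v)) for row in weights]


# A 3-layer "network" as a list of callables; the split index decides which
# layers run on-device and which run remotely.
LAYERS = [
    lambda v: relu(dense(v, [[1.0, -1.0], [0.5, 0.5]])),  # edge layer
    lambda v: relu(dense(v, [[2.0, 0.0]])),               # cloud layer
    lambda v: dense(v, [[1.0]]),                          # cloud layer
]


def split_inference(x, split_at: int):
    head, tail = LAYERS[:split_at], LAYERS[split_at:]
    for layer in head:       # executed on the edge device
        x = layer(x)
    payload = x              # only this activation crosses the network
    for layer in tail:       # executed in the cloud
        payload = layer(payload)
    return payload


print(split_inference([3.0, 1.0], split_at=1))
```

Choosing `split_at` is the interesting part in practice: it trades edge compute against the size of the transmitted activation, and the final output is the same for any split point.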
arXiv Detail & Related papers (2021-08-30T08:03:29Z)
- Reproducible Performance Optimization of Complex Applications on the Edge-to-Cloud Continuum [55.6313942302582]
We propose a methodology to support the optimization of real-life applications on the Edge-to-Cloud Continuum.
Our approach relies on a rigorous analysis of possible configurations in a controlled testbed environment to understand their behaviour.
Our methodology can be generalized to other applications in the Edge-to-Cloud Continuum.
arXiv Detail & Related papers (2021-08-04T07:35:14Z)
- A Privacy-Preserving Distributed Architecture for Deep-Learning-as-a-Service [68.84245063902908]
This paper introduces a novel distributed architecture for deep-learning-as-a-service.
It is able to preserve the user sensitive data while providing Cloud-based machine and deep learning services.
arXiv Detail & Related papers (2020-03-30T15:12:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.