CrowdHMTware: A Cross-level Co-adaptation Middleware for Context-aware Mobile DL Deployment
- URL: http://arxiv.org/abs/2503.04183v1
- Date: Thu, 06 Mar 2025 07:52:20 GMT
- Title: CrowdHMTware: A Cross-level Co-adaptation Middleware for Context-aware Mobile DL Deployment
- Authors: Sicong Liu, Bin Guo, Shiyan Luo, Yuzhan Wang, Hao Luo, Cheng Fang, Yuan Xu, Ke Ma, Yao Li, Zhiwen Yu
- Abstract summary: CrowdHMTware is a context-adaptive deep learning (DL) model deployment middleware for heterogeneous mobile devices. It establishes an automated adaptation loop between cross-level functional components, i.e., elastic inference, scalable offloading, and the model-adaptive engine. It can effectively scale DL model, offloading, and engine actions across diverse platforms and tasks.
- Score: 19.229115339238803
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many deep learning (DL) powered mobile and wearable applications today continuously and unobtrusively sense the ambient surroundings to enhance all aspects of human lives. To enable robust and private mobile sensing, DL models are often deployed locally on resource-constrained mobile devices using techniques such as model compression or offloading. However, existing methods at either the front-end algorithm level (i.e., DL model compression/partitioning) or the back-end scheduling level (i.e., operator/resource scheduling) cannot adapt online on the device: they require offline retraining to ensure accuracy or rely on manually pre-defined strategies, and thus struggle with dynamic adaptability. The primary challenge lies in feeding runtime performance observed at the back-end level back into front-end optimization decisions. Moreover, adaptive mobile DL model porting middleware with cross-level co-adaptation remains under-explored, particularly in mobile environments marked by diversity and dynamics. In response, we introduce CrowdHMTware, a dynamic context-adaptive DL model deployment middleware for heterogeneous mobile devices. It establishes an automated adaptation loop between cross-level functional components, i.e., elastic inference, scalable offloading, and the model-adaptive engine, enhancing scalability and adaptability. Experiments with four typical tasks across 15 platforms and a real-world case study demonstrate that CrowdHMTware can effectively scale DL model, offloading, and engine actions across diverse platforms and tasks. It hides run-time system issues from developers, reducing the required developer expertise.
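To make the loop concrete, below is a minimal, self-contained sketch of how back-end runtime measurements can drive front-end model-scaling and offloading decisions. Every name and threshold here (RuntimeMonitor, ElasticModel, choose_placement, the latency/memory budgets) is a hypothetical illustration, not CrowdHMTware's actual API:

```python
import random
import time

# Hypothetical sketch of a cross-level co-adaptation loop: back-end runtime
# metrics (latency, memory) feed front-end actions (model scaling, offloading).

LATENCY_BUDGET_MS = 50.0
MEMORY_BUDGET_MB = 512.0

class RuntimeMonitor:
    """Back-end level: observes per-inference latency and memory footprint."""
    def measure(self):
        # Stand-in for real profiling hooks inside the inference engine.
        return {"latency_ms": random.uniform(20, 120),
                "memory_mb": random.uniform(200, 800)}

class ElasticModel:
    """Front-end level: a model that can shrink or grow (e.g., width multipliers)."""
    def __init__(self):
        self.scales = [0.25, 0.5, 0.75, 1.0]
        self.idx = len(self.scales) - 1          # start at full capacity

    def shrink(self):
        self.idx = max(0, self.idx - 1)

    def grow(self):
        self.idx = min(len(self.scales) - 1, self.idx + 1)

def choose_placement(metrics):
    """Scalable offloading: push computation to an edge server under heavy load."""
    return "edge" if metrics["latency_ms"] > 2 * LATENCY_BUDGET_MS else "local"

def adaptation_loop(model, monitor, steps=10):
    for _ in range(steps):
        metrics = monitor.measure()              # back-end feedback
        placement = choose_placement(metrics)    # offloading decision
        if (metrics["latency_ms"] > LATENCY_BUDGET_MS
                or metrics["memory_mb"] > MEMORY_BUDGET_MB):
            model.shrink()                       # front-end: downscale the model
        else:
            model.grow()                         # headroom: restore accuracy
        print(f"scale={model.scales[model.idx]:.2f} placement={placement} "
              f"latency={metrics['latency_ms']:.0f}ms")
        time.sleep(0.01)

adaptation_loop(ElasticModel(), RuntimeMonitor())
```

The property mirrored from the abstract is the feedback edge: algorithm-level decisions are taken only after observing scheduling-level measurements, rather than from an offline, pre-defined strategy.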
Related papers
- Dynamic Allocation Hypernetwork with Adaptive Model Recalibration for Federated Continual Learning [49.508844889242425]
We propose a novel server-side FCL pattern in the medical domain, Dynamic Allocation Hypernetwork with adaptive model recalibration (FedDAH).
FedDAH is designed to facilitate collaborative learning under the distinct and dynamic task streams across clients.
For the biased optimization, we introduce a novel adaptive model recalibration (AMR) to incorporate the candidate changes of historical models into current server updates.
arXiv Detail & Related papers (2025-03-25T00:17:47Z)
- Dynamic Allocation Hypernetwork with Adaptive Model Recalibration for FCL [49.508844889242425]
We propose a novel server-side FCL pattern in the medical domain, Dynamic Allocation Hypernetwork with adaptive model recalibration (FedDAH).
For the biased optimization, we introduce a novel adaptive model recalibration (AMR) to incorporate the candidate changes of historical models into current server updates.
Experiments on the AMOS dataset demonstrate the superiority of our FedDAH to other FCL methods on sites with different task streams.
arXiv Detail & Related papers (2025-03-23T13:12:56Z)
- AutoHete: An Automatic and Efficient Heterogeneous Training System for LLMs [68.99086112477565]
Transformer-based large language models (LLMs) have demonstrated exceptional capabilities in sequence modeling and text generation.
Existing heterogeneous training methods significantly expand the scale of trainable models but introduce substantial communication overheads and CPU workloads.
We propose AutoHete, an automatic and efficient heterogeneous training system compatible with both single-GPU and multi-GPU environments.
arXiv Detail & Related papers (2025-02-27T14:46:22Z)
- AdaScale: Dynamic Context-aware DNN Scaling via Automated Adaptation Loop on Mobile Devices [16.5444553304756]
We introduce AdaScale, an elastic inference framework that automates the adaptation of deep models to dynamic contexts.
AdaScale significantly enhances accuracy by 5.09%, reduces training overhead by 66.89%, speeds up inference latency by 1.51 to 6.2 times, and lowers energy costs by 4.69 times.
arXiv Detail & Related papers (2024-12-01T08:33:56Z)
- Multi-Stream Cellular Test-Time Adaptation of Real-Time Models Evolving in Dynamic Environments [53.79708667153109]
Smart objects, notably autonomous vehicles, face challenges in critical local computations due to limited resources.
We propose a novel Multi-Stream Cellular Test-Time Adaptation setup where models adapt on the fly to a dynamic environment divided into cells.
We validate our methodology in the context of autonomous vehicles navigating across cells defined based on location and weather conditions.
arXiv Detail & Related papers (2024-04-27T15:00:57Z)
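As a hedged illustration of the per-cell idea, the sketch below keeps one adapted parameter set per environment cell and runs a standard entropy-minimization test-time-adaptation step (in the style of Tent) on whichever cell's stream is active. The class, objective, and cell labels are illustrative assumptions, not the paper's implementation:

```python
import copy
import torch
import torch.nn as nn

def entropy(logits):
    # Unsupervised TTA objective: mean prediction entropy over the batch.
    p = logits.softmax(dim=1)
    return -(p * p.clamp_min(1e-12).log()).sum(dim=1).mean()

class CellularTTA:
    """Keeps one running adaptation state per cell (e.g., location x weather)."""
    def __init__(self, base_model, lr=1e-3):
        self.model = base_model
        self.base_state = copy.deepcopy(base_model.state_dict())
        self.cell_states = {}                    # cell id -> adapted parameters
        self.lr = lr

    def adapt_step(self, cell, x):
        # Restore this cell's running adaptation, or the pristine base model.
        self.model.load_state_dict(self.cell_states.get(cell, self.base_state))
        # Update only normalization affine parameters, a common TTA choice.
        bn_params = [p for m in self.model.modules()
                     if isinstance(m, nn.BatchNorm1d) for p in m.parameters()]
        opt = torch.optim.SGD(bn_params, lr=self.lr)
        logits = self.model(x)
        loss = entropy(logits)
        opt.zero_grad()
        loss.backward()
        opt.step()
        self.cell_states[cell] = copy.deepcopy(self.model.state_dict())
        return logits.detach()

model = nn.Sequential(nn.Linear(8, 16), nn.BatchNorm1d(16), nn.ReLU(),
                      nn.Linear(16, 4))
tta = CellularTTA(model)
for cell in ["sunny_downtown", "rainy_downtown", "sunny_downtown"]:
    preds = tta.adapt_step(cell, torch.randn(32, 8))
```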
- MMA-DFER: MultiModal Adaptation of unimodal models for Dynamic Facial Expression Recognition in-the-wild [81.32127423981426]
Multimodal emotion recognition based on audio and video data is important for real-world applications.
Recent methods have focused on exploiting advances of self-supervised learning (SSL) for pre-training of strong multimodal encoders.
We propose a different perspective on the problem and investigate the advancement of multimodal DFER performance by adapting SSL-pre-trained disjoint unimodal encoders.
arXiv Detail & Related papers (2024-04-13T13:39:26Z)
- Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters [65.15700861265432]
We present a parameter-efficient continual learning framework to alleviate long-term forgetting in incremental learning with vision-language models.
Our approach involves the dynamic expansion of a pre-trained CLIP model, through the integration of Mixture-of-Experts (MoE) adapters.
To preserve the zero-shot recognition capability of vision-language models, we introduce a Distribution Discriminative Auto-Selector.
arXiv Detail & Related papers (2024-03-18T08:00:23Z)
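The dynamic-expansion mechanism can be pictured with a short sketch: features from a frozen backbone pass through a residual adapter whose bottleneck experts are mixed by a learned router, and a new expert is appended when a new task arrives. This module is an illustrative assumption, not the paper's code, and it omits the Distribution Discriminative Auto-Selector:

```python
import torch
import torch.nn as nn

class MoEAdapter(nn.Module):
    """Residual adapter whose bottleneck experts are mixed by a learned router."""
    def __init__(self, dim, hidden=64, n_initial_experts=2):
        super().__init__()
        self.dim, self.hidden = dim, hidden
        self.experts = nn.ModuleList()
        self.router = None
        for _ in range(n_initial_experts):
            self.add_expert()

    def add_expert(self):
        # Dynamic expansion: append one bottleneck expert for a new task and
        # grow the router by one output, preserving the old routing rows.
        self.experts.append(nn.Sequential(
            nn.Linear(self.dim, self.hidden), nn.ReLU(),
            nn.Linear(self.hidden, self.dim)))
        old = None if self.router is None else self.router.weight.data
        self.router = nn.Linear(self.dim, len(self.experts), bias=False)
        if old is not None:
            with torch.no_grad():
                self.router.weight[:old.shape[0]] = old

    def forward(self, feats):
        gates = self.router(feats).softmax(dim=-1)                 # (B, E)
        expert_out = torch.stack([e(feats) for e in self.experts],
                                 dim=1)                            # (B, E, D)
        return feats + (gates.unsqueeze(-1) * expert_out).sum(dim=1)

adapter = MoEAdapter(dim=512)
feats = torch.randn(4, 512)        # e.g., features from a frozen CLIP encoder
print(adapter(feats).shape)        # torch.Size([4, 512])
adapter.add_expert()               # expand when a new incremental task arrives
print(adapter(feats).shape)        # still torch.Size([4, 512])
```

Dense gating over all experts keeps the sketch simple; a top-k router is a common sparser variant of the same idea.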
- Enabling Resource-efficient AIoT System with Cross-level Optimization: A survey [20.360136850102833]
This survey aims to provide a broader optimization space for freer resource-performance tradeoffs.
By consolidating problems and techniques scattered over diverse levels, we aim to help readers understand their connections and stimulate further discussions.
arXiv Detail & Related papers (2023-09-27T08:04:24Z)
- FLuID: Mitigating Stragglers in Federated Learning using Invariant Dropout [1.8262547855491458]
Federated Learning allows machine learning models to train locally on individual mobile devices, synchronizing model updates via a shared server.
Because participant devices vary widely in performance, straggler devices with lower performance often dictate the overall training time in FL.
We introduce Invariant Dropout, a method that extracts a sub-model based on the weight update threshold.
We develop an adaptive training framework, Federated Learning using Invariant Dropout (FLuID).
arXiv Detail & Related papers (2023-07-05T19:53:38Z)
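The weight-update-threshold idea summarized above admits a compact sketch: weights whose recent updates are smallest are treated as invariant and dropped from the sub-model shipped to a straggler. The function below and its sizing heuristic are illustrative assumptions, not FLuID's exact procedure:

```python
import numpy as np

def invariant_dropout_mask(prev_weights, new_weights, keep_fraction):
    """Keep the weights with the largest recent updates; drop 'invariant' ones."""
    update_magnitude = np.abs(new_weights - prev_weights)
    threshold = np.quantile(update_magnitude, 1.0 - keep_fraction)
    return update_magnitude >= threshold   # True = keep in the straggler sub-model

rng = np.random.default_rng(0)
w_prev = rng.normal(size=1000)                       # last round's global weights
w_new = w_prev + rng.normal(scale=0.01, size=1000)   # after the current round

# Illustrative sizing: a straggler running at ~60% of typical device speed
# receives a sub-model with ~60% of the weights.
mask = invariant_dropout_mask(w_prev, w_new, keep_fraction=0.6)
sub_model = w_new[mask]
print(f"sub-model keeps {mask.mean():.0%} of weights")
```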
- Mobiprox: Supporting Dynamic Approximate Computing on Mobiles [9.012472705158592]
We present Mobiprox, a framework enabling mobile deep learning with flexible precision.
Mobiprox implements tunable approximations of tensor operations and enables runtime-adaptable approximation of individual network layers.
We demonstrate that it can save up to 15% system-wide energy with a minimal impact on the inference accuracy.
arXiv Detail & Related papers (2023-03-16T21:40:23Z)
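Runtime-adaptable approximation of individual layers can be pictured as a tensor operation that executes either exactly or through a cheaper approximation selected at run time. The low-rank scheme below is an illustrative stand-in chosen for brevity, not Mobiprox's actual set of tunable approximations:

```python
import numpy as np

class ApproxLinear:
    """A layer whose matmul can run exactly or at a runtime-chosen low rank."""
    def __init__(self, weight):
        self.weight = weight                          # (out, in) exact kernel
        u, s, vt = np.linalg.svd(weight, full_matrices=False)
        self.u, self.s, self.vt = u, s, vt            # cached SVD factors

    def __call__(self, x, rank=None):
        if rank is None:                              # exact tensor operation
            return x @ self.weight.T
        # Approximate operation: rank-r factors cut multiply-accumulate work.
        u, s, vt = self.u[:, :rank], self.s[:rank], self.vt[:rank]
        return ((x @ vt.T) * s) @ u.T

rng = np.random.default_rng(1)
layer = ApproxLinear(rng.normal(size=(256, 512)))
x = rng.normal(size=(8, 512))

# A runtime controller could lower per-layer ranks when energy is scarce,
# trading a small accuracy loss for cheaper inference.
exact = layer(x)
for rank in (128, 32):
    err = np.linalg.norm(layer(x, rank) - exact) / np.linalg.norm(exact)
    print(f"rank={rank}: relative error {err:.3f}")
```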
- Consolidating Kinematic Models to Promote Coordinated Mobile Manipulations [96.03270112422514]
We construct a Virtual Kinematic Chain (VKC) that consolidates the kinematics of the mobile base, the arm, and the object to be manipulated in mobile manipulations.
A mobile manipulation task is represented by altering the state of the constructed VKC, which can be converted to a motion planning problem.
arXiv Detail & Related papers (2021-08-03T02:59:41Z)
- OODIn: An Optimised On-Device Inference Framework for Heterogeneous Mobile Devices [5.522962791793502]
OODIn is a framework for the optimised deployment of deep learning apps across heterogeneous mobile devices.
It counteracts the variability in device resources and DL models by means of a highly parametrised multi-layer design.
It delivers up to 4.3x and 3.5x performance gain over highly optimised platform- and model-aware designs.
arXiv Detail & Related papers (2021-06-08T22:38:18Z)