Combining Cloud and Mobile Computing for Machine Learning
- URL: http://arxiv.org/abs/2402.04880v2
- Date: Fri, 23 Feb 2024 22:17:22 GMT
- Title: Combining Cloud and Mobile Computing for Machine Learning
- Authors: Ruiqi Xu and Tianchi Zhang
- Abstract summary: We consider model segmentation as a way to improve the user experience.
We show that the division not only reduces the wait time for users but can also be fine-tuned to optimize the workloads of the cloud.
- Score: 2.595189746033637
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although the computing power of mobile devices is increasing, machine
learning models are also growing in size. This trend creates problems for
mobile devices due to limitations like their memory capacity and battery life.
While many services, like ChatGPT and Midjourney, run all the inferences in the
cloud, we believe a flexible and fine-grained task distribution is more
desirable. In this work, we consider model segmentation as a way to improve
the user experience, dividing the computation between mobile devices
and the cloud in a way that offloads the compute-heavy portion of the model
while minimizing the data transfer required. We show that the division not only
reduces the wait time for users but can also be fine-tuned to optimize the
workloads of the cloud. To achieve that, we design a scheduler that collects
information about network quality, client device capability, and job
requirements, making decisions to achieve consistent performance across a range
of devices while reducing the work the cloud needs to perform.
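The abstract does not spell out how the scheduler chooses where to cut the model, but the idea of offloading the compute-heavy tail while minimizing transferred data can be sketched as a latency-driven search over candidate split layers. The sketch below is illustrative only: the `LayerProfile` fields, `choose_split_point`, and the cost model are assumptions, not the paper's actual scheduler.

```python
from dataclasses import dataclass

@dataclass
class LayerProfile:
    device_ms: float      # measured latency of this layer on the client device
    cloud_ms: float       # measured latency of this layer in the cloud
    output_bytes: int     # size of the activation this layer produces

def choose_split_point(layers: list[LayerProfile],
                       input_bytes: int,
                       uplink_bytes_per_ms: float,
                       cloud_queue_ms: float = 0.0) -> tuple[int, float]:
    """Pick the cut that minimizes estimated end-to-end latency.

    Layers [0, split) run on the device, layers [split, n) run in the cloud;
    split == 0 is cloud-only, split == n is fully on-device.
    """
    n = len(layers)
    best_split, best_ms = 0, float("inf")
    for split in range(n + 1):
        device_ms = sum(l.device_ms for l in layers[:split])
        cloud_ms = sum(l.cloud_ms for l in layers[split:])
        if split == n:                       # nothing is sent to the cloud
            transfer_ms, queue_ms = 0.0, 0.0
        else:                                # send the activation at the cut (or the raw input)
            sent = input_bytes if split == 0 else layers[split - 1].output_bytes
            transfer_ms = sent / uplink_bytes_per_ms
            queue_ms = cloud_queue_ms
        total = device_ms + transfer_ms + cloud_ms + queue_ms
        if total < best_ms:
            best_split, best_ms = split, total
    return best_split, best_ms
```

A scheduler in this spirit could rerun the search whenever the measured uplink bandwidth or the cloud's queue length changes, which is one way to aim for consistent latency across heterogeneous devices while trimming the work the cloud performs.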
Related papers
- Managing Bandwidth: The Key to Cloud-Assisted Autonomous Driving [73.55745551827229]
We argue that we can, and must, rely on the cloud for real-time control systems like self-driving cars.
We identify an opportunity to offload parts of time-sensitive and latency-critical compute to the cloud.
arXiv Detail & Related papers (2024-10-21T17:32:36Z) - Efficient Asynchronous Federated Learning with Sparsification and Quantization [55.6801207905772]
Federated Learning (FL) is attracting growing attention as a way to collaboratively train a machine learning model without transferring raw data.
FL generally relies on a parameter server and a large number of edge devices throughout model training.
We propose TEASQ-Fed, which lets edge devices participate asynchronously in the training process by actively applying for tasks.
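The summary names sparsification and quantization but not their exact form; a generic top-k sparsification plus 8-bit quantization of a client update, sketched below with NumPy, illustrates the kind of compression an asynchronous FL client could apply before uploading. All names here are illustrative, not TEASQ-Fed's actual procedure.

```python
import numpy as np

def sparsify_and_quantize(update: np.ndarray, k_ratio: float = 0.01):
    """Keep only the top-k largest-magnitude entries of a client update and
    store them as 8-bit integers plus one float scale."""
    flat = update.astype(np.float32).ravel()
    k = max(1, int(k_ratio * flat.size))
    idx = np.argpartition(np.abs(flat), -k)[-k:]     # positions of the k largest entries
    vals = flat[idx]
    max_abs = float(np.abs(vals).max())
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.round(vals / scale).astype(np.int8)       # uniform 8-bit quantization
    return idx, q, scale

def reconstruct(idx, q, scale, shape):
    """Server-side reconstruction of the sparse, quantized update."""
    out = np.zeros(int(np.prod(shape)), dtype=np.float32)
    out[idx] = q.astype(np.float32) * scale
    return out.reshape(shape)
```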
arXiv Detail & Related papers (2023-12-23T07:47:07Z) - ECLM: Efficient Edge-Cloud Collaborative Learning with Continuous Environment Adaptation [47.35179593006409]
We propose ECLM, an edge-cloud collaborative learning framework for rapid model adaptation for dynamic edge environments.
We show that ECLM significantly improves model performance (e.g., 18.89% accuracy increase) and resource efficiency (e.g., 7.12x communication cost reduction) when adapting models to dynamic edge environments.
arXiv Detail & Related papers (2023-11-18T14:10:09Z) - Mobile-Cloud Inference for Collaborative Intelligence [3.04585143845864]
There is an increasing need for faster execution and lower energy consumption for deep learning model inference.
Historically, the models run on mobile devices have been smaller and simpler in comparison to large state-of-the-art research models, which can only run on the cloud.
Cloud-only inference has drawbacks such as increased network bandwidth consumption and higher latency.
There is an alternative approach: shared mobile-cloud inference.
arXiv Detail & Related papers (2023-06-24T14:22:53Z) - MetaNetwork: A Task-agnostic Network Parameters Generation Framework for Improving Device Model Generalization [65.02542875281233]
We propose a novel task-agnostic framework, named MetaNetwork, for generating adaptive device model parameters from the cloud without on-device training.
The MetaGenerator is designed to learn a mapping function from samples to model parameters, and it can generate and deliver the adaptive parameters to the device based on samples uploaded from the device to the cloud.
The MetaStabilizer aims to reduce the oscillation of the MetaGenerator, accelerate the convergence and improve the model performance during both training and inference.
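The summary only names the two components; a toy, hypernetwork-style sketch of the MetaGenerator idea (mapping a batch of uploaded sample features to the parameters of a small adapter layer) could look like the following PyTorch module. The sizes and module names are assumptions for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

class ToyMetaGenerator(nn.Module):
    """Maps a batch of device sample features to the weights of a small linear
    adapter, roughly in the spirit of a hypernetwork; not the paper's design."""
    def __init__(self, feat_dim: int = 64, adapter_in: int = 64, adapter_out: int = 10):
        super().__init__()
        self.adapter_in, self.adapter_out = adapter_in, adapter_out
        n_params = adapter_in * adapter_out + adapter_out        # adapter weight + bias
        self.encoder = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU())
        self.head = nn.Linear(128, n_params)

    def forward(self, samples: torch.Tensor):
        # samples: (batch, feat_dim) features uploaded from one device
        summary = self.encoder(samples).mean(dim=0)              # one summary vector per device
        flat = self.head(summary)
        w = flat[: self.adapter_in * self.adapter_out].view(self.adapter_out, self.adapter_in)
        b = flat[self.adapter_in * self.adapter_out:]
        return w, b                                              # parameters sent back to the device
```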
arXiv Detail & Related papers (2022-09-12T13:26:26Z) - On-Device Training Under 256KB Memory [62.95579393237751]
We propose an algorithm-system co-design framework to make on-device training possible with only 256KB of memory.
Our framework is the first solution to enable tiny on-device training of convolutional neural networks under 256KB SRAM and 1MB Flash.
arXiv Detail & Related papers (2022-06-30T17:59:08Z) - Optimizing Neural Network for Computer Vision task in Edge Device [0.0]
We deploy a convolutional neural network on the edge device itself.
The computational expense on the edge device is reduced by lowering the floating-point precision of the model's parameters.
This enables the edge device to run neural-network predictions entirely on its own.
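As a hedged illustration of trading floating-point precision for speed and memory, the snippet below applies two standard PyTorch options, fp16 casting and post-training dynamic int8 quantization, to a placeholder model; the model itself and the choice of quantized layers are assumptions, not the paper's setup.

```python
import copy
import torch
import torch.nn as nn

# Placeholder network standing in for the paper's CNN; the precision changes are the point.
model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Flatten(),
                      nn.Linear(16 * 30 * 30, 10))
model.eval()

# Option 1: cast parameters from fp32 to fp16, halving their memory footprint.
fp16_model = copy.deepcopy(model).half()

# Option 2: post-training dynamic quantization of the linear layers to 8-bit integers.
int8_model = torch.quantization.quantize_dynamic(
    copy.deepcopy(model), {nn.Linear}, dtype=torch.qint8)
```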
arXiv Detail & Related papers (2021-10-02T12:25:18Z) - Device-Cloud Collaborative Learning for Recommendation [50.01289274123047]
We propose a novel MetaPatch learning approach on the device side to efficiently achieve "thousands of people with thousands of models" given a centralized cloud model.
With billions of updated personalized device models, we propose a "model-over-models" distillation algorithm, namely MoMoDistill, to update the centralized cloud model.
arXiv Detail & Related papers (2021-04-14T05:06:59Z) - Shared Mobile-Cloud Inference for Collaborative Intelligence [35.103437828235826]
We present a shared mobile-cloud approach to neural model inference.
The strategy can improve inference latency, energy consumption, and network bandwidth usage.
Further performance gain can be achieved by compressing the feature tensor before its transmission.
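A minimal sketch of the transmission step: quantize the intermediate feature tensor to 8 bits and deflate it before upload, then reconstruct it on the cloud side. The quantize-plus-zlib choice here is purely illustrative; the paper's actual compression scheme is not given in this summary.

```python
import zlib
import numpy as np

def compress_features(feat: np.ndarray) -> bytes:
    """Quantize an intermediate activation to uint8 and deflate it before upload."""
    lo, hi = float(feat.min()), float(feat.max())
    scale = (hi - lo) / 255.0 if hi > lo else 1.0
    q = np.round((feat - lo) / scale).astype(np.uint8)
    header = np.array([lo, scale], dtype=np.float32).tobytes()   # 8-byte header
    return header + zlib.compress(q.tobytes())

def decompress_features(blob: bytes, shape) -> np.ndarray:
    """Cloud-side (lossy) reconstruction of the feature tensor."""
    lo, scale = np.frombuffer(blob[:8], dtype=np.float32)
    q = np.frombuffer(zlib.decompress(blob[8:]), dtype=np.uint8).reshape(shape)
    return q.astype(np.float32) * float(scale) + float(lo)
```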
arXiv Detail & Related papers (2020-02-01T07:12:01Z) - Runtime Deep Model Multiplexing for Reduced Latency and Energy Consumption Inference [6.896677899938492]
We propose a learning algorithm to design a light-weight neural multiplexer that calls the model that will consume the minimum compute resources for a successful inference.
Mobile devices can use the proposed algorithm to offload the hard inputs to the cloud while inferring the easy ones locally.
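The multiplexer in the paper is learned; a much simpler stand-in that captures the easy/hard offloading decision is a confidence threshold on the local model's softmax output, as sketched below. The function names, the `cloud_predict` callback, and the threshold value are assumptions.

```python
import torch

def infer_with_offload(x, local_model, cloud_predict, confidence_threshold: float = 0.8):
    """Answer locally when the lightweight model is confident; otherwise offload.

    x is a single input with batch dimension 1, and `cloud_predict` stands in
    for an RPC to the larger server-side model.
    """
    local_model.eval()
    with torch.no_grad():
        probs = torch.softmax(local_model(x), dim=-1)
        confidence, label = probs.max(dim=-1)
    if confidence.item() >= confidence_threshold:
        return label.item()          # "easy" input: keep it on the device
    return cloud_predict(x)          # "hard" input: send it to the cloud
```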
arXiv Detail & Related papers (2020-01-14T23:49:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.