A Performance Evaluation of a Quantized Large Language Model on Various
Smartphones
- URL: http://arxiv.org/abs/2312.12472v1
- Date: Tue, 19 Dec 2023 10:19:39 GMT
- Title: A Performance Evaluation of a Quantized Large Language Model on Various
Smartphones
- Authors: Tolga Çöplü, Marc Loedi, Arto Bendiken, Mykhailo Makohin,
Joshua J. Bouw, Stephen Cobb (Haltia, Inc.)
- Abstract summary: This paper explores the feasibility and performance of on-device large language model (LLM) inference on various Apple iPhone models.
Leveraging existing literature on running multi-billion parameter LLMs on resource-limited devices, our study examines the thermal effects and interaction speeds of a high-performing LLM.
We present real-world performance results, providing insights into on-device inference capabilities.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper explores the feasibility and performance of on-device large
language model (LLM) inference on various Apple iPhone models. Amidst the rapid
evolution of generative AI, on-device LLMs offer solutions to privacy,
security, and connectivity challenges inherent in cloud-based models.
Leveraging existing literature on running multi-billion parameter LLMs on
resource-limited devices, our study examines the thermal effects and
interaction speeds of a high-performing LLM across different smartphone
generations. We present real-world performance results, providing insights into
on-device inference capabilities.
Related papers
- SlimLM: An Efficient Small Language Model for On-Device Document Assistance [60.971107009492606]
We present SlimLM, a series of SLMs optimized for document assistance tasks on mobile devices.
SlimLM is pre-trained on SlimPajama-627B and fine-tuned on DocAssist.
We evaluate SlimLM against existing SLMs, showing comparable or superior performance.
arXiv Detail & Related papers (2024-11-15T04:44:34Z)
- A Survey of Small Language Models [104.80308007044634]
Small Language Models (SLMs) have become increasingly important due to their efficiency and their ability to perform various language tasks with minimal computational resources.
We present a comprehensive survey on SLMs, focusing on their architectures, training techniques, and model compression techniques.
arXiv Detail & Related papers (2024-10-25T23:52:28Z)
- RA-BLIP: Multimodal Adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training [55.54020926284334]
Multimodal Large Language Models (MLLMs) have recently received substantial interest, which shows their emerging potential as general-purpose models for various vision-language tasks.
Retrieval augmentation techniques have proven to be effective plugins for both LLMs and MLLMs.
In this study, we propose multimodal adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training (RA-BLIP), a novel retrieval-augmented framework for various MLLMs.
arXiv Detail & Related papers (2024-10-18T03:45:19Z)
- Large Language Model Performance Benchmarking on Mobile Platforms: A Thorough Evaluation [10.817783356090027]
Large language models (LLMs) increasingly integrate into every aspect of our work and daily lives.
There are growing concerns about user privacy, which push the trend toward local deployment of these models.
Because local deployment is a rapidly emerging application, we are concerned with how these models perform on commercial off-the-shelf mobile devices.
arXiv Detail & Related papers (2024-10-04T17:14:59Z)
- EMMA: Efficient Visual Alignment in Multi-Modal LLMs [56.03417732498859]
EMMA is a lightweight cross-modality module designed to efficiently fuse visual and textual encodings.
EMMA boosts performance across multiple tasks by up to 9.3% while significantly improving robustness against hallucinations.
arXiv Detail & Related papers (2024-10-02T23:00:31Z)
- On-Device Language Models: A Comprehensive Review [26.759861320845467]
This review examines the challenges of deploying computationally expensive large language models on resource-constrained devices.
The paper investigates on-device language models, their efficient architectures, and state-of-the-art compression techniques.
Case studies of on-device language models from major mobile manufacturers demonstrate real-world applications and potential benefits.
arXiv Detail & Related papers (2024-08-26T03:33:36Z)
- A Comprehensive Review of Multimodal Large Language Models: Performance and Challenges Across Different Tasks [74.52259252807191]
Multimodal Large Language Models (MLLMs) address the complexities of real-world applications far beyond the capabilities of single-modality systems.
This paper systematically surveys the applications of MLLMs in multimodal tasks such as natural language, vision, and audio.
arXiv Detail & Related papers (2024-08-02T15:14:53Z)
- Mobile Edge Intelligence for Large Language Models: A Contemporary Survey [32.22789677882933]
Mobile edge intelligence (MEI) provides AI capabilities within the edge of mobile networks with improved privacy and latency relative to cloud computing.
MEI sits between on-device AI and cloud-based AI, featuring wireless communications and more powerful computing resources than end devices.
This article provides a contemporary survey on harnessing MEI for LLMs.
arXiv Detail & Related papers (2024-07-09T13:47:05Z)
- MobileAIBench: Benchmarking LLMs and LMMs for On-Device Use Cases [81.70591346986582]
We introduce MobileAIBench, a benchmarking framework for evaluating Large Language Models (LLMs) and Large Multimodal Models (LMMs) on mobile devices.
MobileAIBench assesses models across different sizes, quantization levels, and tasks, measuring latency and resource consumption on real devices.
arXiv Detail & Related papers (2024-06-12T22:58:12Z)
- A Review of Multi-Modal Large Language and Vision Models [1.9685736810241874]
Large Language Models (LLMs) have emerged as a focal point of research and application.
Recently, LLMs have been extended into multi-modal large language models (MM-LLMs).
This paper provides an extensive review of the current state of those LLMs with multi-modal capabilities as well as the very recent MM-LLMs.
arXiv Detail & Related papers (2024-03-28T15:53:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.