PhoneLM:an Efficient and Capable Small Language Model Family through Principled Pre-training
- URL: http://arxiv.org/abs/2411.05046v1
- Date: Thu, 07 Nov 2024 02:19:00 GMT
- Title: PhoneLM:an Efficient and Capable Small Language Model Family through Principled Pre-training
- Authors: Rongjie Yi, Xiang Li, Weikai Xie, Zhenyan Lu, Chenghua Wang, Ao Zhou, Shangguang Wang, Xiwen Zhang, Mengwei Xu,
- Abstract summary: Existing small language models (SLM) for on-device deployment don't consider device hardware characteristics.
This work presents a simple yet effective principle for SLM design: architecture searching for (near-)optimal runtime efficiency before pre-training.
We develop PhoneLM family (currently with 0.5B and 1.5B versions), that acheive the state-of-the-art capability-efficiency tradeoff among those with similar parameter size.
- Score: 6.827011856777674
- License:
- Abstract: The interest in developing small language models (SLM) for on-device deployment is fast growing. However, the existing SLM design hardly considers the device hardware characteristics. Instead, this work presents a simple yet effective principle for SLM design: architecture searching for (near-)optimal runtime efficiency before pre-training. Guided by this principle, we develop PhoneLM SLM family (currently with 0.5B and 1.5B versions), that acheive the state-of-the-art capability-efficiency tradeoff among those with similar parameter size. We fully open-source the code, weights, and training datasets of PhoneLM for reproducibility and transparency, including both base and instructed versions. We also release a finetuned version of PhoneLM capable of accurate Android Intent invocation, and an end-to-end Android demo. All materials are available at https://github.com/UbiquitousLearning/PhoneLM.
Related papers
- SlimLM: An Efficient Small Language Model for On-Device Document Assistance [60.971107009492606]
We present SlimLM, a series of SLMs optimized for document assistance tasks on mobile devices.
SlimLM is pre-trained on SlimPajama-627B and fine-tuned on DocAssist.
We evaluate SlimLM against existing SLMs, showing comparable or superior performance.
arXiv Detail & Related papers (2024-11-15T04:44:34Z) - ULLME: A Unified Framework for Large Language Model Embeddings with Generation-Augmented Learning [72.90823351726374]
We introduce the Unified framework for Large Language Model Embedding (ULLME), a flexible, plug-and-play implementation that enables bidirectional attention across various LLMs.
We also propose Generation-augmented Representation Learning (GRL), a novel fine-tuning method to boost LLMs for text embedding tasks.
To showcase our framework's flexibility and effectiveness, we release three pre-trained models from ULLME with different backbone architectures.
arXiv Detail & Related papers (2024-08-06T18:53:54Z) - ST-LLM: Large Language Models Are Effective Temporal Learners [58.79456373423189]
Large Language Models (LLMs) have showcased impressive capabilities in text comprehension and generation.
How to effectively encode and understand videos in video-based dialogue systems remains to be solved.
We propose ST-LLM, an effective video-LLM baseline with spatial-temporal sequence modeling inside LLM.
arXiv Detail & Related papers (2024-03-30T10:11:26Z) - MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT [87.4910758026772]
"Bigger the better" has been the predominant trend in recent Large Language Models (LLMs) development.
This paper explores the "less is more" paradigm by addressing the challenge of designing accurate yet efficient Small Language Models (SLMs) for resource constrained devices.
arXiv Detail & Related papers (2024-02-26T18:59:03Z) - MobileVLM : A Fast, Strong and Open Vision Language Assistant for Mobile
Devices [73.46317110474064]
MobileVLM is a competent multimodal vision language model (MMVLM) targeted to run on mobile devices.
It comprises a set of language models at the scale of 1.4B and 2.7B parameters, trained from scratch, a multimodal vision model that is pre-trained in the CLIP fashion.
arXiv Detail & Related papers (2023-12-28T08:21:24Z) - Confidant: Customizing Transformer-based LLMs via Collaborative Edge
Training [18.526329975259483]
Transformer-based large language models (LLMs) have demonstrated impressive capabilities in a variety of natural language processing (NLP) tasks.
It is challenging to deploy and fine-tune LLMs on mobile edge devices with limited computing, memory, and energy budgets.
We propose Confidant, a multi-backend collaborative training framework for customizing state-of-the-art LLMs on commodity mobile devices.
arXiv Detail & Related papers (2023-11-22T13:20:59Z) - InfMLLM: A Unified Framework for Visual-Language Tasks [44.29407348046122]
multimodal large language models (MLLMs) have attracted growing interest.
This work delves into enabling LLMs to tackle more vision-language-related tasks.
InfMLLM achieves either state-of-the-art (SOTA) performance or performance comparable to recent MLLMs.
arXiv Detail & Related papers (2023-11-12T09:58:16Z) - AutoDroid: LLM-powered Task Automation in Android [32.241570727243534]
We introduce AutoDroid, a mobile task automation system capable of handling arbitrary tasks on any Android application without manual efforts.
The main components include a functionality-aware UI representation method that bridges the UI with the LLM.
We evaluate its performance on a new benchmark for memory-augmented Android task automation with 158 common tasks.
arXiv Detail & Related papers (2023-08-29T13:02:30Z) - Generative Multimodal Entity Linking [24.322540112710918]
Multimodal Entity Linking (MEL) is the task of mapping mentions with multimodal contexts to referent entities from a knowledge base.
Existing MEL methods mainly focus on designing complex multimodal interaction mechanisms and require fine-tuning all model parameters.
We propose GEMEL, a Generative Multimodal Entity Linking framework based on Large Language Models (LLMs)
Our framework is compatible with any off-the-shelf language model, paving the way towards an efficient and general solution.
arXiv Detail & Related papers (2023-06-22T07:57:19Z) - LLM-Pruner: On the Structural Pruning of Large Language Models [65.02607075556742]
Large language models (LLMs) have shown remarkable capabilities in language understanding and generation.
We tackle the compression of LLMs within the bound of two constraints: being task-agnostic and minimizing the reliance on the original training dataset.
Our method, named LLM-Pruner, adopts structural pruning that selectively removes non-critical coupled structures.
arXiv Detail & Related papers (2023-05-19T12:10:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.