LLMs as On-demand Customizable Service
- URL: http://arxiv.org/abs/2401.16577v1
- Date: Mon, 29 Jan 2024 21:24:10 GMT
- Title: LLMs as On-demand Customizable Service
- Authors: Souvika Sarkar, Mohammad Fakhruddin Babar, Monowar Hasan, Shubhra Kanti Karmaker (Santu)
- Abstract summary: We introduce a concept of hierarchical, distributed Large Language Models (LLMs).
By introducing a "layered" approach, the proposed architecture enables on-demand accessibility to LLMs as a customizable service.
We envision that the concept of hierarchical LLM will empower extensive, crowd-sourced user bases to harness the capabilities of LLMs.
- Score: 8.440060524215378
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) have demonstrated remarkable language
understanding and generation capabilities. However, training, deploying, and
accessing these models pose notable challenges, including resource-intensive
demands, extended training durations, and scalability issues. To address these
issues, we introduce a concept of hierarchical, distributed LLM architecture
that aims at enhancing the accessibility and deployability of LLMs across
heterogeneous computing platforms, including general-purpose computers (e.g.,
laptops) and IoT-style devices (e.g., embedded systems). By introducing a
"layered" approach, the proposed architecture enables on-demand accessibility
to LLMs as a customizable service. This approach also ensures optimal
trade-offs between the available computational resources and the user's
application needs. We envision that the concept of hierarchical LLM will
empower extensive, crowd-sourced user bases to harness the capabilities of
LLMs, thereby fostering advancements in AI technology in general.
Related papers
- eFedLLM: Efficient LLM Inference Based on Federated Learning [1.6179784294541053]
Large Language Models (LLMs) herald a transformative era in artificial intelligence (AI).
This paper introduces an effective approach that enhances the operational efficiency and affordability of LLM inference.
arXiv Detail & Related papers (2024-11-24T22:50:02Z)
- A Layered Architecture for Developing and Enhancing Capabilities in Large Language Model-based Software Systems [18.615283725693494]
This paper introduces a layered architecture that organizes the development of Large Language Model (LLM) software systems into distinct layers.
By aligning capabilities with these layers, the framework encourages the systematic implementation of capabilities in effective and efficient ways.
arXiv Detail & Related papers (2024-11-19T09:18:20Z)
- Large Language Models for Base Station Siting: Intelligent Deployment based on Prompt or Agent [62.16747639440893]
Large language models (LLMs) and their associated technologies are advancing, particularly in the realms of prompt engineering and agent engineering.
This approach entails the strategic use of well-crafted prompts to infuse human experience and knowledge into these sophisticated LLMs.
This integration represents the future paradigm of artificial intelligence (AI) as a service and AI for more ease.
arXiv Detail & Related papers (2024-08-07T08:43:32Z)
- A General-Purpose Device for Interaction with LLMs [3.052172365469752]
This paper investigates integrating large language models (LLMs) with advanced hardware.
We focus on developing a general-purpose device designed for enhanced interaction with LLMs.
arXiv Detail & Related papers (2024-08-02T23:43:29Z)
- SoupLM: Model Integration in Large Language and Multi-Modal Models [51.12227693121004]
Training large language models (LLMs) requires significant computing resources.
Existing publicly available LLMs are typically pre-trained on diverse, privately curated datasets spanning various tasks.
arXiv Detail & Related papers (2024-07-11T05:38:15Z)
- Efficient Prompting for LLM-based Generative Internet of Things [88.84327500311464]
Large language models (LLMs) have demonstrated remarkable capacities on various tasks, and integrating these capacities into Internet of Things (IoT) applications has recently drawn much research attention.
Due to security concerns, many institutions avoid accessing state-of-the-art commercial LLM services, requiring the deployment and utilization of open-source LLMs in a local network setting.
In this study, we propose an LLM-based Generative IoT (GIoT) system deployed in a local network setting.
arXiv Detail & Related papers (2024-06-14T19:24:00Z)
- Knowledge Fusion of Large Language Models [73.28202188100646]
This paper introduces the notion of knowledge fusion for large language models (LLMs).
We externalize their collective knowledge and unique strengths, thereby elevating the capabilities of the target model beyond those of any individual source LLM.
Our findings confirm that the fusion of LLMs can improve the performance of the target model across a range of capabilities such as reasoning, commonsense, and code generation.
arXiv Detail & Related papers (2024-01-19T05:02:46Z)
- Small LLMs Are Weak Tool Learners: A Multi-LLM Agent [73.54562551341454]
Large Language Model (LLM) agents significantly extend the capabilities of standalone LLMs.
We propose a novel approach that decomposes the aforementioned capabilities into a planner, caller, and summarizer.
This modular framework facilitates individual updates and the potential use of smaller LLMs for building each capability.
arXiv Detail & Related papers (2024-01-14T16:17:07Z)
- Federated Fine-Tuning of LLMs on the Very Edge: The Good, the Bad, the Ugly [62.473245910234304]
This paper takes a hardware-centric approach to explore how Large Language Models can be brought to modern edge computing systems.
We provide a micro-level hardware benchmark, compare the model FLOP utilization to a state-of-the-art data center GPU, and study the network utilization in realistic conditions.
arXiv Detail & Related papers (2023-10-04T20:27:20Z)