Online Training of Large Language Models: Learn while chatting
- URL: http://arxiv.org/abs/2403.04790v1
- Date: Mon, 4 Mar 2024 10:00:55 GMT
- Title: Online Training of Large Language Models: Learn while chatting
- Authors: Juhao Liang, Ziwei Wang, Zhuoheng Ma, Jianquan Li, Zhiyi Zhang,
Xiangbo Wu and Benyou Wang
- Abstract summary: This paper introduces a novel interaction paradigm-'Online Training using External Interactions'-that merges the benefits of persistent, real-time model updates with the flexibility for individual customization.
- Score: 23.995637621755083
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models(LLMs) have dramatically revolutionized the field of
Natural Language Processing(NLP), offering remarkable capabilities that have
garnered widespread usage. However, existing interaction paradigms between LLMs
and users are constrained by either inflexibility, limitations in
customization, or a lack of persistent learning. This inflexibility is
particularly evident as users, especially those without programming skills,
have restricted avenues to enhance or personalize the model. Existing
frameworks further complicate the model training and deployment process due to
their computational inefficiencies and lack of user-friendly interfaces. To
overcome these challenges, this paper introduces a novel interaction
paradigm-'Online Training using External Interactions'-that merges the benefits
of persistent, real-time model updates with the flexibility for individual
customization through external interactions such as AI agents or online/offline
knowledge bases.
Related papers
- Adaptive Self-Supervised Learning Strategies for Dynamic On-Device LLM Personalization [3.1944843830667766]
Large language models (LLMs) have revolutionized how we interact with technology, but their personalization to individual user preferences remains a significant challenge.
We present Adaptive Self-Supervised Learning Strategies (ASLS), which utilize self-supervised learning techniques to personalize LLMs dynamically.
arXiv Detail & Related papers (2024-09-25T14:35:06Z) - Enabling Real-Time Conversations with Minimal Training Costs [61.80370154101649]
This paper presents a new duplex decoding approach that enhances large language models with duplex ability, requiring minimal training.
Experimental results indicate that our proposed method significantly enhances the naturalness and human-likeness of user-AI interactions with minimal training costs.
arXiv Detail & Related papers (2024-09-18T06:27:26Z) - Constraining Participation: Affordances of Feedback Features in Interfaces to Large Language Models [49.74265453289855]
Large language models (LLMs) are now accessible to anyone with a computer, a web browser, and an internet connection via browser-based interfaces.
This paper examines the affordances of interactive feedback features in ChatGPT's interface, analysing how they shape user input and participation in iteration.
arXiv Detail & Related papers (2024-08-27T13:50:37Z) - MoExtend: Tuning New Experts for Modality and Task Extension [61.29100693866109]
MoExtend is an effective framework designed to streamline the modality adaptation and extension of Mixture-of-Experts (MoE) models.
MoExtend seamlessly integrates new experts into pre-trained MoE models, endowing them with novel knowledge without the need to tune pretrained models.
arXiv Detail & Related papers (2024-08-07T02:28:37Z) - LEGENT: Open Platform for Embodied Agents [60.71847900126832]
We introduce LEGENT, an open, scalable platform for developing embodied agents using Large Language Models (LLMs) and Large Multimodal Models (LMMs)
LEGENT offers a rich, interactive 3D environment with communicable and actionable agents, paired with a user-friendly interface.
In experiments, an embryonic vision-language-action model trained on LEGENT-generated data surpasses GPT-4V in embodied tasks.
arXiv Detail & Related papers (2024-04-28T16:50:12Z) - Prompt-to-OS (P2OS): Revolutionizing Operating Systems and
Human-Computer Interaction with Integrated AI Generative Models [10.892991111926573]
We present a paradigm for human-computer interaction that revolutionizes the traditional notion of an operating system.
Within this innovative framework, user requests issued to the machine are handled by an interconnected ecosystem of generative AI models.
This visionary concept raises significant challenges, including privacy, security, trustability, and the ethical use of generative models.
arXiv Detail & Related papers (2023-10-07T17:16:34Z) - In-context Interference in Chat-based Large Language Models [8.197259049834038]
Large language models (LLMs) have had a huge impact on society due to their impressive capabilities and vast knowledge of the world.
Various applications and tools have been created that allow users to interact with these models in a black-box scenario.
This paper shows how the model can suffer from interference between information that continually flows in the context, causing it to forget previously learned knowledge.
arXiv Detail & Related papers (2023-09-22T09:18:55Z) - When Large Language Models Meet Personalization: Perspectives of
Challenges and Opportunities [60.5609416496429]
The capability of large language models has been dramatically improved.
Such a major leap-forward in general AI capacity will change the pattern of how personalization is conducted.
By leveraging large language models as general-purpose interface, personalization systems may compile user requests into plans.
arXiv Detail & Related papers (2023-07-31T02:48:56Z) - Sparsity-aware neural user behavior modeling in online interaction
platforms [2.4036844268502766]
We develop generalizable neural representation learning frameworks for user behavior modeling.
Our problem settings span transductive and inductive learning scenarios.
We leverage different facets of information reflecting user behavior to enable personalized inference at scale.
arXiv Detail & Related papers (2022-02-28T00:27:11Z) - Learning Adaptive Language Interfaces through Decomposition [89.21937539950966]
We introduce a neural semantic parsing system that learns new high-level abstractions through decomposition.
Users interactively teach the system by breaking down high-level utterances describing novel behavior into low-level steps.
arXiv Detail & Related papers (2020-10-11T08:27:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.