Continual Learning Using Only Large Language Model Prompting
- URL: http://arxiv.org/abs/2412.15479v1
- Date: Fri, 20 Dec 2024 01:21:57 GMT
- Title: Continual Learning Using Only Large Language Model Prompting
- Authors: Jiabao Qiu, Zixuan Ke, Bing Liu
- Abstract summary: We introduce CLOB, a novel continual learning paradigm wherein a large language model (LLM) is regarded as a black box.
We also propose a new CL technique, called CIS, based on incremental summarization that also overcomes the LLM's input length limit.
- Score: 13.987306383667518
- License:
- Abstract: We introduce CLOB, a novel continual learning (CL) paradigm wherein a large language model (LLM) is regarded as a black box. Learning is done incrementally via only verbal prompting. CLOB does not fine-tune any part of the LLM or add any trainable parameters to it. It is particularly suitable for LLMs that are accessible via APIs. We also propose a new CL technique, called CIS, based on incremental summarization that also overcomes the LLM's input length limit. Experiments show CIS outperforms baselines by a very large margin.
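Since the abstract describes CLOB and CIS only at a high level, the following is a minimal sketch of how prompting-only continual learning with incremental summarization could look, assuming a black-box `llm(prompt)` call and class-incremental text classification. The `CISClassifier` interface and the prompt wording are illustrative assumptions, not the authors' actual prompts.

```python
# Sketch of the CLOB/CIS idea from the abstract: the LLM is a black box reached
# only through prompts, and each class is represented by a short natural-language
# summary that is updated incrementally instead of storing all examples.
# NOTE: `llm` is a placeholder for any API-based LLM call; the prompts below are
# assumptions for illustration, not the paper's actual prompts.

from typing import Dict, List

def llm(prompt: str) -> str:
    """Black-box LLM call (e.g., an API client); implementation is assumed."""
    raise NotImplementedError

class CISClassifier:
    def __init__(self) -> None:
        self.summaries: Dict[str, str] = {}  # class label -> running summary

    def learn_task(self, task_data: Dict[str, List[str]]) -> None:
        """Incrementally absorb a new task's labeled examples via prompting only."""
        for label, examples in task_data.items():
            old = self.summaries.get(label, "(no summary yet)")
            prompt = (
                f"Current summary of class '{label}':\n{old}\n\n"
                f"New examples of class '{label}':\n" + "\n".join(examples) + "\n\n"
                "Update the summary so it stays short but covers the new examples."
            )
            # The summary stays compact, so later prompts fit the input length limit.
            self.summaries[label] = llm(prompt)

    def classify(self, text: str) -> str:
        """Classify by prompting with class summaries instead of raw examples."""
        desc = "\n".join(f"- {label}: {s}" for label, s in self.summaries.items())
        prompt = (
            f"Class summaries:\n{desc}\n\n"
            f"Text: {text}\n"
            "Which class does the text belong to? Answer with the label only."
        )
        return llm(prompt).strip()
```

Because each class is carried forward as a compact summary rather than a growing set of raw examples, the classification prompt stays bounded no matter how many tasks have been seen, which is the role the abstract attributes to incremental summarization.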
Related papers
- In-context Continual Learning Assisted by an External Continual Learner [19.382196203113836]
Existing continual learning (CL) methods rely on fine-tuning or adapting large language models (LLMs).
We introduce InCA, a novel approach that integrates an external continual learner (ECL) with in-context learning (ICL) to enable scalable CL without catastrophic forgetting (CF); a minimal sketch of this idea appears after this list.
arXiv Detail & Related papers (2024-12-20T04:44:41Z)
- PathOCL: Path-Based Prompt Augmentation for OCL Generation with GPT-4 [10.564949684320727]
We introduce PathOCL, a novel path-based prompt augmentation technique designed to facilitate Object Constraint Language generation.
Our findings demonstrate that PathOCL, compared to augmenting the complete class model (UML-Augmentation), generates a higher number of valid and correct OCL constraints.
arXiv Detail & Related papers (2024-05-21T02:00:54Z)
- Word Embeddings Revisited: Do LLMs Offer Something New? [2.822851601000061]
Learning meaningful word embeddings is key to training a robust language model.
The recent rise of Large Language Models (LLMs) has provided us with many new word/sentence/document embedding models.
arXiv Detail & Related papers (2024-02-16T21:47:30Z)
- InfLLM: Training-Free Long-Context Extrapolation for LLMs with an Efficient Context Memory [93.20588235940453]
In this paper, we introduce a training-free memory-based method, InfLLM.
InfLLM stores distant contexts into additional memory units and employs an efficient mechanism to lookup token-relevant units for attention.
Even when the sequence length is scaled to 1,024K, InfLLM still effectively captures long-distance dependencies.
arXiv Detail & Related papers (2024-02-07T06:50:42Z)
- LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning [67.39585115936329]
We argue that LLMs have inherent capabilities to handle long contexts without fine-tuning.
We propose SelfExtend to extend the context window of LLMs by constructing bi-level attention information.
We conduct comprehensive experiments on multiple benchmarks and the results show that our SelfExtend can effectively extend existing LLMs' context window length.
arXiv Detail & Related papers (2024-01-02T18:30:51Z)
- Supervised Knowledge Makes Large Language Models Better In-context Learners [94.89301696512776]
Large Language Models (LLMs) exhibit emerging in-context learning abilities through prompt engineering.
The challenge of improving the generalizability and factuality of LLMs in natural language understanding and question answering remains under-explored.
We propose a framework that enhances the reliability of LLMs as it: 1) generalizes to out-of-distribution data, 2) elucidates how LLMs benefit from discriminative models, and 3) minimizes hallucinations in generative tasks.
arXiv Detail & Related papers (2023-12-26T07:24:46Z)
- AlignedCoT: Prompting Large Language Models via Native-Speaking Demonstrations [52.43593893122206]
AlignedCoT is an in-context learning technique for prompting Large Language Models.
It achieves consistent and correct step-wise prompts in zero-shot scenarios.
We conduct experiments on mathematical reasoning and commonsense reasoning.
arXiv Detail & Related papers (2023-11-22T17:24:21Z)
- Link-Context Learning for Multimodal LLMs [40.923816691928536]
Link-context learning (LCL) emphasizes "reasoning from cause and effect" to augment the learning capabilities of MLLMs.
LCL guides the model to discern not only the analogy but also the underlying causal associations between data points.
To facilitate the evaluation of this novel approach, we introduce the ISEKAI dataset.
arXiv Detail & Related papers (2023-08-15T17:33:24Z)
- LLM-Pruner: On the Structural Pruning of Large Language Models [65.02607075556742]
Large language models (LLMs) have shown remarkable capabilities in language understanding and generation.
We tackle the compression of LLMs within the bound of two constraints: being task-agnostic and minimizing the reliance on the original training dataset.
Our method, named LLM-Pruner, adopts structural pruning that selectively removes non-critical coupled structures.
arXiv Detail & Related papers (2023-05-19T12:10:53Z)
- Augmented Large Language Models with Parametric Knowledge Guiding [72.71468058502228]
Large Language Models (LLMs) have significantly advanced natural language processing (NLP) with their impressive language understanding and generation capabilities.
Their performance may be suboptimal for domain-specific tasks that require specialized knowledge due to limited exposure to the related data.
We propose the novel Parametric Knowledge Guiding (PKG) framework, which equips LLMs with a knowledge-guiding module to access relevant knowledge.
arXiv Detail & Related papers (2023-05-08T15:05:16Z)
- Auto-MLM: Improved Contrastive Learning for Self-supervised Multi-lingual Knowledge Retrieval [7.73633850933515]
We introduce a joint training method by combining CL and Auto-MLM for self-supervised multi-lingual knowledge retrieval.
Experimental results show that our proposed approach consistently outperforms all the previous SOTA methods on both the LAZADA service corpus and openly available corpora in 8 languages.
arXiv Detail & Related papers (2022-03-30T10:13:57Z)
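As a follow-up to the InCA entry at the top of this list, here is a minimal sketch of the general idea: an external continual learner (ECL) pre-selects a few candidate classes for each input, so the in-context prompt stays short and the LLM itself is never modified. The nearest-class-mean ECL, the `embed` and `llm` helpers, and the prompt wording are illustrative assumptions rather than the paper's exact design.

```python
# Sketch of the InCA idea: an external continual learner narrows each test
# instance to a few candidate classes, and only those classes go into the
# in-context prompt. The LLM is never fine-tuned, so there is no forgetting.
# NOTE: `embed`, `llm`, and the nearest-class-mean ECL are assumptions made
# for illustration, not the paper's exact components.

from typing import Dict, List
import numpy as np

def embed(text: str) -> np.ndarray:
    """Sentence embedding used by the ECL (assumed: any off-the-shelf encoder)."""
    raise NotImplementedError

def llm(prompt: str) -> str:
    """Black-box LLM call (assumed API client)."""
    raise NotImplementedError

class ExternalContinualLearner:
    """Per-class means added incrementally; new classes never disturb old ones."""

    def __init__(self) -> None:
        self.means: Dict[str, np.ndarray] = {}
        self.examples: Dict[str, List[str]] = {}

    def add_class(self, label: str, examples: List[str]) -> None:
        vecs = np.stack([embed(e) for e in examples])
        self.means[label] = vecs.mean(axis=0)
        self.examples[label] = examples[:3]  # keep a few demonstrations per class

    def top_k(self, text: str, k: int = 3) -> List[str]:
        v = embed(text)
        dists = {label: float(np.linalg.norm(v - m)) for label, m in self.means.items()}
        return sorted(dists, key=dists.get)[:k]

def classify_with_icl(ecl: ExternalContinualLearner, text: str) -> str:
    """Build an ICL prompt from only the ECL's candidate classes, then ask the LLM."""
    candidates = ecl.top_k(text)
    demos = "\n".join(
        f"Example of '{label}': {ex}" for label in candidates for ex in ecl.examples[label]
    )
    prompt = (
        f"{demos}\n\nText: {text}\n"
        f"Which of these classes does the text belong to: {', '.join(candidates)}? "
        "Answer with the label only."
    )
    return llm(prompt).strip()
```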
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.