Interweaving Memories of a Siamese Large Language Model
- URL: http://arxiv.org/abs/2412.17383v1
- Date: Mon, 23 Dec 2024 08:33:47 GMT
- Title: Interweaving Memories of a Siamese Large Language Model
- Authors: Xin Song, Zhikai Xue, Guoxiu He, Jiawei Liu, Wei Lu,
- Abstract summary: We propose a model-agnostic PEFT framework, which Interweaves Memories of a Siamese Large Language Model.
Our findings indicate that IMSM maintains comparable time and space efficiency to backbone PEFT methods.
- Score: 9.60026229476874
- License:
- Abstract: Parameter-efficient fine-tuning (PEFT) methods optimize large language models (LLMs) by modifying or introducing a small number of parameters to enhance alignment with downstream tasks. However, they can result in catastrophic forgetting, where LLMs prioritize new knowledge at the expense of comprehensive world knowledge. A promising approach to mitigate this issue is to recall prior memories based on the original knowledge. To this end, we propose a model-agnostic PEFT framework, IMSM, which Interweaves Memories of a Siamese Large Language Model. Specifically, our siamese LLM is equipped with an existing PEFT method. Given an incoming query, it generates two distinct memories based on the pre-trained and fine-tuned parameters. IMSM then incorporates an interweaving mechanism that regulates the contributions of both original and enhanced memories when generating the next token. This framework is theoretically applicable to all open-source LLMs and existing PEFT methods. We conduct extensive experiments across various benchmark datasets, evaluating the performance of popular open-source LLMs using the proposed IMSM, in comparison to both classical and leading PEFT methods. Our findings indicate that IMSM maintains comparable time and space efficiency to backbone PEFT methods while significantly improving performance and effectively mitigating catastrophic forgetting.
Related papers
- LLM-Lasso: A Robust Framework for Domain-Informed Feature Selection and Regularization [59.75242204923353]
We introduce LLM-Lasso, a framework that leverages large language models (LLMs) to guide feature selection in Lasso regression.
LLMs generate penalty factors for each feature, which are converted into weights for the Lasso penalty using a simple, tunable model.
Features identified as more relevant by the LLM receive lower penalties, increasing their likelihood of being retained in the final model.
arXiv Detail & Related papers (2025-02-15T02:55:22Z) - Adaptive Pruning for Large Language Models with Structural Importance Awareness [66.2690963378878]
Large language models (LLMs) have significantly improved language understanding and generation capabilities.
LLMs are difficult to deploy on resource-constrained edge devices due to their high computational and storage resource demands.
We propose structurally-aware adaptive pruning (SAAP) to significantly reduce the computational and memory costs while maintaining model performance.
arXiv Detail & Related papers (2024-12-19T18:08:04Z) - Layer-wise Importance Matters: Less Memory for Better Performance in Parameter-efficient Fine-tuning of Large Language Models [19.163639128631534]
Importance-aware Sparse Tuning (IST) is a plug-and-play technique compatible with various PEFT methods that operate on a per-layer basis.
IST dynamically updates selected layers in PEFT modules, leading to reduced memory demands.
arXiv Detail & Related papers (2024-10-15T16:53:26Z) - LLMEmb: Large Language Model Can Be a Good Embedding Generator for Sequential Recommendation [57.49045064294086]
Large Language Model (LLM) has the ability to capture semantic relationships between items, independent of their popularity.
We introduce LLMEmb, a novel method leveraging LLM to generate item embeddings that enhance Sequential Recommender Systems (SRS) performance.
arXiv Detail & Related papers (2024-09-30T03:59:06Z) - FactorLLM: Factorizing Knowledge via Mixture of Experts for Large Language Models [50.331708897857574]
We introduce FactorLLM, a novel approach that decomposes well-trained dense FFNs into sparse sub-networks without requiring any further modifications.
FactorLLM achieves comparable performance to the source model securing up to 85% model performance while obtaining over a 30% increase in inference speed.
arXiv Detail & Related papers (2024-08-15T16:45:16Z) - SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning [63.93193829913252]
We propose an innovative METL strategy called SHERL for resource-limited scenarios.
In the early route, intermediate outputs are consolidated via an anti-redundancy operation.
In the late route, utilizing minimal late pre-trained layers could alleviate the peak demand on memory overhead.
arXiv Detail & Related papers (2024-07-10T10:22:35Z) - An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models [14.202759186103497]
Multimodal large language models (MLLMs) have demonstrated remarkable capabilities in multimodal tasks.
However, fine-tuning all parameters of MLLMs has become challenging as they usually contain billions of parameters.
This paper conducts empirical studies using four popular PEFT methods to fine-tune the LLM component of open-source MLLMs.
arXiv Detail & Related papers (2024-06-07T17:58:11Z) - Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models [90.14693869269519]
MoE LLMs can achieve higher performance with fewer parameters, but it is still hard to deploy them due to their immense parameter sizes.
This paper mainly aims to enhance the deployment efficiency of MoE LLMs by introducing plug-and-play expert-level sparsification techniques.
arXiv Detail & Related papers (2024-02-22T18:56:07Z) - Exploring Parameter-Efficient Fine-Tuning Techniques for Code Generation with Large Language Models [11.845239346943067]
parameter-efficient fine-tuning (PEFT) is a promising approach to efficiently specialize large language models (LLMs) to task-specific data.
Our study highlights the potential for tuning larger LLMs and significant reductions in memory usage by combining PEFT with quantization.
arXiv Detail & Related papers (2023-08-21T04:31:06Z) - Make Pre-trained Model Reversible: From Parameter to Memory Efficient
Fine-Tuning [6.451743797015637]
We propose memory-efficient fine-tuning (MEFT) for pre-trained language models.
MEFT inserts adapters into a PLM, preserving the PLM's starting point and making it reversible without additional pre-training.
MEFT significantly reduces the activation memory up to 84% of full fine-tuning with a negligible amount of trainable parameters.
arXiv Detail & Related papers (2023-06-01T09:26:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.