PALT: Parameter-Lite Transfer of Language Models for Knowledge Graph
Completion
- URL: http://arxiv.org/abs/2210.13715v1
- Date: Tue, 25 Oct 2022 02:22:29 GMT
- Title: PALT: Parameter-Lite Transfer of Language Models for Knowledge Graph
Completion
- Authors: Jianhao Shen, Chenguang Wang, Ye Yuan, Jiawei Han, Heng Ji, Koushik
Sen, Ming Zhang, Dawn Song
- Abstract summary: This paper presents a parameter-lite transfer learning approach of pretrained language models (LM) for knowledge graph (KG) completion.
Instead of finetuning, which modifies all LM parameters, we only tune a few new parameters while keeping the original LM parameters fixed.
We show that, by tuning far fewer parameters than finetuning, LMs transfer non-trivially to most tasks and reach competitiveness with prior state-of-the-art approaches.
- Score: 108.8941541255567
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents a parameter-lite transfer learning approach of pretrained
language models (LM) for knowledge graph (KG) completion. Instead of
finetuning, which modifies all LM parameters, we only tune a few new parameters
while keeping the original LM parameters fixed. We establish this via
reformulating KG completion as a "fill-in-the-blank" task, and introducing a
parameter-lite encoder on top of the original LMs. We show that, by tuning far
fewer parameters than finetuning, LMs transfer non-trivially to most tasks and
reach competitiveness with prior state-of-the-art approaches. For instance, we
outperform the fully finetuning approaches on a KG completion benchmark by
tuning only 1% of the parameters. The code and datasets are available at
\url{https://github.com/yuanyehome/PALT}.
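As a concrete illustration of the recipe in the abstract (freeze every pretrained LM weight, verbalize a KG triple as a fill-in-the-blank query, and train only a small new module), here is a minimal sketch with a BERT-style masked LM. The bottleneck `lite_encoder`, the verbalization template, and all sizes are illustrative assumptions, not PALT's actual parameter-lite encoder; the linked repository has the real implementation.

```python
# Illustrative sketch only: the adapter design, template, and sizes are
# assumptions, not the paper's parameter-lite encoder (see the repo above).
import torch
import torch.nn as nn
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
lm = AutoModelForMaskedLM.from_pretrained("bert-base-cased")
for p in lm.parameters():          # keep all original LM parameters fixed
    p.requires_grad = False

# Hypothetical stand-in for a parameter-lite encoder:
# a small bottleneck MLP applied to the frozen hidden states.
hidden = lm.config.hidden_size
lite_encoder = nn.Sequential(
    nn.Linear(hidden, hidden // 8),
    nn.GELU(),
    nn.Linear(hidden // 8, hidden),
)

# KG completion as fill-in-the-blank: verbalize (head, relation, ?) with [MASK].
text = "Barack Obama was born in [MASK]."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    states = lm.bert(**inputs).last_hidden_state   # frozen backbone forward

# Residual connection keeps the frozen LM's behaviour as the starting point.
adapted = states + lite_encoder(states)
logits = lm.cls(adapted)                           # reuse the frozen MLM head

mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero()[0, 1]
top5 = logits[0, mask_pos].topk(5).indices
print(tokenizer.convert_ids_to_tokens(top5.tolist()))

trainable = sum(p.numel() for p in lite_encoder.parameters())
total = sum(p.numel() for p in lm.parameters())
print(f"trainable / frozen parameters: {trainable / total:.2%}")
```

During training, only `lite_encoder`'s parameters would be handed to the optimizer, which is what keeps the tuned fraction a small sliver of the full model.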
Related papers
- Vision-Language Model Fine-Tuning via Simple Parameter-Efficient Modification [46.25272949924458]
It is commonly believed that fine-tuning the parameters of VLMs corrupts pre-trained knowledge, since fine-tuning the CLIP model can even degrade performance.
We propose ClipFit, a method for fine-tuning CLIP without introducing the overhead of extra parameters.
We demonstrate that ClipFit can improve the performance of zero-shot CLIP by 7.27% average harmonic mean accuracy.
arXiv Detail & Related papers (2024-09-25T08:07:18Z)
- Federated Full-Parameter Tuning of Billion-Sized Language Models with Communication Cost under 18 Kilobytes [53.4856038354195]
Pre-trained large language models (LLMs) need fine-tuning to improve their responsiveness to natural language instructions.
FedKSeed employs zeroth-order optimization with a finite set of random seeds.
It significantly reduces transmission requirements between the server and clients to just a few random seeds (a toy sketch of this seed-based scheme appears after this list).
arXiv Detail & Related papers (2023-12-11T13:03:21Z)
- QFT: Quantized Full-parameter Tuning of LLMs with Affordable Resources [37.265708531464746]
Large Language Models (LLMs) have showcased remarkable impacts across a wide spectrum of natural language processing tasks.
Fine-tuning these pre-trained models on downstream datasets provides further significant performance gains, but this process has been challenging due to its extraordinary resource requirements.
We propose QFT, a novel Quantized Full-parameter Tuning framework for LLMs that enables memory-efficient fine-tuning without harming performance.
arXiv Detail & Related papers (2023-10-11T02:47:40Z)
- Parameter-Efficient Fine-Tuning without Introducing New Latency [7.631596468553607]
We introduce a novel adapter technique that directly applies the adapter to pre-trained parameters instead of the hidden representation.
Our proposed method attains a new state of the art in both performance and storage efficiency, storing only 0.03% of the parameters of full fine-tuning.
arXiv Detail & Related papers (2023-05-26T08:44:42Z)
- Parameter-efficient Tuning of Large-scale Multimodal Foundation Model [68.24510810095802]
We propose a graceful prompt framework for cross-modal transfer (Aurora) to overcome these challenges.
Considering the redundancy in existing architectures, we first use mode approximation to generate 0.1M trainable parameters that implement the multimodal prompt tuning.
A thorough evaluation on six cross-modal benchmarks shows that it not only outperforms the state-of-the-art but even outperforms the full fine-tuning approach.
arXiv Detail & Related papers (2023-05-15T06:40:56Z)
- Evaluating Parameter-Efficient Transfer Learning Approaches on SURE Benchmark for Speech Understanding [40.27182770995891]
Fine-tuning is widely used as the default algorithm for transfer learning from pre-trained models.
We introduce the Speech UndeRstanding Evaluation (SURE) benchmark for parameter-efficient learning across various speech-processing tasks.
arXiv Detail & Related papers (2023-03-02T08:57:33Z)
- Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning [81.3514358542452]
Few-shot in-context learning (ICL) incurs substantial computational, memory, and storage costs because it involves processing all of the training examples every time a prediction is made.
Parameter-efficient fine-tuning offers an alternative paradigm in which a small set of parameters is trained to enable a model to perform the new task.
In this paper, we rigorously compare few-shot ICL and parameter-efficient fine-tuning and demonstrate that the latter offers better accuracy as well as dramatically lower computational costs.
arXiv Detail & Related papers (2022-05-11T17:10:41Z)
- Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with Language Models [48.0311578882384]
Finetuning language models (LMs) with training examples and task descriptions has been seen as critical to recent successes in few-shot learning.
We show that finetuning LMs in the few-shot setting can considerably reduce the need for prompt engineering.
arXiv Detail & Related papers (2021-06-24T23:38:10Z)
- Prefix-Tuning: Optimizing Continuous Prompts for Generation [85.6357778621526]
Fine-tuning is the de facto way to leverage large pretrained language models to perform downstream tasks.
We propose prefix-tuning, a lightweight alternative to fine-tuning for natural language generation tasks.
We find that by learning only 0.1% of the parameters, prefix-tuning obtains comparable performance in the full data setting (a simplified sketch of the idea appears after this list).
arXiv Detail & Related papers (2021-01-01T08:00:36Z)
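The toy sketch referenced in the FedKSeed entry above: zeroth-order optimization restricted to a finite pool of random seeds. Because every perturbation is regenerated from a shared seed, an update is fully described by a seed index plus one scalar, which is what keeps the communication cost tiny. The model, pool size, and step sizes below are illustrative assumptions, not the FedKSeed implementation.

```python
# Toy sketch of seed-restricted zeroth-order tuning (not the FedKSeed code).
# Only a (seed index, scalar) pair per step would need to cross the network,
# because any party can regenerate the perturbation from the shared seed pool.
import torch

torch.manual_seed(0)
model = torch.nn.Linear(10, 1)            # stand-in for the tuned model
data = torch.randn(64, 10)
target = data @ torch.ones(10, 1)

SEED_POOL = list(range(4096))             # finite, shared set of random seeds
EPS, LR = 1e-3, 1e-2

def loss_fn():
    with torch.no_grad():
        return torch.mean((model(data) - target) ** 2).item()

def perturb(seed, scale):
    # Add scale * z to every parameter, regenerating z from `seed`.
    gen = torch.Generator().manual_seed(seed)
    with torch.no_grad():
        for p in model.parameters():
            p.add_(scale * torch.randn(p.shape, generator=gen))

for step in range(500):
    seed = SEED_POOL[torch.randint(len(SEED_POOL), (1,)).item()]
    perturb(seed, +EPS)
    loss_plus = loss_fn()
    perturb(seed, -2 * EPS)
    loss_minus = loss_fn()
    perturb(seed, +EPS)                   # restore the original parameters
    grad_scalar = (loss_plus - loss_minus) / (2 * EPS)
    # The whole "message" is (seed, grad_scalar); applying it is another perturb.
    perturb(seed, -LR * grad_scalar)

print("final loss:", loss_fn())
```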
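And the simplified sketch referenced in the Prefix-Tuning entry: trainable continuous vectors prepended to the input embeddings of a frozen GPT-2. This is closer to prompt tuning than to the paper's exact method, which prepends trainable key/value vectors at every attention layer; the prefix length and initialization are illustrative assumptions.

```python
# Simplified sketch of the prefix-tuning idea: trainable continuous vectors
# prepended to the input embeddings of a frozen GPT-2 (the actual method
# prepends trainable keys/values at every attention layer instead).
import torch
import torch.nn as nn
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2")
for p in lm.parameters():                 # the pretrained LM stays frozen
    p.requires_grad = False

PREFIX_LEN = 10                           # illustrative prefix length
prefix = nn.Parameter(0.02 * torch.randn(PREFIX_LEN, lm.config.n_embd))

def forward_with_prefix(text):
    ids = tokenizer(text, return_tensors="pt").input_ids
    tok_emb = lm.transformer.wte(ids)                       # frozen embeddings
    inputs_embeds = torch.cat([prefix.unsqueeze(0), tok_emb], dim=1)
    return lm(inputs_embeds=inputs_embeds).logits

logits = forward_with_prefix("translate English to French: cheese =>")
print(logits.shape)                       # (1, PREFIX_LEN + num_tokens, vocab)

trainable = prefix.numel()
total = sum(p.numel() for p in lm.parameters())
print(f"trainable fraction: {trainable / total:.4%}")       # far below 1%
```

Only `prefix` would be passed to the optimizer; the language model itself never changes, which mirrors the parameter-lite philosophy of the main paper.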
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of this information and is not responsible for any consequences of its use.