Parameter-Efficient Transfer from Sequential Behaviors for User Modeling
and Recommendation
- URL: http://arxiv.org/abs/2001.04253v4
- Date: Tue, 9 Jun 2020 12:36:19 GMT
- Title: Parameter-Efficient Transfer from Sequential Behaviors for User Modeling
and Recommendation
- Authors: Fajie Yuan, Xiangnan He, Alexandros Karatzoglou, Liguang Zhang
- Abstract summary: In this paper, we develop a parameter-efficient transfer learning architecture, termed PeterRec.
PeterRec allows the pre-trained parameters to remain unaltered during fine-tuning by injecting a series of re-learned neural networks.
We perform extensive ablation experiments to show the effectiveness of the learned user representation in five downstream tasks.
- Score: 111.44445634272235
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Inductive transfer learning has had a big impact on computer vision and NLP
domains but has not been used in the area of recommender systems. Even though
there has been a large body of research on generating recommendations based on
modeling user-item interaction sequences, few of them attempt to represent and
transfer these models for serving downstream tasks where only limited data
exists.
In this paper, we delve into the task of effectively learning a single user
representation that can be applied to a diversity of tasks, from cross-domain
recommendations to user profile predictions. Fine-tuning a large pre-trained
network and adapting it to downstream tasks is an effective way to solve such
tasks. However, fine-tuning is parameter-inefficient, since an entire model
needs to be re-trained for every new task. To overcome this issue, we develop
a parameter-efficient transfer learning architecture, termed PeterRec, which
can be configured on the fly for various downstream tasks.
Specifically, PeterRec allows the pre-trained parameters to remain unaltered
during fine-tuning by injecting a series of re-learned neural networks, which
are small but as expressive as learning the entire network. We perform
extensive ablation experiments to show the effectiveness of the learned user
representation in five downstream tasks. Moreover, we show that PeterRec
performs efficient transfer learning in multiple domains, where it achieves
comparable or sometimes better performance relative to fine-tuning all of the
model parameters. Code and datasets are available at
https://github.com/fajieyuan/sigir2020_peterrec.
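As a concrete illustration of the mechanism the abstract describes, the sketch below freezes a pre-trained backbone, grafts a small residual patch network onto each layer, and trains only the patches plus a new task head. This is a minimal PyTorch sketch of the idea, not the actual PeterRec implementation: the module names, bottleneck size, and toy linear backbone are assumptions made here for illustration; the real code is in the repository linked above.

```python
import torch
import torch.nn as nn


class PatchAdapter(nn.Module):
    """Small residual bottleneck module added beside a frozen layer (illustrative)."""

    def __init__(self, dim: int, bottleneck: int = 16):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        nn.init.zeros_(self.up.weight)  # patch starts as an identity mapping
        nn.init.zeros_(self.up.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(torch.relu(self.down(x)))


class PatchedBlock(nn.Module):
    """Wraps one pre-trained layer with a small trainable patch."""

    def __init__(self, pretrained_layer: nn.Module, dim: int):
        super().__init__()
        self.layer = pretrained_layer
        self.patch = PatchAdapter(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.patch(self.layer(x))


# Toy stand-in for the pre-trained sequential user encoder.
dim = 64
backbone = nn.Sequential(*[PatchedBlock(nn.Linear(dim, dim), dim) for _ in range(4)])

# Freeze all pre-trained weights, then re-enable gradients only for the
# patches and a new task-specific head, so the backbone stays unaltered.
for p in backbone.parameters():
    p.requires_grad_(False)
for block in backbone:
    for p in block.patch.parameters():
        p.requires_grad_(True)

task_head = nn.Linear(dim, 10)  # hypothetical downstream classifier
trainable = [p for p in backbone.parameters() if p.requires_grad] + list(task_head.parameters())
optimizer = torch.optim.Adam(trainable, lr=1e-3)
```

Because only the patch and head parameters receive gradients, each new downstream task adds a small number of task-specific parameters while the shared pre-trained weights stay untouched.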
Related papers
- Dynamic Adapter Meets Prompt Tuning: Parameter-Efficient Transfer Learning for Point Cloud Analysis [51.14136878142034]
Point cloud analysis has achieved outstanding performance by transferring point cloud pre-trained models.
Existing methods for model adaptation usually update all model parameters, which is inefficient because of the high computational cost involved.
In this paper, we aim to study parameter-efficient transfer learning for point cloud analysis with an ideal trade-off between task performance and parameter efficiency.
arXiv Detail & Related papers (2024-03-03T08:25:04Z)
- Prototype-based HyperAdapter for Sample-Efficient Multi-task Tuning [30.251155072822055]
Prototype-based HyperAdapter (PHA) is a novel framework built on adapter tuning and hypernetworks.
It introduces an instance-dense retriever and prototypical hypernetwork to generate conditional modules in a sample-efficient manner.
We show that PHA strikes a better trade-off between trainable parameters, accuracy on stream tasks, and sample efficiency.
arXiv Detail & Related papers (2023-10-18T02:42:17Z)
- $\Delta$-Patching: A Framework for Rapid Adaptation of Pre-trained Convolutional Networks without Base Performance Loss [71.46601663956521]
Models pre-trained on large-scale datasets are often fine-tuned to support newer tasks and datasets that arrive over time.
We propose $\Delta$-Patching for fine-tuning neural network models in an efficient manner, without the need to store model copies.
Our experiments show that $\Delta$-Networks outperform earlier model patching work while only requiring a fraction of parameters to be trained.
arXiv Detail & Related papers (2023-03-26T16:39:44Z)
- Scalable Weight Reparametrization for Efficient Transfer Learning [10.265713480189486]
Efficient transfer learning involves utilizing a pre-trained model trained on a larger dataset and repurposing it for downstream tasks.
Previous works have increased the number of updated parameters and task-specific modules, resulting in more computation, especially for tiny models.
We suggest learning a policy network that can decide where to reparametrize the pre-trained model, while adhering to a given constraint for the number of updated parameters.
arXiv Detail & Related papers (2023-02-26T23:19:11Z)
- Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
arXiv Detail & Related papers (2022-09-26T17:59:58Z)
- Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning [81.3514358542452]
Few-shot in-context learning (ICL) incurs substantial computational, memory, and storage costs because it involves processing all of the training examples every time a prediction is made.
Parameter-efficient fine-tuning offers an alternative paradigm in which a small set of parameters is trained to enable a model to perform the new task.
In this paper, we rigorously compare few-shot ICL and parameter-efficient fine-tuning and demonstrate that the latter offers better accuracy as well as dramatically lower computational costs.
arXiv Detail & Related papers (2022-05-11T17:10:41Z)
- Training Neural Networks with Fixed Sparse Masks [19.58969772430058]
Recent work has shown that it is possible to update only a small subset of the model's parameters during training.
We show that it is possible to induce a fixed sparse mask on the model's parameters that selects a subset to update over many iterations (a minimal sketch of this masking idea appears after this list).
arXiv Detail & Related papers (2021-11-18T18:06:01Z)
- Pre-Trained Models for Heterogeneous Information Networks [57.78194356302626]
We propose a self-supervised pre-training and fine-tuning framework, PF-HIN, to capture the features of a heterogeneous information network.
PF-HIN consistently and significantly outperforms state-of-the-art alternatives on each of its downstream tasks, across four datasets.
arXiv Detail & Related papers (2020-07-07T03:36:28Z)
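As referenced in the Fixed Sparse Masks entry above, here is a minimal sketch of the masking idea: choose a small subset of parameters once, then zero the gradients of all remaining parameters on every update so only the selected entries change. Scoring parameters by gradient magnitude on a single toy batch is a simplifying assumption made here for brevity; the paper itself scores parameters with an approximation of Fisher information.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 2))
loss_fn = nn.CrossEntropyLoss()

# Score parameters on one toy batch and keep the top 5% of entries per tensor.
x, y = torch.randn(16, 32), torch.randint(0, 2, (16,))
loss_fn(model(x), y).backward()
masks = {}
for name, p in model.named_parameters():
    k = max(1, int(0.05 * p.numel()))
    threshold = p.grad.abs().flatten().topk(k).values.min()
    masks[name] = (p.grad.abs() >= threshold).float()
model.zero_grad()

# During fine-tuning, apply the same fixed mask to every gradient before stepping,
# so only the selected parameter entries are ever updated.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
for _ in range(3):  # a few toy update steps
    loss_fn(model(x), y).backward()
    for name, p in model.named_parameters():
        p.grad.mul_(masks[name])
    optimizer.step()
    optimizer.zero_grad()
```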