Pre-Trained Models for Heterogeneous Information Networks
- URL: http://arxiv.org/abs/2007.03184v2
- Date: Tue, 18 May 2021 09:53:57 GMT
- Title: Pre-Trained Models for Heterogeneous Information Networks
- Authors: Yang Fang, Xiang Zhao, Yifan Chen, Weidong Xiao, Maarten de Rijke
- Abstract summary: We propose a self-supervised pre-training and fine-tuning framework, PF-HIN, to capture the features of a heterogeneous information network.
PF-HIN consistently and significantly outperforms state-of-the-art alternatives on each of these tasks, on four datasets.
- Score: 57.78194356302626
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In network representation learning we learn how to represent heterogeneous
information networks in a low-dimensional space so as to facilitate effective
search, classification, and prediction. Previous network
representation learning methods typically require sufficient task-specific
labeled data to address domain-specific problems. The trained model usually
cannot be transferred to out-of-domain datasets. We propose a self-supervised
pre-training and fine-tuning framework, PF-HIN, to capture the features of a
heterogeneous information network. Unlike traditional network representation
learning models that have to train the entire model all over again for every
downstream task and dataset, PF-HIN only needs to fine-tune the model and a
small number of extra task-specific parameters, thus improving model efficiency
and effectiveness. During pre-training, we first transform the neighborhood of
a given node into a sequence. PF-HIN is pre-trained based on two
self-supervised tasks, masked node modeling and adjacent node prediction. We
adopt deep bi-directional transformer encoders to train the model, and leverage
factorized embedding parameterization and cross-layer parameter sharing to
reduce the number of parameters. In the fine-tuning stage, we choose four benchmark
downstream tasks, i.e., link prediction, similarity search, node
classification, and node clustering. PF-HIN consistently and significantly
outperforms state-of-the-art alternatives on each of these tasks, on four
datasets.
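To make the pre-training recipe concrete, here is a minimal PyTorch-style sketch of the two self-supervised objectives together with factorized embedding parameterization and cross-layer parameter sharing. All sizes, the masking ratio, and the head designs are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn

class FactorizedEmbedding(nn.Module):
    """Factorized embedding parameterization: a small V x E lookup
    followed by an E x H projection, instead of a full V x H table."""
    def __init__(self, num_nodes, embed_dim, hidden_dim):
        super().__init__()
        self.lookup = nn.Embedding(num_nodes, embed_dim)
        self.project = nn.Linear(embed_dim, hidden_dim)

    def forward(self, node_ids):
        return self.project(self.lookup(node_ids))

class SharedEncoder(nn.Module):
    """Cross-layer parameter sharing: one bi-directional transformer
    encoder layer reused at every depth."""
    def __init__(self, hidden_dim, num_heads, num_layers):
        super().__init__()
        self.layer = nn.TransformerEncoderLayer(
            d_model=hidden_dim, nhead=num_heads, batch_first=True)
        self.num_layers = num_layers

    def forward(self, x):
        for _ in range(self.num_layers):   # same weights at every depth
            x = self.layer(x)
        return x

class PretrainHeads(nn.Module):
    """One head per self-supervised task."""
    def __init__(self, hidden_dim, num_nodes):
        super().__init__()
        self.mnm = nn.Linear(hidden_dim, num_nodes)  # masked node modeling
        self.anp = nn.Linear(hidden_dim, 2)          # adjacent node prediction

    def forward(self, h):
        # position 0 plays a [CLS]-like role for the adjacency decision
        return self.mnm(h), self.anp(h[:, 0])

# Toy usage: a "sequence" is a node's neighbourhood flattened to node ids.
num_nodes, MASK_ID = 10_000, 0                    # id 0 reserved for [MASK]
embed = FactorizedEmbedding(num_nodes, embed_dim=128, hidden_dim=256)
encoder = SharedEncoder(hidden_dim=256, num_heads=4, num_layers=6)
heads = PretrainHeads(hidden_dim=256, num_nodes=num_nodes)

seq = torch.randint(1, num_nodes, (8, 32))        # batch of neighbourhood sequences
mask = torch.rand(seq.shape) < 0.15               # 15% masking ratio (assumption)
masked = seq.clone()
masked[mask] = MASK_ID

mnm_logits, anp_logits = heads(encoder(embed(masked)))
mnm_loss = nn.functional.cross_entropy(mnm_logits[mask], seq[mask])
anp_loss = nn.functional.cross_entropy(            # random labels stand in for
    anp_logits, torch.randint(0, 2, (8,)))         # real adjacent/non-adjacent pairs
loss = mnm_loss + anp_loss
```

Reusing one encoder layer across all depths and factorizing the embedding table are the two levers that keep the parameter count down; the two losses are simply summed here, which is an assumption about how the objectives are combined.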
Related papers
- Distributed Learning over Networks with Graph-Attention-Based Personalization [49.90052709285814]
We propose a graph-based personalized algorithm (GATTA) for distributed deep learning.
In particular, the personalized model in each agent is composed of a global part and a node-specific part.
By treating each agent as one node in a graph and the node-specific parameters as its features, the benefits of the graph attention mechanism can be inherited.
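As a rough illustration of the global/node-specific split described above, the following hedged sketch gives each agent a shared backbone plus a personal head, and mixes neighbours' node-specific parameters with attention weights computed from those parameters treated as node features. The dimensions, the dot-product attention form, and the 0.5/0.5 mixing are assumptions, not details from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn.utils import parameters_to_vector, vector_to_parameters

class Agent(nn.Module):
    """Personalized model: a global part shared by all agents plus a
    node-specific part owned by this agent alone."""
    def __init__(self, global_net, feat_dim, out_dim):
        super().__init__()
        self.global_net = global_net
        self.personal = nn.Linear(feat_dim, out_dim)

    def forward(self, x):
        return self.personal(self.global_net(x))

def attention_mix(agent_vec, neighbour_vecs):
    """Attend over neighbours, using flattened node-specific parameters
    as the node features that drive the attention scores."""
    scores = torch.stack([agent_vec @ n for n in neighbour_vecs])
    weights = F.softmax(scores, dim=0)
    return sum(w * n for w, n in zip(weights, neighbour_vecs))

global_net = nn.Sequential(nn.Linear(16, 32), nn.ReLU())  # shared global part
agents = [Agent(global_net, 32, 4) for _ in range(3)]

vecs = [parameters_to_vector(a.personal.parameters()) for a in agents]
mixed = attention_mix(vecs[0], vecs[1:])      # agent 0 aggregates its neighbours
vector_to_parameters(0.5 * vecs[0] + 0.5 * mixed,   # 0.5/0.5 mix is an assumption
                     agents[0].personal.parameters())
```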
arXiv Detail & Related papers (2023-05-22T13:48:30Z)
- Prompt Tuning for Parameter-efficient Medical Image Segmentation [79.09285179181225]
We propose and investigate several contributions to achieve a parameter-efficient but effective adaptation for semantic segmentation on two medical imaging datasets.
We pre-train this architecture with a dedicated dense self-supervision scheme based on assignments to online generated prototypes.
We demonstrate that the resulting neural network model is able to attenuate the gap between fully fine-tuned and parameter-efficiently adapted models.
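The paper's exact adaptation scheme is not spelled out in this summary; as a generic stand-in, the sketch below shows plain prompt tuning: the pre-trained encoder is frozen and only a handful of prompt tokens plus a light per-token head are trained, which is what keeps the adapted parameter count small.

```python
import torch
import torch.nn as nn

class PromptTuned(nn.Module):
    """Freeze a pre-trained encoder; train only prompt tokens + a head."""
    def __init__(self, encoder, hidden_dim, num_prompts, num_classes):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():
            p.requires_grad = False                  # backbone stays frozen
        self.prompts = nn.Parameter(torch.randn(num_prompts, hidden_dim) * 0.02)
        self.head = nn.Linear(hidden_dim, num_classes)

    def forward(self, tokens):                       # tokens: (batch, seq, hidden)
        prompts = self.prompts.unsqueeze(0).expand(tokens.size(0), -1, -1)
        h = self.encoder(torch.cat([prompts, tokens], dim=1))
        return self.head(h[:, self.prompts.size(0):])  # per-token class logits

encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True),
    num_layers=2)
model = PromptTuned(encoder, hidden_dim=64, num_prompts=8, num_classes=5)
logits = model(torch.randn(2, 100, 64))   # e.g. 100 patch tokens per image
trainable = [p for p in model.parameters() if p.requires_grad]  # prompts + head
```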
arXiv Detail & Related papers (2022-11-16T21:55:05Z)
- Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
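As a toy illustration of this pipeline, the sketch below pairs flattened checkpoints with the loss each one achieved and fits a loss-conditioned model to produce parameter vectors. A plain conditional regressor stands in for the paper's actual generative model, so it collapses to a conditional mean rather than producing diverse samples; all data here are random placeholders.

```python
import torch
import torch.nn as nn

param_dim = 2337                              # flattened checkpoint size (toy)
checkpoints = torch.randn(500, param_dim)     # placeholder checkpoint dataset
losses = torch.rand(500, 1)                   # loss each checkpoint achieved

model = nn.Sequential(nn.Linear(1, 256), nn.ReLU(), nn.Linear(256, param_dim))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(1000):
    idx = torch.randint(0, 500, (64,))
    pred = model(losses[idx])                 # condition on the loss "prompt"
    loss = nn.functional.mse_loss(pred, checkpoints[idx])
    opt.zero_grad(); loss.backward(); opt.step()

# Prompt for parameters that should reach a low loss.
params = model(torch.full((1, 1), 0.05)).detach()
```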
arXiv Detail & Related papers (2022-09-26T17:59:58Z)
- Transfer Learning via Test-Time Neural Networks Aggregation [11.42582922543676]
It has been demonstrated that deep neural networks outperform traditional machine learning methods.
However, deep networks lack generalisability: they will not perform as well on a new (testing) set drawn from a different distribution as they do on the data they were trained on.
arXiv Detail & Related papers (2022-06-27T15:46:05Z)
- Transfer Learning with Convolutional Networks for Atmospheric Parameter Retrieval [14.131127382785973]
The Infrared Atmospheric Sounding Interferometer (IASI) on board the MetOp satellite series provides important measurements for Numerical Weather Prediction (NWP).
Retrieving accurate atmospheric parameters from the raw data provided by IASI is a large challenge, but necessary in order to use the data in NWP models.
We show how features extracted from the IASI data by a CNN trained to predict a physical variable can be used as inputs to another statistical method designed to predict a different physical variable at low altitude.
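The recipe above translates into a short two-stage script: train (or load) a CNN for one variable, chop off its head, and feed the penultimate-layer features into a simple statistical model for another variable. The toy 1-D CNN, the spectrum length, and the choice of ordinary least squares below are assumptions for illustration.

```python
import torch
import torch.nn as nn

cnn = nn.Sequential(                   # toy 1-D CNN over an IASI-like spectrum
    nn.Conv1d(1, 16, kernel_size=7), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),
    nn.Linear(16, 1),                  # head predicting variable A
)
# ... assume cnn has been trained to predict variable A ...

spectra = torch.randn(200, 1, 512)     # 200 placeholder spectra
features = nn.Sequential(*list(cnn.children())[:-1])   # drop the head
with torch.no_grad():
    feats = features(spectra)          # (200, 16) transferred features

# Second stage: ordinary least squares predicting variable B at low altitude.
target_b = torch.randn(200, 1)         # placeholder target values
X = torch.cat([feats, torch.ones(200, 1)], dim=1)   # add a bias column
coeffs = torch.linalg.lstsq(X, target_b).solution   # closed-form fit
pred_b = X @ coeffs
```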
arXiv Detail & Related papers (2020-12-09T09:28:42Z)
- Parameter-Efficient Transfer from Sequential Behaviors for User Modeling and Recommendation [111.44445634272235]
In this paper, we develop a parameter-efficient transfer learning architecture, termed PeterRec.
PeterRec allows the pre-trained parameters to remain unaltered during fine-tuning by injecting a series of re-learned neural networks.
We perform extensive experimental ablation to show the effectiveness of the learned user representation in five downstream tasks.
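A hedged sketch of this injection scheme: every pre-trained block is frozen and wrapped with a small, newly learned bottleneck "patch" whose output is added residually, so only the patches and the task head are trainable. The adapter size and placement are illustrative, not taken from the paper.

```python
import torch
import torch.nn as nn

class Patched(nn.Module):
    """Wrap one frozen backbone block with a trainable bottleneck patch."""
    def __init__(self, block, dim, bottleneck=16):
        super().__init__()
        self.block = block
        for p in self.block.parameters():
            p.requires_grad = False    # pre-trained weights stay unaltered
        self.patch = nn.Sequential(
            nn.Linear(dim, bottleneck), nn.ReLU(), nn.Linear(bottleneck, dim))

    def forward(self, x):
        h = self.block(x)
        return h + self.patch(h)       # residual injection

backbone = [nn.Linear(64, 64) for _ in range(4)]    # stand-in pre-trained blocks
model = nn.Sequential(*[Patched(b, 64) for b in backbone],
                      nn.Linear(64, 10))            # new downstream-task head
out = model(torch.randn(8, 64))
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
```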
arXiv Detail & Related papers (2020-01-13T14:09:54Z)
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
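A minimal sketch of layer-wise fusion, with a hard neuron matching (Hungarian algorithm) standing in for the paper's optimal-transport coupling: align the second model's output neurons to the first's, then average. In a multi-layer network the same permutation would also have to be applied to the next layer's input dimension.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def fuse_layer(w_a, w_b):
    """Fuse one layer: match B's output neurons to A's, then average.
    w_a, w_b: (out, in) weight matrices of the same layer in two models."""
    cost = -w_a @ w_b.T                     # negative similarity between neurons
    _, col = linear_sum_assignment(cost)    # hard matching in place of OT
    return 0.5 * (w_a + w_b[col])           # average after aligning B to A

rng = np.random.default_rng(0)
w_a = rng.normal(size=(32, 16))
w_b = rng.normal(size=(32, 16))
fused = fuse_layer(w_a, w_b)                # one-shot fused layer, no retraining
```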
arXiv Detail & Related papers (2019-10-12T22:07:15Z)