RPT: Toward Transferable Model on Heterogeneous Researcher Data via
Pre-Training
- URL: http://arxiv.org/abs/2110.07336v1
- Date: Fri, 8 Oct 2021 03:42:09 GMT
- Title: RPT: Toward Transferable Model on Heterogeneous Researcher Data via
Pre-Training
- Authors: Ziyue Qiao, Yanjie Fu, Pengyang Wang, Meng Xiao, Zhiyuan Ning, Yi Du,
Yuanchun Zhou
- Abstract summary: We propose a multi-task self-supervised learning-based researcher data pre-training model named RPT.
We divide the researchers' data into semantic document sets and a community graph.
We propose three self-supervised learning objectives to train the whole model.
- Score: 19.987304448524043
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the growth of academic search engines, the mining and analysis
of massive researcher data, for applications such as collaborator recommendation
and researcher retrieval, have become indispensable; they can improve the quality
of services and the intelligence of academic engines. Most existing studies on
researcher data mining focus on a single task for a particular application
scenario and learn a task-specific model, which usually cannot be transferred to
out-of-scope tasks. Pre-training provides a generalized, shared model that
captures valuable information from enormous unlabeled data; the model can then
accomplish multiple downstream tasks with only a few fine-tuning steps. In this
paper, we propose a multi-task self-supervised learning-based researcher data
pre-training model named RPT. Specifically, we divide the researchers' data into
semantic document sets and a community graph. We design a hierarchical
Transformer and a local community encoder to capture information from these two
categories of data, respectively. Then, we propose three self-supervised
learning objectives to train the whole model. Finally, we also propose two
transfer modes of RPT for fine-tuning in different scenarios. We conduct
extensive experiments to evaluate RPT; results on three downstream tasks verify
the effectiveness of pre-training for researcher data mining.
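
To make the described pipeline concrete, below is a minimal PyTorch sketch of how the two encoders and a summed multi-task pre-training loss could be wired together. The module names, dimensions, and the three stand-in objectives (text-graph contrastive alignment, co-author link prediction, masked token prediction) are illustrative assumptions, not details taken from the paper.

    # Minimal sketch only: hypothetical modules and objectives, not the authors' code.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class HierarchicalDocEncoder(nn.Module):
        """Token-level then document-level Transformer over a researcher's papers."""
        def __init__(self, dim=64, heads=4):
            super().__init__()
            self.token_encoder = nn.TransformerEncoder(
                nn.TransformerEncoderLayer(dim, heads, batch_first=True), num_layers=1)
            self.doc_encoder = nn.TransformerEncoder(
                nn.TransformerEncoderLayer(dim, heads, batch_first=True), num_layers=1)

        def forward(self, docs):  # docs: (batch, n_docs, n_tokens, dim) token embeddings
            b, d, t, h = docs.shape
            token_out = self.token_encoder(docs.reshape(b * d, t, h))
            doc_vecs = token_out.mean(dim=1).reshape(b, d, h)   # pool tokens -> document vectors
            return self.doc_encoder(doc_vecs).mean(dim=1)       # pool documents -> researcher vector

    class LocalCommunityEncoder(nn.Module):
        """Mean-pools a researcher's one-hop co-author community and projects it."""
        def __init__(self, dim=64):
            super().__init__()
            self.proj = nn.Linear(dim, dim)

        def forward(self, neighbour_embs):  # (batch, n_neighbours, dim)
            return torch.relu(self.proj(neighbour_embs.mean(dim=1)))

    def multi_task_loss(text_repr, graph_repr, link_logits, link_labels,
                        mlm_logits, mlm_labels):
        """Sum of three self-supervised objectives (illustrative stand-ins)."""
        # 1. Contrastive alignment: a researcher's text view should match its graph view.
        sim = text_repr @ graph_repr.t()
        targets = torch.arange(text_repr.size(0), device=text_repr.device)
        contrastive = F.cross_entropy(sim, targets)
        # 2. Link prediction on the community graph (binary co-author edges).
        link = F.binary_cross_entropy_with_logits(link_logits, link_labels.float())
        # 3. Masked token prediction over the semantic document sets.
        mlm = F.cross_entropy(mlm_logits, mlm_labels)
        return contrastive + link + mlm

Under this reading, the hierarchical Transformer pools tokens into document vectors and documents into a researcher vector, the community encoder summarizes the one-hop co-author neighbourhood, and pre-training minimizes the sum of the three losses. The abstract's two transfer modes for fine-tuning are not specified there and are therefore not modeled in this sketch.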
Related papers
- Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration [90.41908331897639]
Large language models (LLMs) have significantly benefited from training on diverse, high-quality task-specific data.
We present a novel approach, ReverseGen, designed to automatically generate effective training samples.
arXiv Detail & Related papers (2024-10-22T06:43:28Z)
- Divide and Conquer: Hybrid Pre-training for Person Search [40.13016375392472]
We propose a hybrid pre-training framework specifically designed for person search using sub-task data only.
Our model can achieve significant improvements across diverse protocols, such as person search method, fine-tuning data, pre-training data and model backbone.
Our code and pre-trained models are released for plug-and-play usage to the person search community.
arXiv Detail & Related papers (2023-12-13T08:33:50Z)
- Zero-shot Retrieval: Augmenting Pre-trained Models with Search Engines [83.65380507372483]
Large pre-trained models can dramatically reduce the amount of task-specific data required to solve a problem, but they often fail to capture domain-specific nuances out of the box.
This paper shows how to leverage recent advances in NLP and multi-modal learning to augment a pre-trained model with search engine retrieval.
arXiv Detail & Related papers (2023-11-29T05:33:28Z)
- A Survey on Generative Modeling with Limited Data, Few Shots, and Zero Shot [33.564516823250806]
In machine learning, generative modeling aims to generate new data statistically similar to the training data distribution.
This is an important topic when data acquisition is challenging, e.g. healthcare applications.
We study interactions between different GM-DC tasks and approaches.
arXiv Detail & Related papers (2023-07-26T12:05:08Z)
- Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning [101.66860222415512]
Multi-Task Diffusion Model (MTDiff) is a diffusion-based method that incorporates Transformer backbones and prompt learning for generative planning and data synthesis.
For generative planning, we find MTDiff outperforms state-of-the-art algorithms across 50 tasks on Meta-World and 8 maps on Maze2D.
arXiv Detail & Related papers (2023-05-29T05:20:38Z)
- Self-Supervised Visual Representation Learning Using Lightweight Architectures [0.0]
In self-supervised learning, a model is trained to solve a pretext task, using a data set whose annotations are created by a machine.
We critically examine the most notable pretext tasks to extract features from image data.
We study the performance of various self-supervised techniques keeping all other parameters uniform.
arXiv Detail & Related papers (2021-10-21T14:13:10Z)
- Unsupervised Domain Adaptive Learning via Synthetic Data for Person Re-identification [101.1886788396803]
Person re-identification (re-ID) has gained more and more attention due to its widespread applications in video surveillance.
Unfortunately, the mainstream deep learning methods still need a large quantity of labeled data to train models.
In this paper, we develop a data collector to automatically generate synthetic re-ID samples in a computer game, and construct a data labeler to simultaneously annotate them.
arXiv Detail & Related papers (2021-09-12T15:51:41Z)
- Wizard of Search Engine: Access to Information Through Conversations with Search Engines [58.53420685514819]
We make efforts to facilitate research on CIS from three aspects.
We formulate a pipeline for CIS with six sub-tasks: intent detection (ID), keyphrase extraction (KE), action prediction (AP), query selection (QS), passage selection (PS), and response generation (RG).
We release a benchmark dataset, called wizard of search engine (WISE), which allows for comprehensive and in-depth research on all aspects of CIS.
arXiv Detail & Related papers (2021-05-18T06:35:36Z)
- Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)