Pre-training Multi-task Contrastive Learning Models for Scientific
Literature Understanding
- URL: http://arxiv.org/abs/2305.14232v2
- Date: Thu, 19 Oct 2023 01:18:49 GMT
- Title: Pre-training Multi-task Contrastive Learning Models for Scientific
Literature Understanding
- Authors: Yu Zhang, Hao Cheng, Zhihong Shen, Xiaodong Liu, Ye-Yi Wang, Jianfeng
Gao
- Abstract summary: Pre-trained language models (LMs) have shown effectiveness in scientific literature understanding tasks.
We propose a multi-task contrastive learning framework, SciMult, to facilitate common knowledge sharing across different literature understanding tasks.
- Score: 52.723297744257536
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Scientific literature understanding tasks have gained significant attention
due to their potential to accelerate scientific discovery. Pre-trained language
models (LMs) have shown effectiveness in these tasks, especially when tuned via
contrastive learning. However, jointly utilizing pre-training data across
multiple heterogeneous tasks (e.g., extreme multi-label paper classification,
citation prediction, and literature search) remains largely unexplored. To
bridge this gap, we propose a multi-task contrastive learning framework,
SciMult, with a focus on facilitating common knowledge sharing across different
scientific literature understanding tasks while preventing task-specific skills
from interfering with each other. To be specific, we explore two techniques --
task-aware specialization and instruction tuning. The former adopts a
Mixture-of-Experts Transformer architecture with task-aware sub-layers; the
latter prepends task-specific instructions to the input text so as to produce
task-aware outputs. Extensive experiments on a comprehensive collection of
benchmark datasets verify the effectiveness of our task-aware specialization
strategy, where we outperform state-of-the-art scientific pre-trained LMs.
Code, datasets, and pre-trained models can be found at
https://scimult.github.io/.
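The abstract names two concrete mechanisms: a Mixture-of-Experts Transformer whose sub-layers are selected per task, and task-specific instructions prepended to the input text, both trained with a contrastive objective. The PyTorch sketch below illustrates these ideas in miniature; the class names, dimensions, instruction strings, and pooling choices are illustrative assumptions, not the released implementation (see https://scimult.github.io/ for that).
```python
# Minimal sketch of the two ideas described in the abstract, NOT the authors'
# released code (see https://scimult.github.io/ for that):
#   (1) task-aware specialization: a Transformer block whose feed-forward
#       sub-layer is routed by a task id (a small per-task Mixture-of-Experts);
#   (2) instruction tuning: prepending a task-specific instruction to the text.
# All names, dimensions, and instruction strings below are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

TASKS = {"classification": 0, "citation_prediction": 1, "search": 2}
INSTRUCTIONS = {  # hypothetical instruction prompts
    "classification": "Classify the following paper: ",
    "citation_prediction": "Predict papers cited by: ",
    "search": "Retrieve papers relevant to: ",
}

def with_instruction(task: str, text: str) -> str:
    """Instruction tuning: make the input task-aware by prepending a prompt."""
    return INSTRUCTIONS[task] + text

class TaskAwareBlock(nn.Module):
    """Transformer block whose feed-forward sub-layer is selected by task id."""
    def __init__(self, d_model: int = 256, n_heads: int = 4, n_tasks: int = len(TASKS)):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        # One feed-forward "expert" per task; the task id routes to its expert.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_tasks)
        )

    def forward(self, x: torch.Tensor, task_id: int) -> torch.Tensor:
        h, _ = self.attn(x, x, x)
        x = self.norm1(x + h)
        x = self.norm2(x + self.experts[task_id](x))  # task-aware sub-layer
        return x

def info_nce(q: torch.Tensor, p: torch.Tensor, temperature: float = 0.05) -> torch.Tensor:
    """In-batch contrastive loss: the i-th positive belongs to the i-th query."""
    q, p = F.normalize(q, dim=-1), F.normalize(p, dim=-1)
    logits = q @ p.T / temperature
    return F.cross_entropy(logits, torch.arange(q.size(0)))

# Toy usage: random tensors stand in for the token embeddings of instruction-
# prefixed queries (e.g. with_instruction("search", "...")) and their positives.
block = TaskAwareBlock()
queries = block(torch.randn(8, 32, 256), TASKS["search"]).mean(dim=1)
positives = block(torch.randn(8, 32, 256), TASKS["search"]).mean(dim=1)
info_nce(queries, positives).backward()
```
Routing by a hard task id rather than a learned gate is one plausible reading of "task-aware sub-layers"; a full model would stack many such blocks on top of a pre-trained LM rather than use a single randomly initialized block as in this toy example.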
Related papers
- Tint Your Models Task-wise for Improved Multi-task Model Merging [17.496018757317824]
We propose Model Tinting, a test-time approach that introduces a single task-specific layer for each task as a trainable adjustment.
Our method jointly trains merging coefficients and task-specific layers, which effectively reduces task conflicts with minimal additional costs.
Our method achieves state-of-the-art performance across both computer vision and natural language processing tasks.
arXiv Detail & Related papers (2024-12-26T07:42:06Z)
- Distribution Matching for Multi-task Learning of Classification Tasks: a Large-Scale Study on Faces & Beyond [62.406687088097605]
Multi-Task Learning (MTL) is a framework where multiple related tasks are learned jointly and benefit from a shared representation space.
We show that MTL can be successful even when the classification tasks have little or no annotation overlap.
We propose a novel approach, where knowledge exchange is enabled between the tasks via distribution matching.
arXiv Detail & Related papers (2024-01-02T14:18:11Z)
- ULTRA-DP: Unifying Graph Pre-training with Multi-task Graph Dual Prompt [67.8934749027315]
We propose a unified framework for graph hybrid pre-training that injects task identification and position identification into GNNs.
We also propose a novel pre-training paradigm based on a group of k-nearest neighbors.
arXiv Detail & Related papers (2023-10-23T12:11:13Z)
- Task Compass: Scaling Multi-task Pre-training with Task Prefix [122.49242976184617]
Existing studies show that multi-task learning with large-scale supervised tasks suffers from negative effects across tasks.
We propose a task prefix guided multi-task pre-training framework to explore the relationships among tasks.
Our model can not only serve as the strong foundation backbone for a wide range of tasks but also be feasible as a probing tool for analyzing task relationships.
arXiv Detail & Related papers (2022-10-12T15:02:04Z)
- Cross-Task Knowledge Distillation in Multi-Task Recommendation [41.62428191434233]
Multi-task learning has been widely used in real-world recommenders to predict different types of user feedback.
We propose a Cross-Task Knowledge Distillation framework in recommendation, which consists of three procedures.
arXiv Detail & Related papers (2022-02-20T16:15:19Z)
- Learning Multiple Dense Prediction Tasks from Partially Annotated Data [41.821234589075445]
We study the joint learning of multiple dense prediction tasks on partially annotated data, which we call multi-task partially-supervised learning.
We propose a multi-task training procedure that leverages task relations to supervise tasks whose annotations are missing.
We rigorously demonstrate that our proposed method effectively exploits the images with unlabelled tasks and outperforms existing semi-supervised learning approaches and related methods on three standard benchmarks.
arXiv Detail & Related papers (2021-11-29T19:03:12Z)
- Distribution Matching for Heterogeneous Multi-Task Learning: a Large-scale Face Study [75.42182503265056]
Multi-Task Learning has emerged as a methodology in which multiple tasks are jointly learned by a shared learning algorithm.
We deal with heterogeneous MTL, simultaneously addressing detection, classification & regression problems.
We build FaceBehaviorNet, the first framework for large-scale face analysis, by jointly learning all facial behavior tasks.
arXiv Detail & Related papers (2021-05-08T22:26:52Z)
- Multi-Task Learning for Dense Prediction Tasks: A Survey [87.66280582034838]
Multi-task learning (MTL) techniques have shown promising results in terms of performance, computation, and memory footprint.
We provide a well-rounded view on state-of-the-art deep learning approaches for MTL in computer vision.
arXiv Detail & Related papers (2020-04-28T09:15:50Z)