Sequential Sentence Classification in Research Papers using Cross-Domain Multi-Task Learning
- URL: http://arxiv.org/abs/2102.06008v1
- Date: Thu, 11 Feb 2021 13:54:10 GMT
- Title: Sequential Sentence Classification in Research Papers using Cross-Domain Multi-Task Learning
- Authors: Arthur Brack and Anett Hoppe and Pascal Buschermöhle and Ralph Ewerth
- Abstract summary: We propose a uniform deep learning architecture and multi-task learning to improve sequential sentence classification in scientific texts across domains.
Our approach outperforms the state of the art on three benchmark datasets.
- Score: 4.2443814047515716
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The task of sequential sentence classification enables the semantic
structuring of research papers. This can enhance academic search engines to
support researchers in finding and exploring research literature more
effectively. However, previous work has not yet investigated the potential of
transfer learning with datasets from different scientific domains for this task.
We propose a uniform deep learning architecture and multi-task learning to
improve sequential sentence classification in scientific texts across domains
by exploiting training data from multiple domains. Our contributions can be
summarised as follows: (1) We tailor two common transfer learning methods,
sequential transfer learning and multi-task learning, and evaluate their
performance for sequential sentence classification; (2) The presented
multi-task model is able to recognise semantically related classes from
different datasets and thus supports manual comparison and assessment of
different annotation schemes; (3) The unified approach is capable of handling
datasets that contain either only abstracts or full papers without further
feature engineering. We demonstrate that models trained on datasets from
different scientific domains benefit from one another when using the proposed
multi-task learning architecture. Our approach outperforms the state
of the art on three benchmark datasets.
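The core idea in the abstract above, a shared encoder feeding dataset-specific classification heads so that training data from multiple domains can be pooled, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the dataset names, dimensions, and label counts are placeholders, and the random linear encoder stands in for the paper's pretrained transformer and recurrent layers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions, not taken from the paper.
EMB_DIM, HID_DIM = 16, 8

# Shared encoder weights: applied to every sentence regardless of which
# dataset (domain) the sentence comes from.
W_shared = rng.normal(size=(EMB_DIM, HID_DIM))

# One output head per dataset/annotation scheme, each with its own label set.
# Dataset names and class counts here are hypothetical placeholders.
heads = {
    "DATASET_A": rng.normal(size=(HID_DIM, 5)),  # e.g. 5 rhetorical classes
    "DATASET_B": rng.normal(size=(HID_DIM, 4)),  # e.g. 4 rhetorical classes
}

def encode(sentence_embs):
    """Shared encoding step, common to all tasks."""
    return np.tanh(sentence_embs @ W_shared)

def classify(sentence_embs, dataset):
    """Route the shared representation through the dataset's own head."""
    logits = encode(sentence_embs) @ heads[dataset]
    return logits.argmax(axis=-1)

# One abstract = a sequence of sentence embeddings; each dataset keeps
# its own label space while sharing the encoder.
abstract = rng.normal(size=(6, EMB_DIM))  # 6 sentences
print(classify(abstract, "DATASET_A").shape)  # (6,)
print(classify(abstract, "DATASET_B").shape)  # (6,)
```

Because only the heads differ per dataset, gradient updates from any domain improve the shared encoder, which is the mechanism by which the datasets "benefit from one another" in the multi-task setup.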
Related papers
- M3: A Multi-Task Mixed-Objective Learning Framework for Open-Domain Multi-Hop Dense Sentence Retrieval [12.277521531556852]
M3 is a novel Multi-hop dense sentence retrieval system built upon a novel Multi-task Mixed-objective approach for dense text representation learning.
Our approach yields state-of-the-art performance on a large-scale open-domain fact verification benchmark dataset, FEVER.
arXiv Detail & Related papers (2024-03-21T01:52:07Z)
- UniDoc: A Universal Large Multimodal Model for Simultaneous Text Detection, Recognition, Spotting and Understanding [93.92313947913831]
We introduce UniDoc, a novel multimodal model equipped with text detection and recognition capabilities.
To the best of our knowledge, this is the first large multimodal model capable of simultaneous text detection, recognition, spotting, and understanding.
arXiv Detail & Related papers (2023-08-19T17:32:34Z)
- Pre-training Multi-task Contrastive Learning Models for Scientific Literature Understanding [52.723297744257536]
Pre-trained language models (LMs) have shown effectiveness in scientific literature understanding tasks.
We propose a multi-task contrastive learning framework, SciMult, to facilitate common knowledge sharing across different literature understanding tasks.
arXiv Detail & Related papers (2023-05-23T16:47:22Z)
- FETA: A Benchmark for Few-Sample Task Transfer in Open-Domain Dialogue [70.65782786401257]
This work explores conversational task transfer by introducing FETA: a benchmark for few-sample task transfer in open-domain dialogue.
FETA contains two underlying sets of conversations upon which there are 10 and 7 tasks annotated, enabling the study of intra-dataset task transfer.
We utilize three popular language models and three learning algorithms to analyze the transferability between 132 source-target task pairs.
arXiv Detail & Related papers (2022-05-12T17:59:00Z)
- An Approach for Combining Multimodal Fusion and Neural Architecture Search Applied to Knowledge Tracing [6.540879944736641]
We propose a sequential model based optimization approach that combines multimodal fusion and neural architecture search within one framework.
We evaluate our methods on two public real datasets showing the discovered model is able to achieve superior performance.
arXiv Detail & Related papers (2021-11-08T13:43:46Z)
- Meta Navigator: Search for a Good Adaptation Policy for Few-shot Learning [113.05118113697111]
Few-shot learning aims to adapt knowledge learned from previous tasks to novel tasks with only a limited amount of labeled data.
Research literature on few-shot learning exhibits great diversity, while different algorithms often excel at different few-shot learning scenarios.
We present Meta Navigator, a framework that attempts to solve the limitation in few-shot learning by seeking a higher-level strategy.
arXiv Detail & Related papers (2021-09-13T07:20:01Z)
- Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos [69.61522804742427]
This paper proposes a self-supervised training framework that learns a common multimodal embedding space.
We extend the concept of instance-level contrastive learning with a multimodal clustering step to capture semantic similarities across modalities.
The resulting embedding space enables retrieval of samples across all modalities, even from unseen datasets and different domains.
arXiv Detail & Related papers (2021-04-26T15:55:01Z)
- Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)
- Vision-Based Layout Detection from Scientific Literature using Recurrent Convolutional Neural Networks [12.221478896815292]
We present an approach for adapting convolutional neural networks for object recognition and classification to scientific literature layout detection (SLLD).
SLLD is a shared subtask of several information extraction problems.
Our results show good improvement with fine-tuning of a pre-trained base network.
arXiv Detail & Related papers (2020-10-18T23:50:28Z)
- Two Huge Title and Keyword Generation Corpora of Research Articles [0.0]
We introduce two huge datasets for text summarization (OAGSX) and keyword generation (OAGKX) research.
The data were retrieved from the Open Academic Graph, a network of research profiles and publications.
We would like to apply topic modeling on the two sets to derive subsets of research articles from more specific disciplines.
arXiv Detail & Related papers (2020-02-11T21:17:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.