Vision-Based Layout Detection from Scientific Literature using Recurrent
Convolutional Neural Networks
- URL: http://arxiv.org/abs/2010.11727v1
- Date: Sun, 18 Oct 2020 23:50:28 GMT
- Title: Vision-Based Layout Detection from Scientific Literature using Recurrent
Convolutional Neural Networks
- Authors: Huichen Yang, William H. Hsu
- Abstract summary: We present an approach for adapting convolutional neural networks for object recognition and classification to scientific literature layout detection (SLLD).
SLLD is a shared subtask of several information extraction problems.
Our results show good improvement with fine-tuning of a pre-trained base network.
- Score: 12.221478896815292
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present an approach for adapting convolutional neural networks for object
recognition and classification to scientific literature layout detection
(SLLD), a shared subtask of several information extraction problems. Scientific
publications contain multiple types of information sought by researchers in
various disciplines, organized into an abstract, bibliography, and sections
documenting related work, experimental methods, and results; however, there is
no effective way to extract this information due to their diverse layout. In
this paper, we present a novel approach to developing an end-to-end learning
framework to segment and classify major regions of a scientific document. We
consider scientific document layout analysis as an object detection task over
digital images, without any additional text features that need to be added into
the network during the training process. Our technical objective is to
implement transfer learning via fine-tuning of pre-trained networks and thereby
demonstrate that this deep learning architecture is suitable for tasks that
lack very large document corpora for training ab initio. As part of the
experimental test bed for empirical evaluation of this approach, we created a
merged multi-corpus data set for scientific publication layout detection tasks.
Our results show good improvement with fine-tuning of a pre-trained base
network using this merged data set, compared to the baseline convolutional
neural network architecture.
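To make the object-detection framing concrete, here is a minimal sketch in plain Python of how layout regions might be represented as labeled bounding boxes and compared against ground truth. The abstract does not state the label set, the box format, or the evaluation metric, so the `Region` class, the label strings, intersection over union (IoU) as the overlap measure, and the 0.5 threshold are all assumptions made for illustration, not the authors' method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A hypothetical page region: a class label plus an axis-aligned box."""
    label: str   # e.g. "abstract" or "bibliography" (illustrative labels)
    x0: float
    y0: float
    x1: float
    y1: float

    def area(self) -> float:
        # Degenerate or inverted boxes count as zero area.
        return max(0.0, self.x1 - self.x0) * max(0.0, self.y1 - self.y0)


def iou(a: Region, b: Region) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix0, iy0 = max(a.x0, b.x0), max(a.y0, b.y0)
    ix1, iy1 = min(a.x1, b.x1), min(a.y1, b.y1)
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    union = a.area() + b.area() - inter
    return inter / union if union > 0 else 0.0


def count_hits(predicted, truth, threshold=0.5):
    """Count predictions that match a same-label ground-truth box.

    A prediction is a hit when its best unmatched ground-truth box with the
    same label overlaps it with IoU >= threshold (assumed value). Each
    ground-truth box can be matched at most once.
    """
    used = set()
    hits = 0
    for p in predicted:
        best_j, best = None, threshold
        for j, t in enumerate(truth):
            if j in used or t.label != p.label:
                continue
            score = iou(p, t)
            if score >= best:
                best_j, best = j, score
        if best_j is not None:
            used.add(best_j)
            hits += 1
    return hits
```

Under this framing the detector only ever sees page images and box annotations, which is consistent with the abstract's statement that no additional text features are fed into the network during training.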
Related papers
- A Sentiment Analysis of Medical Text Based on Deep Learning [1.8130068086063336]
This paper focuses on the medical domain, using bidirectional encoder representations from transformers (BERT) as the basic pre-trained model.
Experiments and analyses were conducted on the METS-CoV dataset to explore the training performance after integrating different deep learning networks.
CNN models outperform other networks when trained on smaller medical text datasets in combination with pre-trained models like BERT.
Date: 2024-04-16T12:20:49Z
- Data Augmentations in Deep Weight Spaces [89.45272760013928]
We introduce a novel augmentation scheme based on the Mixup method.
We evaluate the performance of these techniques on existing benchmarks as well as new benchmarks we generate.
Date: 2023-11-15T10:43:13Z
- Pre-training Multi-task Contrastive Learning Models for Scientific Literature Understanding [52.723297744257536]
Pre-trained language models (LMs) have shown effectiveness in scientific literature understanding tasks.
We propose a multi-task contrastive learning framework, SciMult, to facilitate common knowledge sharing across different literature understanding tasks.
Date: 2023-05-23T16:47:22Z
- Human-in-the-Loop Disinformation Detection: Stance, Sentiment, or Something Else? [93.91375268580806]
Both politics and pandemics have recently provided ample motivation for the development of machine learning-enabled disinformation (a.k.a. fake news) detection algorithms.
Existing literature has focused primarily on the fully-automated case, but the resulting techniques cannot reliably detect disinformation on the varied topics, sources, and time scales required for military applications.
By leveraging an already-available analyst as a human-in-the-loop, canonical machine learning techniques of sentiment analysis, aspect-based sentiment analysis, and stance detection become plausible methods to use for a partially-automated disinformation detection system.
Date: 2021-11-09T13:30:34Z
- Exploiting the relationship between visual and textual features in social networks for image classification with zero-shot deep learning [0.0]
In this work, we propose a classifier ensemble based on the transferable learning capabilities of the CLIP neural network architecture.
Our experiments, based on image classification tasks according to the labels of the Places dataset, are performed by first considering only the visual part.
Considering the texts associated with the images can help improve accuracy, depending on the goal.
Date: 2021-07-08T10:54:59Z
- Anomaly Detection on Attributed Networks via Contrastive Self-Supervised Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks.
Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair.
A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
Date: 2021-02-27T03:17:20Z
- Learning to Segment Human Body Parts with Synthetically Trained Deep Convolutional Networks [58.0240970093372]
This paper presents a new framework for human body part segmentation based on Deep Convolutional Neural Networks trained using only synthetic data.
The proposed approach achieves cutting-edge results without needing to train the models on real annotated data of human body parts.
Date: 2021-02-02T12:26:50Z
- NAS-Navigator: Visual Steering for Explainable One-Shot Deep Neural Network Synthesis [53.106414896248246]
We present a framework that allows analysts to effectively build the solution sub-graph space and guide the network search by injecting their domain knowledge.
Applying this technique in an iterative manner allows analysts to converge to the best performing neural network architecture for a given application.
Date: 2020-09-28T01:48:45Z
- Multi-Subspace Neural Network for Image Recognition [33.61205842747625]
In image classification tasks, feature extraction is always a major issue. Intra-class variability increases the difficulty of designing the extractors.
Recently, deep learning has drawn a lot of attention for automatically learning features from data.
In this study, we propose a multi-subspace neural network (MSNN), which integrates a key component of the convolutional neural network (CNN), the receptive field, with the subspace concept.
Date: 2020-06-17T02:55:34Z
This list is automatically generated from the titles and abstracts of the papers on this site.