Large Scale Subject Category Classification of Scholarly Papers with
Deep Attentive Neural Networks
- URL: http://arxiv.org/abs/2007.13826v1
- Date: Mon, 27 Jul 2020 19:42:42 GMT
- Title: Large Scale Subject Category Classification of Scholarly Papers with
Deep Attentive Neural Networks
- Authors: Bharath Kandimalla, Shaurya Rohatgi, Jian Wu and C Lee Giles
- Abstract summary: We propose a deep attentive neural network (DANN) that classifies scholarly papers using only their abstracts.
The proposed network consists of two bi-directional recurrent neural networks followed by an attention layer.
Our best model achieves micro-F1 measure of 0.76 with F1 of individual subject categories ranging from 0.50-0.95.
- Score: 15.241086410108512
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Subject categories of scholarly papers generally refer to the knowledge
domain(s) to which the papers belong, examples being computer science or
physics. Subject category information can be used for building faceted search
for digital library search engines. This can significantly assist users in
narrowing down their search space of relevant documents. Unfortunately, many
academic papers do not have such information as part of their metadata.
Existing methods for solving this task usually focus on unsupervised learning
that often relies on citation networks. However, a complete list of papers
citing the current paper may not be readily available. In particular, new
papers that have few or no citations cannot be classified using such methods.
Here, we propose a deep attentive neural network (DANN) that classifies
scholarly papers using only their abstracts. The network is trained using 9
million abstracts from Web of Science (WoS). We also use the WoS schema that
covers 104 subject categories. The proposed network consists of two
bi-directional recurrent neural networks followed by an attention layer. We
compare our model against baselines by varying the architecture and text
representation. Our best model achieves a micro-F1 measure of 0.76, with F1 of
individual subject categories ranging from 0.50 to 0.95. The results show the
importance of retraining word embedding models to maximize the vocabulary
overlap and the effectiveness of the attention mechanism. The combination of
word vectors with TF-IDF outperforms character- and sentence-level embedding
models. We discuss imbalanced samples and overlapping categories and suggest
possible strategies for mitigation. We also determine the subject category
distribution in CiteSeerX by classifying a random sample of one million
academic papers.
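The abstract reports one micro-F1 value (0.76) alongside a wide range of per-category F1 values (0.50 to 0.95). The sketch below is not the authors' evaluation code; it only illustrates how the two measures are typically computed and why they can diverge when categories are imbalanced. It assumes each paper carries exactly one gold category and one predicted category, and the function name `f1_scores` and the toy labels are hypothetical.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (per_category_f1, micro_f1) for single-label predictions.

    Assumption: one gold label and one predicted label per paper.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted category gains a false positive
            fn[truth] += 1  # gold category gains a false negative

    def f1(t, p, n):
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no counts
        denom = 2 * t + p + n
        return 2 * t / denom if denom else 0.0

    labels = sorted(set(y_true) | set(y_pred))
    per_category = {c: f1(tp[c], fp[c], fn[c]) for c in labels}
    # Micro-averaging pools the counts over all categories before computing F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return per_category, micro


# Toy, made-up data: a frequent category and a rare one.
y_true = ["physics"] * 8 + ["computer science"] * 2
y_pred = ["physics"] * 8 + ["physics", "computer science"]
per_category, micro = f1_scores(y_true, y_pred)
print(per_category, micro)
```

In this toy case the pooled micro-F1 is dominated by the frequent category, while the rare category scores noticeably lower on its own, which is the kind of gap that imbalanced samples can produce.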
Related papers
- Interactive Distillation of Large Single-Topic Corpora of Scientific Papers [1.2954493726326113]
A more robust but time-consuming approach is to build the dataset constructively in which a subject matter expert handpicks documents.
Here we showcase a new tool, based on machine learning, for constructively generating targeted datasets of scientific literature.
(2023-09-19T17:18:36Z)
- Unsupervised Sentiment Analysis of Plastic Surgery Social Media Posts [91.3755431537592]
The massive collection of user posts across social media platforms is primarily untapped for artificial intelligence (AI) use cases.
Natural language processing (NLP) is a subfield of AI that leverages bodies of documents, known as corpora, to train computers in human-like language understanding.
This study demonstrates that the applied results of unsupervised analysis allow a computer to predict either negative, positive, or neutral user sentiment towards plastic surgery.
(2023-07-05T20:16:20Z)
- Weakly Supervised Multi-Label Classification of Full-Text Scientific Papers [29.295941972777978]
We propose FUTEX, a framework that uses the cross-paper network structure and the in-paper hierarchy structure to classify full-text scientific papers under weak supervision.
A network-aware contrastive fine-tuning module and a hierarchy-aware aggregation module are designed to leverage the two types of structural signals.
(2023-06-24T15:27:55Z)
- Multi-task recommendation system for scientific papers with high-way networks [1.5229257192293197]
We present a multi-task recommendation system (RS) that predicts a paper recommendation and generates its meta-data such as keywords.
The motivation behind this approach is that the paper's topics expressed as keywords are a useful predictor of preferences of researchers.
Our application uses Highway networks to train a very deep system and combines the benefits of RNNs and CNNs to find the most important factors and build a latent representation.
(2022-04-21T07:40:47Z)
- Enhancing Scientific Papers Summarization with Citation Graph [78.65955304229863]
We redefine the task of scientific papers summarization by utilizing their citation graph.
We construct a novel scientific papers summarization dataset Semantic Scholar Network (SSN) which contains 141K research papers in different domains.
Our model can achieve competitive performance when compared with the pretrained models.
(2021-04-07T11:13:35Z)
- Minimally-Supervised Structure-Rich Text Categorization via Learning on Text-Rich Networks [61.23408995934415]
We propose a novel framework for minimally supervised categorization by learning from the text-rich network.
Specifically, we jointly train two modules with different inductive biases -- a text analysis module for text understanding and a network learning module for class-discriminative, scalable network learning.
Our experiments show that given only three seed documents per category, our framework can achieve an accuracy of about 92%.
(2021-02-23T04:14:34Z)
- Be More with Less: Hypergraph Attention Networks for Inductive Text Classification [56.98218530073927]
Graph neural networks (GNNs) have received increasing attention in the research community and demonstrated their promising results on this canonical task.
Despite the success, their performance could be largely jeopardized in practice since they are unable to capture high-order interaction between words.
We propose a principled model -- hypergraph attention networks (HyperGAT) which can obtain more expressive power with less computational consumption for text representation learning.
(2020-11-01T00:21:59Z)
- Pairwise Learning for Name Disambiguation in Large-Scale Heterogeneous Academic Networks [81.00481125272098]
We introduce Multi-view Attention-based Pairwise Recurrent Neural Network (MA-PairRNN) to solve the name disambiguation problem.
MA-PairRNN combines heterogeneous graph embedding learning and pairwise similarity learning into a framework.
Results on two real-world datasets demonstrate that our framework has a significant and consistent improvement of performance on the name disambiguation task.
(2020-08-30T06:08:20Z)
- Graph Prototypical Networks for Few-shot Learning on Attributed Networks [72.31180045017835]
We propose a graph meta-learning framework -- Graph Prototypical Networks (GPN).
GPN is able to perform meta-learning on an attributed network and derive a highly generalizable model for handling the target classification task.
(2020-06-23T04:13:23Z)
- Segmenting Scientific Abstracts into Discourse Categories: A Deep Learning-Based Approach for Sparse Labeled Data [8.635930195821265]
We train a deep neural network on structured abstracts from PubMed to fine-tune it on a small hand-labeled corpus of computer science papers.
Our method appears to be a promising solution to the automatic segmentation of abstracts, where the data is sparse.
(2020-05-11T20:21:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.