Using Transformer based Ensemble Learning to classify Scientific
Articles
- URL: http://arxiv.org/abs/2102.09991v1
- Date: Fri, 19 Feb 2021 15:42:26 GMT
- Title: Using Transformer based Ensemble Learning to classify Scientific
Articles
- Authors: Sohom Ghosh and Ankush Chopra
- Abstract summary: The system comprises four independent sub-systems, each capable of classifying abstracts of scientific literature into one of seven given classes.
We ensemble the predictions of these four sub-systems using majority voting to develop the final system, which achieves an F1 score of 0.93 on both the test and validation sets.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Often, reviewers fail to appreciate a researcher's novel ideas and
provide generic feedback. Thus, proper assignment of reviewers based on their
area of expertise is necessary. Moreover, reading each and every paper from
end-to-end for assigning it to a reviewer is a tedious task. In this paper, we
describe a system which our team FideLIPI submitted in the shared task of
SDPRA-2021 [14]. It comprises four independent sub-systems capable of
classifying abstracts of scientific literature to one of the given seven
classes. The first one is a RoBERTa [10] based model built over these
abstracts. Adding topic models / Latent dirichlet allocation (LDA) [2] based
features to the first model results in the second sub-system. The third one is
a sentence level RoBERTa [10] model. The fourth one is a Logistic Regression
model built using Term Frequency Inverse Document Frequency (TF-IDF) features.
We ensemble the predictions of these four sub-systems using majority voting to
develop the final system, which achieves an F1 score of 0.93 on both the test and
validation sets. This outperforms the existing state-of-the-art (SOTA) model
SciBERT [1] in terms of F1 score on the validation set. Our codebase is
available at https://github.com/SDPRA-2021/shared-task/tree/main/FideLIPI
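The ensembling step described in the abstract can be sketched as a simple majority vote over the four sub-systems' per-abstract predictions. This is a minimal stdlib sketch under stated assumptions: the label names and the tie-breaking rule (first label to reach the top count wins) are illustrative, not the paper's exact implementation.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the most common label among the sub-system predictions.

    predictions: list of labels, one per sub-system, for a single abstract.
    Ties are broken in favor of the label that first reaches the top count.
    """
    return Counter(predictions).most_common(1)[0][0]

# Example: four hypothetical sub-systems vote on the class of one abstract.
subsystem_preds = ["cs.CL", "cs.CL", "cs.LG", "cs.CL"]
final_label = majority_vote(subsystem_preds)  # -> "cs.CL"
```

With four voters a 2-2 tie is possible; a production system would need an explicit tie-breaking policy (e.g. deferring to the strongest single sub-system), which the abstract does not specify.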
Related papers
- News Summarization and Evaluation in the Era of GPT-3 [73.48220043216087]
We study how GPT-3 compares against fine-tuned models trained on large summarization datasets.
We show that not only do humans overwhelmingly prefer GPT-3 summaries, prompted using only a task description, but these also do not suffer from common dataset-specific issues such as poor factuality.
arXiv Detail & Related papers (2022-09-26T01:04:52Z) - Detecting Generated Scientific Papers using an Ensemble of Transformer
Models [4.56877715768796]
The paper describes neural models developed for the DAGPap22 shared task hosted at the Third Workshop on Scholarly Document Processing.
Our work focuses on comparing different transformer-based models as well as using additional datasets and techniques to deal with imbalanced classes.
arXiv Detail & Related papers (2022-09-17T08:43:25Z) - Using contextual sentence analysis models to recognize ESG concepts [8.905370601886112]
This paper summarizes the joint participation of the Trading Central Labs and the L3i laboratory of the University of La Rochelle on two sub-tasks of the FinSim-4 evaluation campaign.
The first sub-task aims to enrich the 'Fortia ESG taxonomy' with new lexicon entries, while the second aims to classify sentences as either 'sustainable' or 'unsustainable' with respect to ESG-related factors.
arXiv Detail & Related papers (2022-07-04T13:33:21Z) - SimCPSR: Simple Contrastive Learning for Paper Submission Recommendation
System [0.0]
This study proposes a transformer-based model using transfer learning as an efficient approach for the paper submission recommendation system.
By combining essential information (such as the title, the abstract, and the list of keywords) with the aims and scopes of journals, the model can recommend the top-K journals that maximize the likelihood of the paper being accepted.
arXiv Detail & Related papers (2022-05-12T08:08:22Z) - Unifying Language Learning Paradigms [96.35981503087567]
We present a unified framework for pre-training models that are universally effective across datasets and setups.
We show how different pre-training objectives can be cast as one another and how interpolating between different objectives can be effective.
Our model also achieves strong results at in-context learning, outperforming 175B GPT-3 on zero-shot SuperGLUE and tripling the performance of T5-XXL on one-shot summarization.
arXiv Detail & Related papers (2022-05-10T19:32:20Z) - Joint Models for Answer Verification in Question Answering Systems [85.93456768689404]
We build a three-way multi-classifier, which decides if an answer supports, refutes, or is neutral with respect to another one.
We tested our models on WikiQA, TREC-QA, and a real-world dataset.
arXiv Detail & Related papers (2021-07-09T05:34:36Z) - No Fear of Heterogeneity: Classifier Calibration for Federated Learning
with Non-IID Data [78.69828864672978]
A central challenge in training classification models in the real-world federated system is learning with non-IID data.
We propose a novel and simple algorithm called Classifier Calibration with Virtual Representations (CCVR), which adjusts the classifier using virtual representations sampled from an approximated Gaussian mixture model.
Experimental results demonstrate that CCVR achieves state-of-the-art performance on popular federated learning benchmarks including CIFAR-10, CIFAR-100, and CINIC-10.
arXiv Detail & Related papers (2021-06-09T12:02:29Z) - UIUC_BioNLP at SemEval-2021 Task 11: A Cascade of Neural Models for
Structuring Scholarly NLP Contributions [1.5942130010323128]
We propose a cascade of neural models that performs sentence classification, phrase recognition, and triple extraction.
A BERT-CRF model was used to recognize and characterize relevant phrases in contribution sentences.
Our system was officially ranked second in Phase 1 evaluation and first in both parts of Phase 2 evaluation.
arXiv Detail & Related papers (2021-05-12T05:24:35Z) - KnowGraph@IITK at SemEval-2021 Task 11: Building KnowledgeGraph for NLP
Research [2.1012672709024294]
We develop a system for a research paper contributions-focused knowledge graph over Natural Language Processing literature.
The proposed system is agnostic to the subject domain and can be applied for building a knowledge graph for any area.
Our system achieved F1 scores of 0.38, 0.63, and 0.76 in end-to-end pipeline testing, phrase extraction testing, and triplet extraction testing, respectively.
arXiv Detail & Related papers (2021-04-04T14:33:21Z) - Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z) - Evaluation Toolkit For Robustness Testing Of Automatic Essay Scoring
Systems [64.4896118325552]
We evaluate the current state-of-the-art AES models using a model adversarial evaluation scheme and associated metrics.
We find that AES models are highly overstable. Even heavy modifications (as much as 25%) with content unrelated to the topic of the questions do not decrease the score produced by the models.
arXiv Detail & Related papers (2020-07-14T03:49:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences.