Automatic evaluation of scientific abstracts through natural language
processing
- URL: http://arxiv.org/abs/2112.01842v1
- Date: Sun, 14 Nov 2021 12:55:29 GMT
- Title: Automatic evaluation of scientific abstracts through natural language
processing
- Authors: Lucas G. O. Lopes, Thales M. A. Vieira, and William W. M. Lira
- Abstract summary: This paper proposes natural language processing algorithms to classify, segment and evaluate scientific work.
The proposed framework categorizes the abstract texts according to the problems they intend to solve by employing a text classification approach.
The methodology of the abstract is ranked based on the sentiment analysis of its results.
- Score: 0.0
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: This work presents a framework to classify and evaluate distinct research
abstract texts which are focused on the description of processes and their
applications. In this context, this paper proposes natural language processing
algorithms to classify, segment and evaluate the results of scientific work.
Initially, the proposed framework categorizes the abstract texts according
to the problems intended to be solved by employing a text classification
approach. Then, the abstract text is segmented into problem description,
methodology and results. Finally, the methodology of the abstract is ranked
based on the sentiment analysis of its results. The proposed framework allows
us to quickly rank the best methods to solve specific problems. To validate the
proposed framework, it was tested on abstracts about oil production anomalies,
with promising results.
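
The three stages described in the abstract (classify, segment, rank by the sentiment of the results) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not name the classifier, the segmentation rule, or the sentiment model, so the keyword sets, the positional sentence split, and the word lexicon below are all hypothetical stand-ins chosen only to keep the example self-contained.

```python
import re

# Hypothetical keyword sets and lexicon; the paper does not specify these.
TOPIC_KEYWORDS = {
    "anomaly_detection": {"anomaly", "anomalies", "fault", "failure"},
    "forecasting": {"forecast", "forecasting", "prediction"},
}
POSITIVE = {"promising", "accurate", "improved", "robust", "efficient"}
NEGATIVE = {"poor", "failed", "unstable", "limited", "inaccurate"}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def classify(abstract):
    """Pick the topic whose keywords occur most often (stand-in for a trained classifier)."""
    tokens = tokenize(abstract)
    counts = {
        topic: sum(tok in keywords for tok in tokens)
        for topic, keywords in TOPIC_KEYWORDS.items()
    }
    return max(counts, key=counts.get)


def segment(abstract):
    """Naive positional split: first sentence is the problem, last is the results."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", abstract.strip()) if s]
    if len(sentences) < 3:
        raise ValueError("need at least three sentences to segment")
    return {
        "problem": sentences[0],
        "methodology": " ".join(sentences[1:-1]),
        "results": sentences[-1],
    }


def sentiment(text):
    """Lexicon score: number of positive words minus number of negative words."""
    tokens = tokenize(text)
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)


def rank_methods(abstracts, topic):
    """Rank the methodologies of abstracts on one topic by the sentiment of their results."""
    scored = []
    for abstract in abstracts:
        if classify(abstract) != topic:
            continue
        parts = segment(abstract)
        scored.append((sentiment(parts["results"]), parts["methodology"]))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Made-up example abstracts, for illustration only.
abstracts = [
    "Oil wells suffer from production anomalies. We train a random forest "
    "on sensor data. The results are promising and accurate.",
    "Detecting anomalies in oil production is hard. We apply a fixed "
    "threshold rule. The results were poor and unstable.",
    "Demand forecasting matters for planning. We fit a linear model. "
    "The forecast was accurate.",
]
ranking = rank_methods(abstracts, "anomaly_detection")
for score, method in ranking:
    print(score, method)
```

In a realistic setting each stage would presumably be replaced by a trained model; the sketch only shows how the output of one stage feeds the next.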
Related papers
- Detecting Statements in Text: A Domain-Agnostic Few-Shot Solution [1.3654846342364308]
State-of-the-art approaches usually involve fine-tuning models on large annotated datasets, which are costly to produce.
We propose and release a qualitative and versatile few-shot learning methodology as a common paradigm for any claim-based textual classification task.
We illustrate this methodology in the context of three tasks: climate change contrarianism detection, topic/stance classification and depression-related symptom detection.
arXiv Detail & Related papers (2024-05-09T12:03:38Z)
- Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback [57.816210168909286]
We leverage recent progress on textual entailment models to address this problem for abstractive summarization systems.
We use reinforcement learning with reference-free, textual entailment rewards to optimize for factual consistency.
Our results, according to both automatic metrics and human evaluation, show that our method considerably improves the faithfulness, salience, and conciseness of the generated summaries.
arXiv Detail & Related papers (2023-05-31T21:04:04Z)
- Detecting automatically the layout of clinical documents to enhance the performances of downstream natural language processing [53.797797404164946]
We designed an algorithm to process clinical PDF documents and extract only clinically relevant text.
The algorithm consists of several steps: initial text extraction using a PDF parser, followed by classification into such categories as body text, left notes, and footers.
Medical performance was evaluated by examining the extraction of medical concepts of interest from the text in their respective sections.
arXiv Detail & Related papers (2023-05-23T08:38:33Z)
- Unsupervised Scientific Abstract Segmentation with Normalized Mutual Information [4.129225533930966]
We empirically explore using Normalized Mutual Information (NMI) for abstract segmentation.
On non-structured abstracts, our proposed unsupervised approach GreedyCAS achieves the best performance across all evaluation metrics.
The strong correlation of NMI to our evaluation metrics reveals the effectiveness of NMI for abstract segmentation.
arXiv Detail & Related papers (2023-05-19T09:53:45Z)
- Lay Text Summarisation Using Natural Language Processing: A Narrative Literature Review [1.8899300124593648]
The aim of this literature review is to describe and compare the different text summarisation approaches used to generate lay summaries.
We screened 82 articles and included eight relevant papers published between 2020 and 2021, using the same dataset.
A combination of extractive and abstractive summarisation methods in a hybrid approach was found to be most effective.
arXiv Detail & Related papers (2023-03-24T18:30:50Z)
- The Factual Inconsistency Problem in Abstractive Text Summarization: A Survey [25.59111855107199]
Neural encoder-decoder models pioneered by the Seq2Seq framework have been proposed to achieve the goal of generating more abstractive summaries.
At a high level, such neural models can freely generate summaries without any constraint on the words or phrases used.
However, the neural model's abstraction ability is a double-edged sword.
arXiv Detail & Related papers (2021-04-30T08:46:13Z)
- Semantic Analysis for Automated Evaluation of the Potential Impact of Research Articles [62.997667081978825]
This paper presents a novel method for vector representation of text meaning based on information theory.
We show how this informational semantics is used for text classification on the basis of the Leicester Scientific Corpus.
We show that an informational approach to representing the meaning of a text has offered a way to effectively predict the scientific impact of research papers.
arXiv Detail & Related papers (2021-04-26T20:37:13Z)
- Topic-Centric Unsupervised Multi-Document Summarization of Scientific and News Articles [3.0504782036247438]
We propose a topic-centric unsupervised multi-document summarization framework to generate abstractive summaries.
The proposed algorithm generates an abstractive summary by developing salient language unit selection and text generation techniques.
Our approach matches the state-of-the-art when evaluated on automated extractive evaluation metrics and performs better for abstractive summarization on five human evaluation metrics.
arXiv Detail & Related papers (2020-11-03T04:04:21Z)
- Multi-Fact Correction in Abstractive Text Summarization [98.27031108197944]
Span-Fact is a suite of two factual correction models that leverages knowledge learned from question answering models to make corrections in system-generated summaries via span selection.
Our models employ single or multi-masking strategies to either iteratively or auto-regressively replace entities in order to ensure semantic consistency w.r.t. the source text.
Experiments show that our models significantly boost the factual consistency of system-generated summaries without sacrificing summary quality in terms of both automatic metrics and human evaluation.
arXiv Detail & Related papers (2020-10-06T02:51:02Z)
- A Survey on Text Classification: From Shallow to Deep Learning [83.47804123133719]
The last decade has seen a surge of research in this area due to the unprecedented success of deep learning.
This paper fills the gap by reviewing the state-of-the-art approaches from 1961 to 2021.
We create a taxonomy for text classification according to the text involved and the models used for feature extraction and classification.
arXiv Detail & Related papers (2020-08-02T00:09:03Z)
- A computational model implementing subjectivity with the 'Room Theory'. The case of detecting Emotion from Text [68.8204255655161]
This work introduces a new method to consider subjectivity and general context dependency in text analysis.
By using a similarity measure between words, we are able to extract the relative relevance of the elements in the benchmark.
This method could be applied to all the cases where evaluating subjectivity is relevant to understand the relative value or meaning of a text.
arXiv Detail & Related papers (2020-05-12T21:26:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.