UnScientify: Detecting Scientific Uncertainty in Scholarly Full Text
- URL: http://arxiv.org/abs/2307.14236v1
- Date: Wed, 26 Jul 2023 15:04:24 GMT
- Title: UnScientify: Detecting Scientific Uncertainty in Scholarly Full Text
- Authors: Panggih Kusuma Ningrum, Philipp Mayr, Iana Atanassova
- Abstract summary: UnScientify is an interactive system designed to detect scientific uncertainty in scholarly full text.
The pipeline for the system includes a combination of pattern matching, complex sentence checking, and authorial reference checking.
UnScientify provides interpretable results, aiding in the comprehension of identified instances of scientific uncertainty in text.
- Score: 5.318135784473086
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This demo paper presents UnScientify, an interactive system designed to
detect scientific uncertainty in scholarly full text. The system utilizes a
weakly supervised technique that employs a fine-grained annotation scheme to
identify verbally formulated uncertainty at the sentence level in scientific
texts. The pipeline for the system includes a combination of pattern matching,
complex sentence checking, and authorial reference checking. Our approach
automates labeling and annotation tasks for scientific uncertainty
identification, taking into account different types of scientific uncertainty,
and can serve various applications such as information retrieval, text mining,
and scholarly document processing. Additionally, UnScientify provides
interpretable results, aiding in the comprehension of identified instances of
scientific uncertainty in text.
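
To make the pipeline description concrete, here is a minimal sketch of a rule-based, sentence-level pass combining pattern matching, a complex-sentence check, and an authorial-reference check. The cue list, the complexity heuristic, and the first-person patterns below are invented placeholders for illustration only; they are not the actual UnScientify patterns or annotation scheme.

```python
import re

# Placeholder hedging cues; UnScientify's actual span patterns are richer.
UNCERTAINTY_CUES = re.compile(
    r"\b(may|might|could|appear(?:s)? to|suggest(?:s)?|possibly|likely|unclear)\b",
    re.IGNORECASE,
)
# Placeholder authorial-reference patterns (first-person forms).
AUTHOR_REFERENCE = re.compile(r"\b(we|our|us|the authors?)\b", re.IGNORECASE)


def is_complex(sentence: str) -> bool:
    """Crude stand-in for complex sentence checking: flag multi-clause sentences."""
    return sentence.count(",") >= 2 or bool(
        re.search(r"\b(although|whereas|while|however)\b", sentence, re.IGNORECASE)
    )


def analyse(sentence: str) -> dict:
    """Run the three illustrative checks on one sentence."""
    cues = UNCERTAINTY_CUES.findall(sentence)
    return {
        "sentence": sentence,
        "uncertain": bool(cues),
        "cues": cues,
        "complex_sentence": is_complex(sentence),
        "authorial_reference": bool(AUTHOR_REFERENCE.search(sentence)),
    }


if __name__ == "__main__":
    text = (
        "These results suggest that the effect may depend on sample size. "
        "We measured the response in all conditions."
    )
    for sent in re.split(r"(?<=[.!?])\s+", text):
        print(analyse(sent))
```

A pass of this kind is interpretable by construction: each flagged sentence carries the cue spans and checks that triggered it, which is the kind of interpretable output the abstract refers to.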
Related papers
- Annotating Scientific Uncertainty: A comprehensive model using linguistic patterns and comparison with existing approaches [1.9627519910539217]
UnScientify is a system designed to detect scientific uncertainty in scholarly full text.
The core methodology of UnScientify is based on a multi-faceted pipeline that integrates span pattern matching, complex sentence analysis and author reference checking.
The evaluation results highlight the trade-offs between modern large language models (LLMs) and the UnScientify system.
arXiv Detail & Related papers (2025-03-14T13:21:59Z)
- Detecting Entities in the Astrophysics Literature: A Comparison of Word-based and Span-based Entity Recognition Methods [20.506920012146235]
We describe our entity recognition methods developed as part of the DEAL (Detecting Entities in the Astrophysics Literature) shared task.
The aim of the task is to build a system that can identify Named Entities in a dataset composed of scholarly articles from the astrophysics literature.
arXiv Detail & Related papers (2022-11-24T23:07:48Z)
- SciFact-Open: Towards open-domain scientific claim verification [61.288725621156864]
We present SciFact-Open, a new test collection designed to evaluate the performance of scientific claim verification systems.
We collect evidence for scientific claims by pooling and annotating the top predictions of four state-of-the-art scientific claim verification models.
We find that systems developed on smaller corpora struggle to generalize to SciFact-Open, exhibiting performance drops of at least 15 F1.
arXiv Detail & Related papers (2022-10-25T05:45:00Z)
- Textual Entailment Recognition with Semantic Features from Empirical Text Representation [60.31047947815282]
A text entails a hypothesis if and only if the truth of the hypothesis follows from the text.
In this paper, we propose a novel approach to identifying the textual entailment relationship between text and hypothesis.
We employ an element-wise Manhattan distance vector-based feature that can identify the semantic entailment relationship between the text-hypothesis pair (a small sketch of this feature follows below).
arXiv Detail & Related papers (2022-10-18T10:03:51Z)
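
As a rough sketch of the feature described in the entry above (not the paper's actual embeddings or classifier), the element-wise Manhattan distance between a text vector and a hypothesis vector can be fed directly to a generic classifier:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def manhattan_feature(text_vec: np.ndarray, hyp_vec: np.ndarray) -> np.ndarray:
    """Element-wise absolute difference between text and hypothesis embeddings."""
    return np.abs(text_vec - hyp_vec)


# Toy stand-ins: in the paper the embeddings come from an empirical text
# representation; random vectors are used here purely to show the plumbing.
rng = np.random.default_rng(0)
dim, n_pairs = 16, 8
text_embs = rng.normal(size=(n_pairs, dim))
hyp_embs = rng.normal(size=(n_pairs, dim))
labels = np.array([1, 0, 1, 0, 1, 0, 1, 1])  # 1 = entailment, 0 = not

features = np.array([manhattan_feature(t, h) for t, h in zip(text_embs, hyp_embs)])
clf = LogisticRegression().fit(features, labels)
print(clf.predict(features[:2]))
```

The point of keeping the distance element-wise rather than collapsing it to a single number is that the classifier sees per-dimension disagreement between the two representations.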
- An Informational Space Based Semantic Analysis for Scientific Texts [62.997667081978825]
This paper introduces computational methods for semantic analysis and for quantifying the meaning of short scientific texts.
The representation of science-specific meaning is standardised by using situation representations rather than psychological properties.
The research in this paper lays the foundation for a geometric representation of the meaning of texts.
arXiv Detail & Related papers (2022-05-31T11:19:32Z)
- The Familiarity Hypothesis: Explaining the Behavior of Deep Open Set Methods [86.39044549664189]
Anomaly detection algorithms for feature-vector data identify anomalies as outliers, but outlier detection has not worked well in deep learning.
This paper proposes the Familiarity Hypothesis that these methods succeed because they are detecting the absence of familiar learned features rather than the presence of novelty.
The paper concludes with a discussion of whether familiarity detection is an inevitable consequence of representation learning.
arXiv Detail & Related papers (2022-03-04T18:32:58Z)
- SciClops: Detecting and Contextualizing Scientific Claims for Assisting Manual Fact-Checking [7.507186058512835]
This paper describes SciClops, a method to help combat online scientific misinformation.
SciClops involves three main steps to process scientific claims found in online news articles and social media postings.
It effectively assists non-expert fact-checkers in the verification of complex scientific claims, outperforming commercial fact-checking systems.
arXiv Detail & Related papers (2021-10-25T16:35:58Z)
- Expressing High-Level Scientific Claims with Formal Semantics [0.8258451067861932]
We analyze the main claims from a sample of scientific articles from all disciplines.
We find that their semantics are more complex than what a straightforward application of formalisms like RDF or OWL accounts for.
We propose a five-slot "super-pattern" and show how instantiating its slots leads to a strictly defined statement in higher-order logic.
arXiv Detail & Related papers (2021-09-27T09:52:49Z)
- Analyzing Non-Textual Content Elements to Detect Academic Plagiarism [0.8490310884703459]
The thesis proposes plagiarism detection approaches that implement a different concept: analyzing non-textual content in academic documents.
To demonstrate the benefit of combining non-textual and text-based detection methods, the thesis describes the first plagiarism detection system that integrates the analysis of citation-based, image-based, math-based, and text-based document similarity (a simple score-fusion sketch follows below).
arXiv Detail & Related papers (2021-06-10T14:11:52Z)
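
One simple way to picture the integration of citation-, image-, math-, and text-based similarity mentioned above is a weighted fusion of per-channel scores. The weights and example scores below are invented for illustration and are not taken from the thesis:

```python
from typing import Dict

# Hypothetical per-channel weights; the actual system's combination differs.
CHANNEL_WEIGHTS: Dict[str, float] = {
    "citation": 0.3,
    "image": 0.2,
    "math": 0.2,
    "text": 0.3,
}


def fuse_similarities(scores: Dict[str, float]) -> float:
    """Weighted average of per-channel document similarity scores in [0, 1]."""
    total_weight = sum(CHANNEL_WEIGHTS[c] for c in scores)
    return sum(CHANNEL_WEIGHTS[c] * s for c, s in scores.items()) / total_weight


# Example: strong citation and math overlap, weak textual overlap.
print(fuse_similarities({"citation": 0.8, "image": 0.1, "math": 0.7, "text": 0.2}))
```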
- CitationIE: Leveraging the Citation Graph for Scientific Information Extraction [89.33938657493765]
We use the citation graph of referential links between citing and cited papers.
We observe a sizable improvement in end-to-end information extraction over the state-of-the-art.
arXiv Detail & Related papers (2021-06-03T03:00:12Z)
- Semantic Analysis for Automated Evaluation of the Potential Impact of Research Articles [62.997667081978825]
This paper presents a novel method for vector representation of text meaning based on information theory.
We show how this informational semantics is used for text classification on the basis of the Leicester Scientific Corpus.
We show that an informational approach to representing the meaning of a text offers an effective way to predict the scientific impact of research papers.
arXiv Detail & Related papers (2021-04-26T20:37:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information (including all listed details) and is not responsible for any consequences of its use.