Related papers: Scientific Claim Verification with VERT5ERINI

Scientific Claim Verification with VERT5ERINI

URL: http://arxiv.org/abs/2010.11930v1
Date: Thu, 22 Oct 2020 17:56:33 GMT
Title: Scientific Claim Verification with VERT5ERINI
Authors: Ronak Pradeep, Xueguang Ma, Rodrigo Nogueira, Jimmy Lin
Abstract summary: This work describes the adaptation of a pretrained sequence-to-sequence model to the task of scientific claim verification in the biomedical domain. We propose VERT5ERINI that exploits T5 for abstract retrieval, sentence selection and label prediction. We evaluate our pipeline on SCIFACT, a newly curated dataset that requires models to not just predict the veracity of claims but also provide relevant sentences from a corpus of scientific literature that support this decision.
Score: 57.103189505636614
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This work describes the adaptation of a pretrained sequence-to-sequence model to the task of scientific claim verification in the biomedical domain. We propose VERT5ERINI that exploits T5 for abstract retrieval, sentence selection and label prediction, which are three critical sub-tasks of claim verification. We evaluate our pipeline on SCIFACT, a newly curated dataset that requires models to not just predict the veracity of claims but also provide relevant sentences from a corpus of scientific literature that support this decision. Empirically, our pipeline outperforms a strong baseline in each of the three steps. Finally, we show VERT5ERINI's ability to generalize to two new datasets of COVID-19 claims using evidence from the ever-expanding CORD-19 corpus.

Related papers

Contrastive Learning to Improve Retrieval for Real-world Fact Checking [84.57583869042791]
We present Contrastive Fact-Checking Reranker (CFR), an improved retriever for fact-checking complex claims. We leverage the AVeriTeC dataset, which annotates subquestions for claims with human written answers from evidence documents. We find a 6% improvement in veracity classification accuracy on the dataset.
arXiv Detail & Related papers (2024-10-07T00:09:50Z)
WiCE: Real-World Entailment for Claims in Wikipedia [63.234352061821625]
We propose WiCE, a new fine-grained textual entailment dataset built on natural claim and evidence pairs extracted from Wikipedia. In addition to standard claim-level entailment, WiCE provides entailment judgments over sub-sentence units of the claim. We show that real claims in our dataset involve challenging verification and retrieval problems that existing models fail to address.
arXiv Detail & Related papers (2023-03-02T17:45:32Z)
BLIAM: Literature-based Data Synthesis for Synergistic Drug Combination Prediction [13.361489059744754]
BLIAM generates training data points that are interpretable and model-agnostic to downstream applications. BLIAM can be further used to synthesize data points for novel drugs and cell lines that were not even measured in biomedical experiments.
arXiv Detail & Related papers (2023-02-14T06:48:52Z)
GERE: Generative Evidence Retrieval for Fact Verification [57.78768817972026]
We propose GERE, the first system that retrieves evidences in a generative fashion. The experimental results on the FEVER dataset show that GERE achieves significant improvements over the state-of-the-art baselines.
arXiv Detail & Related papers (2022-04-12T03:49:35Z)
Abstract, Rationale, Stance: A Joint Model for Scientific Claim Verification [18.330265729989843]
We propose an approach, named as ARSJoint, that jointly learns the modules for the three tasks with a machine reading comprehension framework. The experimental results on the benchmark dataset SciFact show that our approach outperforms the existing works.
arXiv Detail & Related papers (2021-09-13T10:07:26Z)
An Interpretable End-to-end Fine-tuning Approach for Long Clinical Text [72.62848911347466]
Unstructured clinical text in EHRs contains crucial information for applications including decision support, trial matching, and retrospective research. Recent work has applied BERT-based models to clinical information extraction and text classification, given these models' state-of-the-art performance in other NLP domains. In this work, we propose a novel fine-tuning approach called SnipBERT. Instead of using entire notes, SnipBERT identifies crucial snippets and feeds them into a truncated BERT-based model in a hierarchical manner.
arXiv Detail & Related papers (2020-11-12T17:14:32Z)
Text Mining to Identify and Extract Novel Disease Treatments From Unstructured Datasets [56.38623317907416]
We use Google Cloud to transcribe podcast episodes of an NPR radio show. We then build a pipeline for systematically pre-processing the text. Our model successfully identified that Omeprazole can help treat heartburn.
arXiv Detail & Related papers (2020-10-22T19:52:49Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.