SciClaimHunt: A Large Dataset for Evidence-based Scientific Claim Verification
- URL: http://arxiv.org/abs/2502.10003v1
- Date: Fri, 14 Feb 2025 08:34:26 GMT
- Title: SciClaimHunt: A Large Dataset for Evidence-based Scientific Claim Verification
- Authors: Sujit Kumar, Anshul Sharma, Siddharth Hemant Khincha, Gargi Shroff, Sanasam Ranbir Singh, Rahul Mishra
- Abstract summary: We introduce two large-scale datasets, SciClaimHunt and SciClaimHunt_Num, derived from scientific research papers.
We propose several baseline models tailored for scientific claim verification to assess the effectiveness of these datasets.
We evaluate models trained on SciClaimHunt and SciClaimHunt_Num against existing scientific claim verification datasets to gauge their quality and reliability.
- Abstract: Verifying scientific claims presents a significantly greater challenge than verifying political or news-related claims. Unlike the relatively broad audience for political claims, the users of scientific claim verification systems can vary widely, ranging from researchers testing specific hypotheses to everyday users seeking information on a medication. Additionally, the evidence for scientific claims is often highly complex, involving technical terminology and intricate domain-specific concepts that require specialized models for accurate verification. Despite considerable interest from the research community, there is a noticeable lack of large-scale scientific claim verification datasets to benchmark and train effective models. To bridge this gap, we introduce two large-scale datasets, SciClaimHunt and SciClaimHunt_Num, derived from scientific research papers. We propose several baseline models tailored for scientific claim verification to assess the effectiveness of these datasets. Additionally, we evaluate models trained on SciClaimHunt and SciClaimHunt_Num against existing scientific claim verification datasets to gauge their quality and reliability. Furthermore, we conduct human evaluations of the claims in proposed datasets and perform error analysis to assess the effectiveness of the proposed baseline models. Our findings indicate that SciClaimHunt and SciClaimHunt_Num serve as highly reliable resources for training models in scientific claim verification.
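The abstract does not specify the baseline architectures, but the task it describes is a standard pair-classification setup: encode a claim together with its evidence and predict a veracity label. The sketch below is a minimal, hypothetical illustration of that interface using an off-the-shelf Hugging Face encoder; the model name, label set, and example inputs are assumptions, and the untrained classification head only demonstrates the input/output shape, not the paper's actual baselines.

```python
# Hypothetical claim-verification sketch: pair-encode (claim, evidence) and
# predict a label. Model name, labels, and inputs are illustrative assumptions;
# the classification head is untrained, so predictions are not meaningful.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL = "bert-base-uncased"  # placeholder encoder, not from the paper
LABELS = ["SUPPORTS", "REFUTES", "NOT_ENOUGH_INFO"]

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=len(LABELS))

def verify(claim: str, evidence: str) -> str:
    # Encode claim and evidence as a single sequence pair.
    inputs = tokenizer(claim, evidence, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return LABELS[int(logits.argmax(dim=-1))]

print(verify("Drug X reduces systolic blood pressure by 20%.",
             "In a randomized trial, Drug X lowered systolic pressure by 19.8%."))
```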
Related papers
- Smoke and Mirrors in Causal Downstream Tasks [59.90654397037007]
This paper looks at the causal inference task of treatment effect estimation, where the outcome of interest is recorded in high-dimensional observations.
We compare 6,480 models fine-tuned from state-of-the-art visual backbones, and find that the sampling and modeling choices significantly affect the accuracy of the causal estimate.
Our results suggest that future benchmarks should carefully consider real downstream scientific questions, especially causal ones.
arXiv Detail & Related papers (2024-05-27T13:26:34Z) - Evaluating the Effectiveness of Retrieval-Augmented Large Language Models in Scientific Document Reasoning [0.0]
Large language models (LLMs) often provide seemingly plausible but non-factual information, commonly referred to as hallucinations.
Retrieval-augmented LLMs provide a non-parametric approach to solve these issues by retrieving relevant information from external data sources.
We critically evaluate these models in their ability to perform in scientific document reasoning tasks.
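As a rough illustration of the retrieve-then-read setup this entry evaluates, the sketch below ranks a toy corpus with TF-IDF and builds a grounded prompt for a generator (the generator call itself is omitted). The corpus, question, and prompt template are invented for illustration and are not from the cited paper.

```python
# Minimal retrieve-then-read sketch: retrieve the most similar passage, then
# condition a generator on it. Corpus, question, and prompt are assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = [
    "SciFact contains 1.4K expert-written scientific claims with evidence abstracts.",
    "Retrieval-augmented models fetch external passages to ground their answers.",
    "Hallucinations are fluent but factually unsupported model outputs.",
]
question = "Why might retrieval-augmented LLMs hallucinate less?"

# 1) Retrieve: rank corpus passages by TF-IDF similarity to the question.
vectorizer = TfidfVectorizer().fit(corpus + [question])
scores = cosine_similarity(vectorizer.transform([question]),
                           vectorizer.transform(corpus))[0]
top_passage = corpus[scores.argmax()]

# 2) Read: build a grounded prompt for the generator (generation step omitted).
prompt = f"Context: {top_passage}\nQuestion: {question}\nAnswer using only the context:"
print(prompt)
```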
arXiv Detail & Related papers (2023-11-07T21:09:57Z) - SCITAB: A Challenging Benchmark for Compositional Reasoning and Claim Verification on Scientific Tables [68.76415918462418]
We present SCITAB, a challenging evaluation dataset consisting of 1.2K expert-verified scientific claims.
Through extensive evaluations, we demonstrate that SCITAB poses a significant challenge to state-of-the-art models.
Our analysis uncovers several unique challenges posed by SCITAB, including table grounding, claim ambiguity, and compositional reasoning.
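SCITAB-style verification requires grounding a claim in a table. One common way to do this (assumed here, not prescribed by the SCITAB paper) is to linearize the table into text so a standard pair-encoder can consume it. The table and claim below are invented for illustration.

```python
# Illustrative table-grounding sketch: serialize a table into text so it can be
# paired with a claim for verification. Table contents and claim are invented.
table = {
    "columns": ["Model", "Accuracy"],
    "rows": [["BERT", "71.2"], ["RoBERTa", "74.8"]],
}
claim = "RoBERTa outperforms BERT by more than 3 accuracy points."

def linearize(table: dict) -> str:
    # Flatten header and rows into a single delimited string.
    header = " | ".join(table["columns"])
    rows = [" | ".join(r) for r in table["rows"]]
    return header + " ; " + " ; ".join(rows)

# The linearized string would be fed to a verifier together with the claim.
print(f"claim: {claim}\ntable: {linearize(table)}")
```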
arXiv Detail & Related papers (2023-05-22T16:13:50Z) - SciFact-Open: Towards open-domain scientific claim verification [61.288725621156864]
We present SciFact-Open, a new test collection designed to evaluate the performance of scientific claim verification systems.
We collect evidence for scientific claims by pooling and annotating the top predictions of four state-of-the-art scientific claim verification models.
We find that systems developed on smaller corpora struggle to generalize to SciFact-Open, exhibiting performance drops of at least 15 F1.
arXiv Detail & Related papers (2022-10-25T05:45:00Z) - Modeling Information Change in Science Communication with Semantically Matched Paraphrases [50.67030449927206]
SPICED is the first paraphrase dataset of scientific findings annotated for degree of information change.
SPICED contains 6,000 scientific finding pairs extracted from news stories, social media discussions, and full texts of original papers.
Models trained on SPICED improve downstream performance on evidence retrieval for fact checking of real-world scientific claims.
arXiv Detail & Related papers (2022-10-24T07:44:38Z) - Generating Scientific Claims for Zero-Shot Scientific Fact Checking [54.62086027306609]
Automated scientific fact checking is difficult due to the complexity of scientific language and the lack of large amounts of training data.
We propose scientific claim generation, the task of generating one or more atomic and verifiable claims from scientific sentences.
We also demonstrate its usefulness in zero-shot fact checking for biomedical claims.
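As a rough sketch of the claim-generation idea, the snippet below uses an off-the-shelf text-to-text model to rewrite a scientific sentence into a short declarative statement. The model choice (t5-small) and the prompt are assumptions; the cited paper's generator and training setup are not reproduced here, and output quality from an unadapted model will be limited.

```python
# Hypothetical claim-generation sketch: rewrite a scientific sentence into a
# short declarative claim with a generic text-to-text model. t5-small and the
# "summarize:" prefix are illustrative stand-ins, not the cited paper's system.
from transformers import pipeline

generator = pipeline("text2text-generation", model="t5-small")

sentence = ("We observed a 19.8% reduction in systolic blood pressure "
            "among participants receiving Drug X versus placebo.")
prompt = f"summarize: {sentence}"  # reuse a prefix t5-small saw in pretraining

claims = generator(prompt, max_length=32)
print(claims[0]["generated_text"])
```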
arXiv Detail & Related papers (2022-03-24T11:29:20Z) - RerrFact: Reduced Evidence Retrieval Representations for Scientific Claim Verification [4.052777228128475]
We propose a modular approach that sequentially carries out binary classification for every prediction subtask.
We carry out two-step stance predictions that first differentiate non-relevant rationales and then identify supporting or refuting rationales for a given claim.
Experimentally, our system RerrFact, with no fine-tuning, a simple design, and a fraction of the model parameters, fares competitively on the leaderboard.
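The two-step, modular design described for RerrFact (first filter non-relevant rationales, then decide support vs. refute) can be pictured as the toy pipeline below. The keyword heuristics are deliberately naive stand-ins for the paper's binary classifiers; only the control flow is meant to mirror the description.

```python
# Toy two-step stance pipeline: step 1 filters non-relevant rationale sentences,
# step 2 labels the rest as SUPPORTS or REFUTES. The lexical heuristics are
# illustrative placeholders for the paper's learned binary classifiers.
from typing import List, Tuple

def is_relevant(claim: str, sentence: str) -> bool:
    # Step 1: binary relevance check (here: naive lexical overlap).
    claim_terms = set(claim.lower().split())
    return len(claim_terms & set(sentence.lower().split())) >= 2

def stance(sentence: str) -> str:
    # Step 2: binary support/refute decision (here: naive negation cues).
    cues = {"not", "no", "fails", "contradicts", "refutes"}
    return "REFUTES" if cues & set(sentence.lower().split()) else "SUPPORTS"

def predict(claim: str, rationales: List[str]) -> List[Tuple[str, str]]:
    relevant = [s for s in rationales if is_relevant(claim, s)]
    return [(s, stance(s)) for s in relevant]

claim = "Vitamin D supplementation reduces fracture risk."
rationales = [
    "The trial found vitamin D supplementation did not reduce fracture risk.",
    "Participants were recruited from three hospitals.",
]
print(predict(claim, rationales))
```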
arXiv Detail & Related papers (2022-02-05T21:52:45Z) - SciClops: Detecting and Contextualizing Scientific Claims for Assisting Manual Fact-Checking [7.507186058512835]
This paper describes SciClops, a method to help combat online scientific misinformation.
SciClops involves three main steps to process scientific claims found in online news articles and social media postings.
It effectively assists non-expert fact-checkers in the verification of complex scientific claims, outperforming commercial fact-checking systems.
arXiv Detail & Related papers (2021-10-25T16:35:58Z) - Fact or Fiction: Verifying Scientific Claims [53.29101835904273]
We introduce scientific claim verification, a new task to select abstracts from the research literature containing evidence that SUPPORTS or REFUTES a given scientific claim.
We construct SciFact, a dataset of 1.4K expert-written scientific claims paired with evidence-containing abstracts annotated with labels and rationales.
We show that our system is able to verify claims related to COVID-19 by identifying evidence from the CORD-19 corpus.
arXiv Detail & Related papers (2020-04-30T17:22:57Z)