Automated Fact-Checking of Climate Change Claims with Large Language
Models
- URL: http://arxiv.org/abs/2401.12566v1
- Date: Tue, 23 Jan 2024 08:49:23 GMT
- Title: Automated Fact-Checking of Climate Change Claims with Large Language
Models
- Authors: Markus Leippold and Saeid Ashraf Vaghefi and Dominik Stammbach and
Veruska Muccione and Julia Bingler and Jingwei Ni and Chiara Colesanti-Senni
and Tobias Wekhof and Tobias Schimanski and Glen Gostlow and Tingyu Yu and
Juerg Luterbacher and Christian Huggel
- Abstract summary: This paper presents Climinator, a novel AI-based tool designed to automate the fact-checking of climate change claims.
Climinator employs an innovative Mediator-Advocate framework to synthesize varying scientific perspectives.
Our model demonstrates remarkable accuracy when testing claims collected from Climate Feedback and Skeptical Science.
- Score: 3.1080484250243425
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents Climinator, a novel AI-based tool designed to automate
the fact-checking of climate change claims. Utilizing an array of Large
Language Models (LLMs) informed by authoritative sources like the IPCC reports
and peer-reviewed scientific literature, Climinator employs an innovative
Mediator-Advocate framework. This design allows Climinator to effectively
synthesize varying scientific perspectives, leading to robust, evidence-based
evaluations. Our model demonstrates remarkable accuracy when testing claims
collected from Climate Feedback and Skeptical Science. Notably, when
integrating an advocate with a climate science denial perspective in our
framework, Climinator's iterative debate process reliably converges towards
scientific consensus, underscoring its adeptness at reconciling diverse
viewpoints into science-based, factual conclusions. While our research is
subject to certain limitations and necessitates careful interpretation, our
approach holds significant potential. We hope to stimulate further research and
encourage exploring its applicability in other contexts, including political
fact-checking and legal domains.
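The abstract describes an iterative Mediator-Advocate debate that converges toward consensus. The paper's actual prompts, models, and convergence criterion are not given here, so the following is a minimal illustrative sketch with hypothetical names (`Advocate`, `mediate`) and toy scoring logic, not Climinator's implementation:

```python
# Illustrative sketch only: all names and the convergence logic below are
# hypothetical stand-ins for a Mediator-Advocate debate loop.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Advocate:
    name: str
    # Maps (claim, debate transcript) to a verdict in [0, 1],
    # where 1.0 means "fully supported by the cited evidence".
    assess: Callable[[str, List[str]], float]


def mediate(claim: str, advocates: List[Advocate],
            rounds: int = 5, tol: float = 0.05) -> float:
    """Collect advocate verdicts each round and stop once the mediator's
    consensus score stabilizes, mimicking an iterative debate process."""
    transcript: List[str] = []
    consensus = 0.5  # uninformative prior
    for _ in range(rounds):
        verdicts = [a.assess(claim, transcript) for a in advocates]
        new_consensus = sum(verdicts) / len(verdicts)
        transcript.append(f"verdicts: {verdicts}")
        if abs(new_consensus - consensus) < tol:
            return new_consensus  # debate has converged
        consensus = new_consensus
    return consensus


# Toy advocates: one anchored to the scientific evidence, one contrarian
# that gradually updates toward it as the debate proceeds.
def science_view(claim: str, transcript: List[str]) -> float:
    return 0.9


_denial_state = {"v": 0.1}


def denial_view(claim: str, transcript: List[str]) -> float:
    _denial_state["v"] = 0.5 * _denial_state["v"] + 0.5 * 0.9
    return _denial_state["v"]


score = mediate("Global temperatures are rising.",
                [Advocate("science", science_view),
                 Advocate("denial", denial_view)])
# The consensus drifts toward the evidence-anchored verdict, echoing the
# paper's observation that the debate converges on scientific consensus.
```

In this toy run the contrarian advocate's verdict is pulled toward the evidence each round, so the mediator's consensus score ends well above the contrarian's starting position.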
Related papers
- Evaluating Large Language Models in Scientific Discovery [91.732562776782]
Large language models (LLMs) are increasingly applied to scientific research, yet prevailing science benchmarks probe decontextualized knowledge. We introduce a scenario-grounded benchmark that evaluates LLMs across biology, chemistry, materials, and physics. The framework assesses models at two levels: (i) question-level accuracy on scenario-tied items and (ii) project-level performance.
arXiv Detail & Related papers (2025-12-17T16:20:03Z)
- CAIRNS: Balancing Readability and Scientific Accuracy in Climate Adaptation Question Answering [10.31170458584116]
We present Climate Adaptation question-answering with Improved Readability and Noted Sources (CAIRNS). CAIRNS is a framework that enables experts to obtain credible preliminary answers from complex evidence sources from the web. It enhances readability and citation reliability through a structured ScholarGuide prompt and achieves robust evaluation.
arXiv Detail & Related papers (2025-12-01T22:44:43Z)
- Autonomous Agents for Scientific Discovery: Orchestrating Scientists, Language, Code, and Physics [82.55776608452017]
Large language models (LLMs) provide a flexible and versatile framework that orchestrates interactions with human scientists, natural language, computer language and code, and physics. This paper presents our view and vision of LLM-based scientific agents and their growing role in transforming the scientific discovery lifecycle. We identify open research challenges and outline promising directions for building more robust, generalizable, and adaptive scientific agents.
arXiv Detail & Related papers (2025-10-10T22:26:26Z)
- A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers [251.23085679210206]
Scientific Large Language Models (Sci-LLMs) are transforming how knowledge is represented, integrated, and applied in scientific research. This survey reframes the development of Sci-LLMs as a co-evolution between models and their underlying data substrate. We formulate a unified taxonomy of scientific data and a hierarchical model of scientific knowledge.
arXiv Detail & Related papers (2025-08-28T18:30:52Z)
- Dynamic Knowledge Exchange and Dual-diversity Review: Concisely Unleashing the Potential of a Multi-Agent Research Team [53.38438460574943]
IDVSCI is a multi-agent framework built on large language models (LLMs). It incorporates two key innovations: a Dynamic Knowledge Exchange mechanism and a Dual-Diversity Review paradigm. Results show that IDVSCI consistently achieves the best performance across two datasets.
arXiv Detail & Related papers (2025-06-23T07:12:08Z)
- Bayesian Epistemology with Weighted Authority: A Formal Architecture for Truth-Promoting Autonomous Scientific Reasoning [0.0]
This paper introduces Bayesian Epistemology with Weighted Authority (BEWA). BEWA operationalises belief as a dynamic, probabilistically coherent function over structured scientific claims. It supports graph-based claim propagation, authorial credibility modelling, cryptographic anchoring, and zero-knowledge audit verification.
arXiv Detail & Related papers (2025-06-19T04:22:35Z)
- Transforming Science with Large Language Models: A Survey on AI-assisted Scientific Discovery, Experimentation, Content Generation, and Evaluation [58.064940977804596]
A plethora of new AI models and tools has been proposed, promising to empower researchers and academics worldwide to conduct their research more effectively and efficiently.
Ethical concerns regarding shortcomings of these tools and potential for misuse take a particularly prominent place in our discussion.
arXiv Detail & Related papers (2025-02-07T18:26:45Z)
- Towards unearthing neglected climate innovations from scientific literature using Large Language Models [0.0]
This study employs a curated dataset sourced from OpenAlex, a comprehensive repository of scientific papers.
We evaluate title-abstract pairs from scientific papers on seven dimensions, covering climate change mitigation potential, stage of technological development, and readiness for deployment.
The outputs of the language models are then compared with human evaluations to assess their effectiveness in identifying promising yet overlooked climate innovations.
arXiv Detail & Related papers (2024-11-15T09:17:40Z)
- LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery [141.39722070734737]
We propose to enhance the knowledge-driven, abstract reasoning abilities of Large Language Models with the computational strength of simulations.
We introduce Scientific Generative Agent (SGA), a bilevel optimization framework.
We conduct experiments to demonstrate our framework's efficacy in law discovery and molecular design.
arXiv Detail & Related papers (2024-05-16T03:04:10Z)
- ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models [56.08917291606421]
ResearchAgent is a large language model-powered research idea writing agent.
It generates problems, methods, and experiment designs while iteratively refining them based on scientific literature.
We experimentally validate our ResearchAgent on scientific publications across multiple disciplines.
arXiv Detail & Related papers (2024-04-11T13:36:29Z)
- Using Natural Language Processing and Networks to Automate Structured Literature Reviews: An Application to Farmers Climate Change Adaptation [0.0]
This work aims to sensibly use Natural Language Processing by extracting relations between variables and synthesizing the findings using networks.
As an example, we apply our methodology to the analysis of farmers' adaptation to climate change.
Results show that the use of Natural Language Processing together with networks in a descriptive manner offers a fast and interpretable way to synthesize literature review findings.
arXiv Detail & Related papers (2023-06-16T10:05:47Z)
- Scientific Opinion Summarization: Paper Meta-review Generation Dataset, Methods, and Evaluation [55.00687185394986]
We propose the task of scientific opinion summarization, where research paper reviews are synthesized into meta-reviews.
We introduce the ORSUM dataset covering 15,062 paper meta-reviews and 57,536 paper reviews from 47 conferences.
Our experiments show that (1) human-written summaries do not always satisfy all necessary criteria such as depth of discussion, and identifying consensus and controversy for the specific domain, and (2) the combination of task decomposition and iterative self-refinement shows strong potential for enhancing the opinions.
arXiv Detail & Related papers (2023-05-24T02:33:35Z)
- Towards Answering Climate Questionnaires from Unstructured Climate Reports [26.036105166376284]
Activists and policymakers need NLP tools to process the vast and rapidly growing unstructured textual climate reports into structured form.
We introduce two new large-scale climate questionnaire datasets and use their existing structure to train self-supervised models.
We then use these models to help align texts from unstructured climate documents to the semi-structured questionnaires in a human pilot study.
arXiv Detail & Related papers (2023-01-11T00:22:56Z)
- SciFact-Open: Towards open-domain scientific claim verification [61.288725621156864]
We present SciFact-Open, a new test collection designed to evaluate the performance of scientific claim verification systems.
We collect evidence for scientific claims by pooling and annotating the top predictions of four state-of-the-art scientific claim verification models.
We find that systems developed on smaller corpora struggle to generalize to SciFact-Open, exhibiting performance drops of at least 15 F1.
arXiv Detail & Related papers (2022-10-25T05:45:00Z)
- Modeling Information Change in Science Communication with Semantically Matched Paraphrases [50.67030449927206]
SPICED is the first paraphrase dataset of scientific findings annotated for degree of information change.
SPICED contains 6,000 scientific finding pairs extracted from news stories, social media discussions, and full texts of original papers.
Models trained on SPICED improve downstream performance on evidence retrieval for fact checking of real-world scientific claims.
arXiv Detail & Related papers (2022-10-24T07:44:38Z)
- Generating Scientific Claims for Zero-Shot Scientific Fact Checking [54.62086027306609]
Automated scientific fact checking is difficult due to the complexity of scientific language and a lack of significant amounts of training data.
We propose scientific claim generation, the task of generating one or more atomic and verifiable claims from scientific sentences.
We also demonstrate its usefulness in zero-shot fact checking for biomedical claims.
arXiv Detail & Related papers (2022-03-24T11:29:20Z)
- CLIMATE-FEVER: A Dataset for Verification of Real-World Climate Claims [4.574830585715129]
We introduce CLIMATE-FEVER, a new dataset for verification of climate change-related claims.
We adapt the methodology of FEVER [1], the largest dataset of artificially designed claims, to real-life claims collected from the Internet.
We discuss the surprising, subtle complexity of modeling real-world climate-related claims within the FEVER framework.
arXiv Detail & Related papers (2020-12-01T16:32:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.