CAIRNS: Balancing Readability and Scientific Accuracy in Climate Adaptation Question Answering
- URL: http://arxiv.org/abs/2512.02251v1
- Date: Mon, 01 Dec 2025 22:44:43 GMT
- Title: CAIRNS: Balancing Readability and Scientific Accuracy in Climate Adaptation Question Answering
- Authors: Liangji Kong, Aditya Joshi, Sarvnaz Karimi
- Abstract summary: We present Climate Adaptation question-answering with Improved Readability and Noted Sources (CAIRNS), a framework that enables experts to obtain credible preliminary answers from complex evidence sources on the web. It enhances readability and citation reliability through a structured ScholarGuide prompt and achieves robust evaluation via a consistency-weighted hybrid evaluator.
- Score: 10.31170458584116
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Climate adaptation strategies are proposed in response to climate change. They are practised in agriculture to sustain food production. These strategies can be found in unstructured data (for example, scientific literature from the Elsevier website) or structured data (heterogeneous climate data via government APIs). We present Climate Adaptation question-answering with Improved Readability and Noted Sources (CAIRNS), a framework that enables experts -- farmer advisors -- to obtain credible preliminary answers from complex evidence sources on the web. It enhances readability and citation reliability through a structured ScholarGuide prompt and achieves robust evaluation via a consistency-weighted hybrid evaluator that leverages inter-model agreement with experts. Together, these components enable readable, verifiable, and domain-grounded question answering without fine-tuning or reinforcement learning. Using a previously reported dataset of expert-curated question-answer pairs, we show that CAIRNS outperforms the baselines on most of the metrics. Our thorough ablation study confirms the results on all metrics. To validate our LLM-based evaluation, we also report an analysis of correlations against human judgment.
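The abstract does not spell out the consistency-weighted hybrid evaluator. The following is a minimal illustrative sketch, assuming that each LLM judge is weighted by its agreement with expert ratings on a calibration set and that final scores are the weighted average of judge scores. All function names and the 1-5 rating scale are assumptions for illustration, not the paper's actual formulation.

```python
# Hypothetical sketch of a consistency-weighted hybrid evaluator.
# Assumption: judges are weighted by their agreement with expert labels
# on a calibration set; the paper's real method may differ.

def agreement_weight(judge_scores, expert_scores):
    """Fraction of calibration items where the judge matches the expert."""
    matches = sum(1 for j, e in zip(judge_scores, expert_scores) if j == e)
    return matches / len(judge_scores)

def consistency_weighted_score(answer_scores, weights):
    """Weighted average of per-judge scores for one answer."""
    return sum(s * w for s, w in zip(answer_scores, weights)) / sum(weights)

# Calibration: two judges scored five items also rated by experts (1-5 scale).
expert  = [5, 4, 3, 4, 5]
judge_a = [5, 4, 3, 3, 5]   # agrees on 4/5 items
judge_b = [4, 4, 2, 4, 5]   # agrees on 3/5 items

w_a = agreement_weight(judge_a, expert)   # 0.8
w_b = agreement_weight(judge_b, expert)   # 0.6

# A new answer scored 4 by judge A and 2 by judge B.
final = consistency_weighted_score([4, 2], [w_a, w_b])
print(round(final, 3))  # (4*0.8 + 2*0.6) / 1.4 = 3.143
```

Under this reading, a judge that tracks expert judgment more closely pulls the final score toward its own rating, which is one plausible way to exploit inter-model agreement with experts.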
Related papers
- Epistemic Context Learning: Building Trust the Right Way in LLM-Based Multi-Agent Systems [94.9141394384021]
Individual agents in multi-agent systems often lack robustness, tending to blindly conform to misleading peers. We show this weakness stems from both sycophancy and an inadequate ability to evaluate peer reliability. We first formalize the learning problem of history-aware reference, introducing the historical interactions of peers as additional input. We then develop Epistemic Context Learning (ECL), a reasoning framework that conditions predictions on peer profiles explicitly built from history.
arXiv Detail & Related papers (2026-01-29T13:59:32Z) - SciRAG: Adaptive, Citation-Aware, and Outline-Guided Retrieval and Synthesis for Scientific Literature [52.36039386997026]
We introduce SciRAG, an open-source framework for scientific literature exploration. We introduce three key innovations: (1) adaptive retrieval that flexibly alternates between sequential and parallel evidence gathering; (2) citation-aware symbolic reasoning that leverages citation graphs to organize and filter documents; and (3) outline-guided synthesis that plans, critiques, and refines answers to ensure coherence and transparent attribution.
arXiv Detail & Related papers (2025-11-18T11:09:19Z) - CLINB: A Climate Intelligence Benchmark for Foundational Models [31.884362929125363]
We introduce CLINB, a benchmark that assesses models on open-ended, grounded, multimodal question answering tasks. We implement and validate a model-based evaluation process and evaluate several frontier models.
arXiv Detail & Related papers (2025-10-29T16:15:42Z) - LiRA: A Multi-Agent Framework for Reliable and Readable Literature Review Generation [66.09346158850308]
We present LiRA (Literature Review Agents), a multi-agent collaborative workflow which emulates the human literature review process. LiRA utilizes specialized agents for content outlining, subsection writing, editing, and reviewing, producing cohesive and comprehensive review articles. We evaluate LiRA in real-world scenarios using document retrieval and assess its robustness to reviewer model variation.
arXiv Detail & Related papers (2025-10-01T12:14:28Z) - ClimateBench-M: A Multi-Modal Climate Data Benchmark with a Simple Generative Method [61.76389719956301]
We contribute a multi-modal climate benchmark, i.e., ClimateBench-M, which aligns time series climate data from ERA5, extreme weather events data from NOAA, and satellite image data from NASA. Under each data modality, we also propose a simple but strong generative method that could produce competitive performance in weather forecasting, thunderstorm alerts, and crop segmentation tasks.
arXiv Detail & Related papers (2025-04-10T02:22:23Z) - ClimaQA: An Automated Evaluation Framework for Climate Question Answering Models [38.05357439484919]
We develop ClimaGen, an adaptive learning framework that generates question-answer pairs from graduate textbooks with climate scientists in the loop. We present ClimaQA-Gold, an expert-annotated benchmark dataset, alongside ClimaQA-Silver, a large-scale, comprehensive synthetic QA dataset for climate science.
arXiv Detail & Related papers (2024-10-22T05:12:19Z) - Contrastive Learning to Improve Retrieval for Real-world Fact Checking [84.57583869042791]
We present Contrastive Fact-Checking Reranker (CFR), an improved retriever for fact-checking complex claims.
We leverage the AVeriTeC dataset, which annotates subquestions for claims with human written answers from evidence documents.
We find a 6% improvement in veracity classification accuracy on the dataset.
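The CFR summary does not give its training objective. As a rough illustration only, contrastive retrievers of this kind are commonly trained with an InfoNCE-style loss that pulls gold evidence toward the claim embedding and pushes distractors away; the sketch below assumes cosine similarities as inputs and is not CFR's actual loss.

```python
import math

def info_nce_loss(sim_pos, sim_negs, temperature=0.07):
    """Illustrative InfoNCE loss over one positive and several negative
    similarities (assumed to lie in [-1, 1]). Lower is better."""
    logits = [sim_pos / temperature] + [s / temperature for s in sim_negs]
    m = max(logits)  # subtract the max for numerical stability
    denom = sum(math.exp(l - m) for l in logits)
    return -(logits[0] - m) + math.log(denom)

# A claim whose gold evidence is far closer than the distractors yields a
# small loss; near-ambiguous retrievals yield a larger one.
easy = info_nce_loss(0.9, [0.1, 0.0, -0.2])
hard = info_nce_loss(0.4, [0.35, 0.30, 0.20])
print(easy < hard)  # True
```

Minimizing such a loss sharpens the ranking margin between relevant and irrelevant evidence, which is consistent with the reported gain in downstream veracity classification.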
arXiv Detail & Related papers (2024-10-07T00:09:50Z) - Automated Fact-Checking of Climate Change Claims with Large Language Models [3.1080484250243425]
This paper presents Climinator, a novel AI-based tool designed to automate the fact-checking of climate change claims.
Climinator employs an innovative Mediator-Advocate framework to synthesize varying scientific perspectives.
Our model demonstrates remarkable accuracy when testing claims collected from Climate Feedback and Skeptical Science.
arXiv Detail & Related papers (2024-01-23T08:49:23Z) - ClimateX: Do LLMs Accurately Assess Human Expert Confidence in Climate Statements? [0.0]
We introduce the Expert Confidence in Climate Statements (ClimateX) dataset, a novel, curated, expert-labeled dataset consisting of 8094 climate statements.
Using this dataset, we show that recent Large Language Models (LLMs) can classify human expert confidence in climate-related statements.
Overall, models exhibit consistent and significant over-confidence on low and medium confidence statements.
arXiv Detail & Related papers (2023-11-28T10:26:57Z) - Federated Prompt Learning for Weather Foundation Models on Devices [37.88417074427373]
On-device intelligence for weather forecasting uses local deep learning models to analyze weather patterns without centralized cloud computing.
This paper proposes Federated Prompt Learning for Weather Foundation Models on Devices (FedPoD).
FedPoD enables devices to obtain highly customized models while maintaining communication efficiency.
arXiv Detail & Related papers (2023-05-23T16:59:20Z) - Towards Answering Climate Questionnaires from Unstructured Climate Reports [26.036105166376284]
Activists and policymakers need NLP tools to process the vast and rapidly growing unstructured textual climate reports into structured form.
We introduce two new large-scale climate questionnaire datasets and use their existing structure to train self-supervised models.
We then use these models to help align texts from unstructured climate documents to the semi-structured questionnaires in a human pilot study.
arXiv Detail & Related papers (2023-01-11T00:22:56Z) - Analyzing Sustainability Reports Using Natural Language Processing [68.8204255655161]
In recent years, companies have increasingly been aiming to both mitigate their environmental impact and adapt to the changing climate context.
These efforts are documented in increasingly exhaustive reports, which cover many types of climate risks and exposures under the umbrella of Environmental, Social, and Governance (ESG).
We present this tool and the methodology that we used to develop it in the present article.
arXiv Detail & Related papers (2020-11-03T21:22:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.