Climate Change from Large Language Models
- URL: http://arxiv.org/abs/2312.11985v3
- Date: Mon, 1 Jul 2024 10:40:19 GMT
- Title: Climate Change from Large Language Models
- Authors: Hongyin Zhu, Prayag Tiwari
- Abstract summary: Climate change poses grave challenges, demanding widespread understanding and low-carbon lifestyle awareness.
Large language models (LLMs) offer a powerful tool to address this crisis.
This paper proposes an automated evaluation framework to assess climate-crisis knowledge.
- Score: 7.190384101545232
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Climate change poses grave challenges, demanding widespread understanding and low-carbon lifestyle awareness. Large language models (LLMs) offer a powerful tool to address this crisis, yet comprehensive evaluations of their climate-crisis knowledge are lacking. This paper proposes an automated evaluation framework to assess climate-crisis knowledge within LLMs. We adopt a hybrid approach for data acquisition, combining data synthesis and manual collection, to compile a diverse set of questions encompassing various aspects of climate change. Utilizing prompt engineering based on the compiled questions, we evaluate the model's knowledge by analyzing its generated answers. Furthermore, we introduce a comprehensive set of metrics to assess climate-crisis knowledge, encompassing indicators from 10 distinct perspectives. These metrics provide a multifaceted evaluation, enabling a nuanced understanding of the LLMs' climate crisis comprehension. The experimental results demonstrate the efficacy of our proposed method. In our evaluation utilizing diverse high-performing LLMs, we discovered that while LLMs possess considerable climate-related knowledge, there are shortcomings in terms of timeliness, indicating a need for continuous updating and refinement of their climate-related content.
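The evaluation pipeline described in the abstract (hybrid question acquisition, prompt engineering, and multi-perspective scoring) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the perspective names, prompt template, and scoring function are all hypothetical stand-ins for the paper's 10 metric dimensions.

```python
# Illustrative sketch of the evaluation framework: compile questions,
# prompt the model, and score answers along several perspectives.
from dataclasses import dataclass

# Hypothetical subset standing in for the paper's 10 metric perspectives.
PERSPECTIVES = ["accuracy", "timeliness", "completeness"]

@dataclass
class Question:
    text: str
    source: str  # "synthetic" or "manual", mirroring the hybrid acquisition

def build_prompt(question: Question) -> str:
    """Prompt engineering step: wrap a compiled question in instructions."""
    return f"Answer concisely and factually: {question.text}"

def score_answer(answer: str) -> dict:
    """Placeholder scorer returning one score per perspective.
    A real scorer would apply rubric-based or model-based judgments."""
    return {p: (1.0 if answer else 0.0) for p in PERSPECTIVES}

def evaluate(model, questions: list) -> dict:
    """Average per-perspective scores over all compiled questions."""
    totals = {p: 0.0 for p in PERSPECTIVES}
    for q in questions:
        answer = model(build_prompt(q))
        for p, s in score_answer(answer).items():
            totals[p] += s
    return {p: totals[p] / len(questions) for p in PERSPECTIVES}

# Usage with a stub model in place of a real LLM:
stub_model = lambda prompt: "Stub answer."
qs = [Question("What drives sea-level rise?", "manual"),
      Question("Define carbon footprint.", "synthetic")]
print(evaluate(stub_model, qs))  # every perspective averages to 1.0
```

A per-perspective aggregate like this is what allows the kind of finding the abstract reports, e.g. strong overall knowledge but a weak timeliness score.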
Related papers
- AGENT-CQ: Automatic Generation and Evaluation of Clarifying Questions for Conversational Search with LLMs [53.6200736559742]
AGENT-CQ consists of two stages: a generation stage and an evaluation stage.
CrowdLLM simulates human crowdsourcing judgments to assess generated questions and answers.
Experiments on the ClariQ dataset demonstrate CrowdLLM's effectiveness in evaluating question and answer quality.
arXiv Detail & Related papers (2024-10-25T17:06:27Z)
- ClimaQA: An Automated Evaluation Framework for Climate Foundation Models [38.05357439484919]
We develop ClimaGen, an automated framework that generates question-answer pairs from graduate textbooks with climate scientists in the loop.
We present ClimaQA-Gold, an expert-annotated benchmark dataset alongside ClimaQA-Silver, a large-scale, comprehensive synthetic QA dataset for climate science.
arXiv Detail & Related papers (2024-10-22T05:12:19Z)
- Unlearning Climate Misinformation in Large Language Models [17.95497650321137]
Misinformation regarding climate change is a key roadblock in addressing one of the most serious threats to humanity.
This paper investigates factual accuracy in large language models (LLMs) regarding climate information.
arXiv Detail & Related papers (2024-05-29T23:11:53Z)
- Comparing Data-Driven and Mechanistic Models for Predicting Phenology in Deciduous Broadleaf Forests [47.285748922842444]
We train a deep neural network to predict a phenological index from meteorological time series.
We find that this approach outperforms traditional process-based models.
arXiv Detail & Related papers (2024-01-08T15:29:23Z)
- A Comprehensive Study of Knowledge Editing for Large Language Models [82.65729336401027]
Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication.
This paper defines the knowledge editing problem and provides a comprehensive review of cutting-edge approaches.
We introduce a new benchmark, KnowEdit, for a comprehensive empirical evaluation of representative knowledge editing approaches.
arXiv Detail & Related papers (2024-01-02T16:54:58Z)
- Arabic Mini-ClimateGPT: A Climate Change and Sustainability Tailored Arabic LLM [77.17254959695218]
Large Language Models (LLMs) like ChatGPT and Bard have shown impressive conversational abilities and excel in a wide variety of NLP tasks.
We propose a light-weight Arabic Mini-ClimateGPT that is built on an open-source LLM and is specifically fine-tuned on a conversational-style instruction tuning Arabic dataset Clima500-Instruct.
Our model surpasses the baseline LLM in 88.3% of cases during ChatGPT-based evaluation.
arXiv Detail & Related papers (2023-12-14T22:04:07Z)
- ClimateX: Do LLMs Accurately Assess Human Expert Confidence in Climate Statements? [0.0]
We introduce the Expert Confidence in Climate Statements (ClimateX) dataset, a novel, curated, expert-labeled dataset consisting of 8094 climate statements.
Using this dataset, we show that recent Large Language Models (LLMs) can classify human expert confidence in climate-related statements.
Overall, models exhibit consistent and significant over-confidence on low and medium confidence statements.
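The over-confidence finding above amounts to comparing a model's predicted confidence level against the expert label on an ordered scale. A minimal sketch of that measurement (the label set and example data here are illustrative, not taken from the ClimateX dataset):

```python
# Sketch: fraction of statements where the model predicts a strictly
# higher confidence level than the human expert assigned.
LEVELS = ["low", "medium", "high", "very high"]  # ordered confidence scale

def overconfidence_rate(gold, pred):
    """Share of (expert, model) label pairs where the model overshoots."""
    rank = {lvl: i for i, lvl in enumerate(LEVELS)}
    over = sum(rank[p] > rank[g] for g, p in zip(gold, pred))
    return over / len(gold)

# Illustrative labels: the model overshoots on three of four statements.
gold = ["low", "medium", "low", "high"]
pred = ["high", "high", "medium", "high"]
print(overconfidence_rate(gold, pred))  # 0.75
```

Reporting this rate separately for low- and medium-confidence gold labels would surface exactly the pattern the paper describes.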
arXiv Detail & Related papers (2023-11-28T10:26:57Z)
- Assessing Large Language Models on Climate Information [5.034118180129635]
We present a comprehensive evaluation framework grounded in science communication research to assess Large Language Models (LLMs).
Our framework emphasizes both the presentation and the adequacy of responses, offering a fine-grained analysis of LLM generations spanning 8 dimensions and 30 issues.
We introduce a novel protocol for scalable oversight that relies on AI Assistance and raters with relevant education.
arXiv Detail & Related papers (2023-10-04T16:09:48Z)
- CLIMATE-FEVER: A Dataset for Verification of Real-World Climate Claims [4.574830585715129]
We introduce CLIMATE-FEVER, a new dataset for verification of climate change-related claims.
We adapt the methodology of FEVER [1], the largest dataset of artificially designed claims, to real-life claims collected from the Internet.
We discuss the surprising, subtle complexity of modeling real-world climate-related claims within the FEVER framework.
arXiv Detail & Related papers (2020-12-01T16:32:54Z)
- Analyzing Sustainability Reports Using Natural Language Processing [68.8204255655161]
In recent years, companies have increasingly been aiming to both mitigate their environmental impact and adapt to the changing climate context.
These efforts are documented in increasingly exhaustive reports, which cover many types of climate risks and exposures under the umbrella of Environmental, Social, and Governance (ESG).
We present this tool and the methodology that we used to develop it in the present article.
arXiv Detail & Related papers (2020-11-03T21:22:42Z)
- Dynamical Landscape and Multistability of a Climate Model [64.467612647225]
We find a third intermediate stable state in one of the two climate models we consider.
The combination of our approaches allows us to identify how the negative feedback of ocean heat transport and entropy production drastically change the topography of Earth's climate.
arXiv Detail & Related papers (2020-10-20T15:31:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.