Enhancing LLMs for Governance with Human Oversight: Evaluating and Aligning LLMs on Expert Classification of Climate Misinformation for Detecting False or Misleading Claims about Climate Change
- URL: http://arxiv.org/abs/2501.13802v1
- Date: Thu, 23 Jan 2025 16:21:15 GMT
- Title: Enhancing LLMs for Governance with Human Oversight: Evaluating and Aligning LLMs on Expert Classification of Climate Misinformation for Detecting False or Misleading Claims about Climate Change
- Authors: Mowafak Allaham, Ayse D. Lokmanoglu, Sol P. Hart, Erik C. Nisbet,
- Abstract summary: Climate misinformation is a problem that has the potential to be substantially aggravated by the development of Large Language Models (LLMs)
In this study we evaluate the potential for LLMs to be part of the solution for mitigating online dis/misinformation rather than the problem.
- Score: 0.0
- License:
- Abstract: Climate misinformation is a problem that has the potential to be substantially aggravated by the development of Large Language Models (LLMs). In this study we evaluate the potential for LLMs to be part of the solution for mitigating online dis/misinformation rather than the problem. Employing a public expert annotated dataset and a curated sample of social media content we evaluate the performance of proprietary vs. open source LLMs on climate misinformation classification task, comparing them to existing climate-focused computer-assisted tools and expert assessments. Results show (1) state-of-the-art (SOTA) open-source models substantially under-perform in classifying climate misinformation compared to proprietary models, (2) existing climate-focused computer-assisted tools leveraging expert-annotated datasets continues to outperform many of proprietary models, including GPT-4o, and (3) demonstrate the efficacy and generalizability of fine-tuning GPT-3.5-turbo on expert annotated dataset in classifying claims about climate change at the equivalency of climate change experts with over 20 years of experience in climate communication. These findings highlight 1) the importance of incorporating human-oversight, such as incorporating expert-annotated datasets in training LLMs, for governance tasks that require subject-matter expertise like classifying climate misinformation, and 2) the potential for LLMs in facilitating civil society organizations to engage in various governance tasks such as classifying false or misleading claims in domains beyond climate change such as politics and health science.
Related papers
- Exploring Large Language Models for Climate Forecasting [5.25781442142288]
Large language models (LLMs) present a promising approach to bridging the gap between complex climate data and the general public.
This study investigates the capability of GPT-4 in predicting rainfall at short-term (15-day) and long-term (12-month) scales.
arXiv Detail & Related papers (2024-11-20T21:58:19Z) - Evaluating Cultural and Social Awareness of LLM Web Agents [113.49968423990616]
We introduce CASA, a benchmark designed to assess large language models' sensitivity to cultural and social norms.
Our approach evaluates LLM agents' ability to detect and appropriately respond to norm-violating user queries and observations.
Experiments show that current LLMs perform significantly better in non-agent environments.
arXiv Detail & Related papers (2024-10-30T17:35:44Z) - Unlearning Climate Misinformation in Large Language Models [17.95497650321137]
Misinformation regarding climate change is a key roadblock in addressing one of the most serious threats to humanity.
This paper investigates factual accuracy in large language models (LLMs) regarding climate information.
arXiv Detail & Related papers (2024-05-29T23:11:53Z) - KIEval: A Knowledge-grounded Interactive Evaluation Framework for Large Language Models [53.84677081899392]
KIEval is a Knowledge-grounded Interactive Evaluation framework for large language models.
It incorporates an LLM-powered "interactor" role for the first time to accomplish a dynamic contamination-resilient evaluation.
Extensive experiments on seven leading LLMs across five datasets validate KIEval's effectiveness and generalization.
arXiv Detail & Related papers (2024-02-23T01:30:39Z) - Climate Change from Large Language Models [7.190384101545232]
Climate change poses grave challenges, demanding widespread understanding and low-carbon lifestyle awareness.
Large language models (LLMs) offer a powerful tool to address this crisis.
This paper proposes an automated evaluation framework to assess climate-crisis knowledge.
arXiv Detail & Related papers (2023-12-19T09:26:46Z) - Arabic Mini-ClimateGPT : A Climate Change and Sustainability Tailored
Arabic LLM [77.17254959695218]
Large Language Models (LLMs) like ChatGPT and Bard have shown impressive conversational abilities and excel in a wide variety of NLP tasks.
We propose a light-weight Arabic Mini-ClimateGPT that is built on an open-source LLM and is specifically fine-tuned on a conversational-style instruction tuning Arabic dataset Clima500-Instruct.
Our model surpasses the baseline LLM in 88.3% of cases during ChatGPT-based evaluation.
arXiv Detail & Related papers (2023-12-14T22:04:07Z) - ClimateX: Do LLMs Accurately Assess Human Expert Confidence in Climate
Statements? [0.0]
We introduce the Expert Confidence in Climate Statements (ClimateX) dataset, a novel, curated, expert-labeled dataset consisting of 8094 climate statements.
Using this dataset, we show that recent Large Language Models (LLMs) can classify human expert confidence in climate-related statements.
Overall, models exhibit consistent and significant over-confidence on low and medium confidence statements.
arXiv Detail & Related papers (2023-11-28T10:26:57Z) - Bias and Fairness in Large Language Models: A Survey [73.87651986156006]
We present a comprehensive survey of bias evaluation and mitigation techniques for large language models (LLMs)
We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing.
We then unify the literature by proposing three intuitive, two for bias evaluation, and one for mitigation.
arXiv Detail & Related papers (2023-09-02T00:32:55Z) - On the Risk of Misinformation Pollution with Large Language Models [127.1107824751703]
We investigate the potential misuse of modern Large Language Models (LLMs) for generating credible-sounding misinformation.
Our study reveals that LLMs can act as effective misinformation generators, leading to a significant degradation in the performance of Open-Domain Question Answering (ODQA) systems.
arXiv Detail & Related papers (2023-05-23T04:10:26Z) - Enhancing Large Language Models with Climate Resources [5.2677629053588895]
Large language models (LLMs) have transformed the landscape of artificial intelligence by demonstrating their ability in generating human-like text.
However, they often employ imprecise language, which can be detrimental in domains where accuracy is crucial, such as climate change.
In this study, we make use of recent ideas to harness the potential of LLMs by viewing them as agents that access multiple sources.
We demonstrate the effectiveness of our method through a prototype agent that retrieves emission data from ClimateWatch.
arXiv Detail & Related papers (2023-03-31T20:24:14Z) - Analyzing Sustainability Reports Using Natural Language Processing [68.8204255655161]
In recent years, companies have increasingly been aiming to both mitigate their environmental impact and adapt to the changing climate context.
This is reported via increasingly exhaustive reports, which cover many types of climate risks and exposures under the umbrella of Environmental, Social, and Governance (ESG)
We present this tool and the methodology that we used to develop it in the present article.
arXiv Detail & Related papers (2020-11-03T21:22:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.