Related papers: Leveraging Language Models to Detect Greenwashing

Leveraging Language Models to Detect Greenwashing

URL: http://arxiv.org/abs/2311.01469v1
Date: Mon, 30 Oct 2023 21:41:49 GMT
Title: Leveraging Language Models to Detect Greenwashing
Authors: Avalon Vinella, Margaret Capetz, Rebecca Pattichis, Christina Chance, and Reshmi Ghosh
Abstract summary: We introduce a novel methodology to train a language model on generated labels for greenwashing risk. Our primary contributions encompass: developing a mathematical formulation to quantify greenwashing risk, and a fine-tuned ClimateBERT model for this problem. On a test set comprising of sustainability reports, our best model achieved an average accuracy score of 86.34% and F1 score of 0.67.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In recent years, climate change repercussions have increasingly captured public interest. Consequently, corporations are emphasizing their environmental efforts in sustainability reports to bolster their public image. Yet, the absence of stringent regulations in review of such reports allows potential greenwashing. In this study, we introduce a novel methodology to train a language model on generated labels for greenwashing risk. Our primary contributions encompass: developing a mathematical formulation to quantify greenwashing risk, a fine-tuned ClimateBERT model for this problem, and a comparative analysis of results. On a test set comprising of sustainability reports, our best model achieved an average accuracy score of 86.34% and F1 score of 0.67, demonstrating that our methods show a promising direction of exploration for this task.

Related papers

Corporate Greenwashing Detection in Text - a Survey [5.958302080525902]
Greenwashing is an effort to mislead the public about the environmental impact of an entity, such as a state or company. We provide a comprehensive survey of the scientific literature addressing natural language processing methods to identify greenwashing.
arXiv Detail & Related papers (2025-02-11T13:28:56Z)
Thinking Racial Bias in Fair Forgery Detection: Models, Datasets and Evaluations [63.52709761339949]
We first contribute a dedicated dataset called the Fair Forgery Detection (FairFD) dataset, where we prove the racial bias of public state-of-the-art (SOTA) methods. We design novel metrics including Approach Averaged Metric and Utility Regularized Metric, which can avoid deceptive results. We also present an effective and robust post-processing technique, Bias Pruning with Fair Activations (BPFA), which improves fairness without requiring retraining or weight updates.
arXiv Detail & Related papers (2024-07-19T14:53:18Z)
EcoVerse: An Annotated Twitter Dataset for Eco-Relevance Classification, Environmental Impact Analysis, and Stance Detection [0.0]
EcoVerse is an annotated English Twitter dataset of 3,023 tweets spanning a wide spectrum of environmental topics. We propose a three-level annotation scheme designed for Eco-Relevance Classification, Stance Detection, and introducing an original approach for Environmental Impact Analysis.
arXiv Detail & Related papers (2024-04-08T01:21:11Z)
Long-term drought prediction using deep neural networks based on geospatial weather data [75.38539438000072]
High-quality drought forecasting up to a year in advance is critical for agriculture planning and insurance. We tackle drought data by introducing an end-to-end approach that adopts a systematic end-to-end approach. Key findings are the exceptional performance of a Transformer model, EarthFormer, in making accurate short-term (up to six months) forecasts.
arXiv Detail & Related papers (2023-09-12T13:28:06Z)
GEO-Bench: Toward Foundation Models for Earth Monitoring [139.77907168809085]
We propose a benchmark comprised of six classification and six segmentation tasks. This benchmark will be a driver of progress across a variety of Earth monitoring tasks.
arXiv Detail & Related papers (2023-06-06T16:16:05Z)
Counting Carbon: A Survey of Factors Influencing the Emissions of Machine Learning [77.62876532784759]
Machine learning (ML) requires using energy to carry out computations during the model training process. The generation of this energy comes with an environmental cost in terms of greenhouse gas emissions, depending on quantity used and the energy source. We present a survey of the carbon emissions of 95 ML models across time and different tasks in natural language processing and computer vision.
arXiv Detail & Related papers (2023-02-16T18:35:00Z)
Analysis of Biomass Sustainability Indicators from a Machine Learning Perspective [4.129067364486898]
This study proposes a robust model for biomass sustainability prediction by analyzing sustainability indicators using machine learning models. Ten machine learning models were analyzed to estimate three biomass sustainability indicators, namely soil erosion factor, soil conditioning index, and organic matter factor. The results showed that Random Forest was the best performing model to assess sustainability indicators.
arXiv Detail & Related papers (2023-02-02T02:31:42Z)
Environmental Claim Detection [6.2887102994549595]
This paper introduces the task of environmental claim detection. We release an expert-annotated dataset and models trained on this dataset. We find that the number of environmental claims has steadily increased since the Paris Agreement in 2015.
arXiv Detail & Related papers (2022-09-01T14:51:07Z)
Analyzing Sustainability Reports Using Natural Language Processing [68.8204255655161]
In recent years, companies have increasingly been aiming to both mitigate their environmental impact and adapt to the changing climate context. This is reported via increasingly exhaustive reports, which cover many types of climate risks and exposures under the umbrella of Environmental, Social, and Governance (ESG) We present this tool and the methodology that we used to develop it in the present article.
arXiv Detail & Related papers (2020-11-03T21:22:42Z)
Crop Yield Prediction Integrating Genotype and Weather Variables Using Deep Learning [8.786816847837976]
We use historical performance records from Uniform Soybean Tests (UST) in North America spanning 13 years of data to build a Long Short Term Memory - Recurrent Neural Network based model to dissect and predict genotype response in multiple environments. We deploy this deep learning framework as a 'hypotheses generation tool' to unravel GxExM relationships. We envision broad applicability of this approach (via conducting sensitivity analysis and "what-if" scenarios) for soybean and other crop species under different climatic conditions.
arXiv Detail & Related papers (2020-06-24T16:20:12Z)

This list is automatically generated from the titles and abstracts of the papers in this site.