Leveraging Language Models to Detect Greenwashing
- URL: http://arxiv.org/abs/2311.01469v1
- Date: Mon, 30 Oct 2023 21:41:49 GMT
- Title: Leveraging Language Models to Detect Greenwashing
- Authors: Avalon Vinella, Margaret Capetz, Rebecca Pattichis, Christina Chance, and Reshmi Ghosh
- Abstract summary: We introduce a novel methodology to train a language model on generated labels for greenwashing risk.
Our primary contributions are a mathematical formulation to quantify greenwashing risk and a fine-tuned ClimateBERT model for this problem.
On a test set comprising sustainability reports, our best model achieved an average accuracy of 86.34% and an F1 score of 0.67.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, climate change repercussions have increasingly captured
public interest. Consequently, corporations are emphasizing their environmental
efforts in sustainability reports to bolster their public image. Yet, the
absence of stringent regulations in review of such reports allows potential
greenwashing. In this study, we introduce a novel methodology to train a
language model on generated labels for greenwashing risk. Our primary
contributions are a mathematical formulation to quantify greenwashing risk,
a fine-tuned ClimateBERT model for this problem, and a comparative analysis
of results. On a test set comprising sustainability reports, our best model
achieved an average accuracy of 86.34% and an F1 score of 0.67, suggesting
that our methods are a promising direction of exploration for this task.
Related papers
- EcoVerse: An Annotated Twitter Dataset for Eco-Relevance Classification, Environmental Impact Analysis, and Stance Detection [0.0]
EcoVerse is an annotated English Twitter dataset of 3,023 tweets spanning a wide spectrum of environmental topics.
We propose a three-level annotation scheme covering Eco-Relevance Classification, Stance Detection, and an original approach to Environmental Impact Analysis.
arXiv Detail & Related papers (2024-04-08T01:21:11Z)
- A Comparative Study of Machine Learning Algorithms for Anomaly Detection in Industrial Environments: Performance and Environmental Impact [62.997667081978825]
This study seeks to reconcile the demands of high-performance machine learning models with environmental sustainability.
Traditional machine learning algorithms, such as Decision Trees and Random Forests, demonstrate robust efficiency and performance.
However, superior outcomes were obtained with optimised configurations, albeit with a commensurate increase in resource consumption.
arXiv Detail & Related papers (2023-07-01T15:18:00Z)
- GEO-Bench: Toward Foundation Models for Earth Monitoring [139.77907168809085]
We propose a benchmark comprised of six classification and six segmentation tasks.
This benchmark will be a driver of progress across a variety of Earth monitoring tasks.
arXiv Detail & Related papers (2023-06-06T16:16:05Z)
- Counting Carbon: A Survey of Factors Influencing the Emissions of Machine Learning [77.62876532784759]
Machine learning (ML) requires using energy to carry out computations during the model training process.
The generation of this energy comes with an environmental cost in terms of greenhouse gas emissions, depending on quantity used and the energy source.
We present a survey of the carbon emissions of 95 ML models across time and different tasks in natural language processing and computer vision.
arXiv Detail & Related papers (2023-02-16T18:35:00Z)
- Analysis of Biomass Sustainability Indicators from a Machine Learning Perspective [4.129067364486898]
This study proposes a robust model for biomass sustainability prediction by analyzing sustainability indicators using machine learning models.
Ten machine learning models were analyzed to estimate three biomass sustainability indicators, namely soil erosion factor, soil conditioning index, and organic matter factor.
The results showed that Random Forest was the best performing model to assess sustainability indicators.
arXiv Detail & Related papers (2023-02-02T02:31:42Z)
- mFACE: Multilingual Summarization with Factual Consistency Evaluation [79.60172087719356]
Abstractive summarization has enjoyed renewed interest in recent years, thanks to pre-trained language models and the availability of large-scale datasets.
Despite promising results, current models still suffer from generating factually inconsistent summaries.
We leverage factual consistency evaluation models to improve multilingual summarization.
arXiv Detail & Related papers (2022-12-20T19:52:41Z)
- Environmental Claim Detection [6.2887102994549595]
This paper introduces the task of environmental claim detection.
We release an expert-annotated dataset and models trained on this dataset.
We find that the number of environmental claims has steadily increased since the Paris Agreement in 2015.
arXiv Detail & Related papers (2022-09-01T14:51:07Z)
- Measuring Wind Turbine Health Using Drifting Concepts [55.87342698167776]
We propose two new approaches for the analysis of wind turbine health.
The first method aims at evaluating the decrease or increase in relatively high and low power production.
The second method evaluates the overall drift of the extracted concepts.
arXiv Detail & Related papers (2021-12-09T14:04:55Z)
- Analyzing Sustainability Reports Using Natural Language Processing [68.8204255655161]
In recent years, companies have increasingly been aiming to both mitigate their environmental impact and adapt to the changing climate context.
This is reported via increasingly exhaustive reports, which cover many types of climate risks and exposures under the umbrella of Environmental, Social, and Governance (ESG).
We present this tool and the methodology that we used to develop it in the present article.
arXiv Detail & Related papers (2020-11-03T21:22:42Z)
- Crop Yield Prediction Integrating Genotype and Weather Variables Using Deep Learning [8.786816847837976]
We use historical performance records from Uniform Soybean Tests (UST) in North America spanning 13 years of data to build a Long Short Term Memory - Recurrent Neural Network based model to dissect and predict genotype response in multiple environments.
We deploy this deep learning framework as a 'hypotheses generation tool' to unravel GxExM relationships.
We envision broad applicability of this approach (via conducting sensitivity analysis and "what-if" scenarios) for soybean and other crop species under different climatic conditions.
arXiv Detail & Related papers (2020-06-24T16:20:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.