Fine-tuning ClimateBert transformer with ClimaText for the disclosure
analysis of climate-related financial risks
- URL: http://arxiv.org/abs/2303.13373v1
- Date: Tue, 21 Mar 2023 07:25:36 GMT
- Title: Fine-tuning ClimateBert transformer with ClimaText for the disclosure
analysis of climate-related financial risks
- Authors: Eduardo C. Garrido-Merchán, Cristina González-Barthe, María
Coronado Vaca
- Abstract summary: This paper applies state-of-the-art NLP techniques to detect climate-change-related content in text corpora.
We use transfer learning to fine-tune two transformer models, BERT and ClimateBert.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years there has been a growing demand from financial agents,
especially from individual and institutional investors, for companies to report
on climate-related financial risks. Firms can be expected to disclose a vast
amount of textual information in the short term in order to identify these
types of risks in their financial and non-financial reports, particularly in
response to the growing regulation being passed on the matter. To this end,
this paper applies state-of-the-art NLP techniques to detect
climate-change-related content in text corpora. We use transfer
learning to fine-tune two transformer models, BERT and ClimateBert (a recently
published DistilRoBERTa-based model that has been specifically tailored for
climate text classification). Both models are based on the transformer
architecture, which enables learning the contextual relationships between words
in a text. We fine-tune both models on the novel ClimaText database, which
consists of data collected from Wikipedia, 10-K reports and web-based claims.
Our text classification model obtained from the
ClimateBert fine-tuning process on ClimaText outperforms the models created
with BERT and the current state-of-the-art transformer on this particular
problem. Our study is the first to apply the recently published ClimateBert
model to the ClimaText database. Based on our results, ClimateBert fine-tuned
on ClimaText is an outstanding tool among pre-trained NLP transformer models,
one that may and should be used by investors, institutional agents and
companies themselves to monitor the disclosure of climate risk in financial
reports. In addition, our transfer learning methodology is computationally
cheap, so any organization can perform it.
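The fine-tuning setup described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' code: it assumes the Hugging Face `transformers` and `torch` packages, the checkpoint name `climatebert/distilroberta-base-climate-f` (the abstract does not say which ClimateBert variant was used), and placeholder hyperparameters. The `batches` and `fine_tune` helpers and the `Example` type are hypothetical names introduced here.

```python
from typing import Iterator, List, Sequence, Tuple

# Assumed checkpoint name; the abstract does not name the exact variant.
CHECKPOINT = "climatebert/distilroberta-base-climate-f"

# (sentence, label) pairs: 1 = climate-related, 0 = not climate-related.
Example = Tuple[str, int]


def batches(examples: Sequence[Example], size: int) -> Iterator[List[Example]]:
    """Yield consecutive mini-batches of at most `size` examples."""
    for start in range(0, len(examples), size):
        yield list(examples[start:start + size])


def fine_tune(examples: Sequence[Example], epochs: int = 1,
              batch_size: int = 16, lr: float = 2e-5):
    """Fine-tune a pre-trained encoder with a new 2-class head.

    Hyperparameter defaults are illustrative assumptions, not values
    reported in the paper.
    """
    # Imported lazily so that `batches` works without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
    model = AutoModelForSequenceClassification.from_pretrained(
        CHECKPOINT, num_labels=2
    )
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)

    model.train()
    for _ in range(epochs):
        for batch in batches(examples, batch_size):
            texts = [text for text, _ in batch]
            labels = torch.tensor([label for _, label in batch])
            inputs = tokenizer(texts, padding=True, truncation=True,
                               return_tensors="pt")
            # Passing labels makes the model return a cross-entropy loss.
            loss = model(**inputs, labels=labels).loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
    return tokenizer, model
```

Calling `fine_tune([("Sea levels are rising.", 1), ("The match ended 2-1.", 0)])` would download the checkpoint and run one pass over the toy data. A BERT baseline could be obtained the same way by pointing `CHECKPOINT` at a generic BERT checkpoint.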
Related papers
- ClimaQA: An Automated Evaluation Framework for Climate Foundation Models [38.05357439484919]
We develop ClimaGen, an automated framework that generates question-answer pairs from graduate textbooks with climate scientists in the loop.
We present ClimaQA-Gold, an expert-annotated benchmark dataset alongside ClimaQA-Silver, a large-scale, comprehensive synthetic QA dataset for climate science.
arXiv Detail & Related papers (2024-10-22T05:12:19Z)
- Machine Learning for Methane Detection and Quantification from Space -- A survey [49.7996292123687]
Methane (CH4) is a potent anthropogenic greenhouse gas, contributing 86 times more to global warming than carbon dioxide (CO2) over 20 years.
This work expands existing information on operational methane point source detection sensors in the Short-Wave Infrared (SWIR) bands.
It reviews the state-of-the-art for traditional as well as Machine Learning (ML) approaches.
arXiv Detail & Related papers (2024-08-27T15:03:20Z)
- Comparing Data-Driven and Mechanistic Models for Predicting Phenology in Deciduous Broadleaf Forests [47.285748922842444]
We train a deep neural network to predict a phenological index from meteorological time series.
We find that this approach outperforms traditional process-based models.
arXiv Detail & Related papers (2024-01-08T15:29:23Z)
- Arabic Mini-ClimateGPT: A Climate Change and Sustainability Tailored Arabic LLM [77.17254959695218]
Large Language Models (LLMs) like ChatGPT and Bard have shown impressive conversational abilities and excel in a wide variety of NLP tasks.
We propose a light-weight Arabic Mini-ClimateGPT that is built on an open-source LLM and is specifically fine-tuned on a conversational-style instruction tuning Arabic dataset Clima500-Instruct.
Our model surpasses the baseline LLM in 88.3% of cases during ChatGPT-based evaluation.
arXiv Detail & Related papers (2023-12-14T22:04:07Z)
- ClimaX: A foundation model for weather and climate [51.208269971019504]
ClimaX is a deep learning model for weather and climate science.
It can be pre-trained with a self-supervised learning objective on climate datasets.
It can be fine-tuned to address a breadth of climate and weather tasks.
arXiv Detail & Related papers (2023-01-24T23:19:01Z)
- Towards Answering Climate Questionnaires from Unstructured Climate Reports [26.036105166376284]
Activists and policymakers need NLP tools to process the vast and rapidly growing unstructured textual climate reports into structured form.
We introduce two new large-scale climate questionnaire datasets and use their existing structure to train self-supervised models.
We then use these models to help align texts from unstructured climate documents to the semi-structured questionnaires in a human pilot study.
arXiv Detail & Related papers (2023-01-11T00:22:56Z)
- Multi-scale Digital Twin: Developing a fast and physics-informed surrogate model for groundwater contamination with uncertain climate models [53.44486283038738]
Climate change exacerbates the long-term soil management problem of groundwater contamination.
We develop a physics-informed machine learning surrogate model using a U-Net enhanced Fourier Neural Operator (U-FNO).
In parallel, we develop a convolutional autoencoder combined with climate data to reduce the dimensionality of climatic region similarities across the United States.
arXiv Detail & Related papers (2022-11-20T06:46:35Z)
- Climate-Invariant Machine Learning [0.8831201550856289]
Current climate models require representations of processes that occur at scales smaller than model grid size.
Recent machine learning (ML) algorithms hold promise to improve such process representations, but tend to extrapolate poorly to climate regimes they were not trained on.
We propose a new framework - termed "climate-invariant" ML - incorporating knowledge of climate processes into ML algorithms.
arXiv Detail & Related papers (2021-12-14T07:02:57Z)
- CLIMATE-FEVER: A Dataset for Verification of Real-World Climate Claims [4.574830585715129]
We introduce CLIMATE-FEVER, a new dataset for verification of climate change-related claims.
We adapt the methodology of FEVER [1], the largest dataset of artificially designed claims, to real-life claims collected from the Internet.
We discuss the surprising, subtle complexity of modeling real-world climate-related claims within the FEVER framework.
arXiv Detail & Related papers (2020-12-01T16:32:54Z)
- Analyzing Sustainability Reports Using Natural Language Processing [68.8204255655161]
In recent years, companies have increasingly been aiming to both mitigate their environmental impact and adapt to the changing climate context.
This is reported via increasingly exhaustive reports, which cover many types of climate risks and exposures under the umbrella of Environmental, Social, and Governance (ESG).
We present this tool and the methodology that we used to develop it in the present article.
arXiv Detail & Related papers (2020-11-03T21:22:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.