Glitter or Gold? Deriving Structured Insights from Sustainability
Reports via Large Language Models
- URL: http://arxiv.org/abs/2310.05628v3
- Date: Tue, 16 Jan 2024 14:02:07 GMT
- Title: Glitter or Gold? Deriving Structured Insights from Sustainability
Reports via Large Language Models
- Authors: Marco Bronzini, Carlo Nicolini, Bruno Lepri, Andrea Passerini, Jacopo
Staiano
- Abstract summary: This study uses Information Extraction (IE) methods to extract structured insights related to ESG aspects from companies' sustainability reports.
We then leverage graph-based representations to conduct statistical analyses concerning the extracted insights.
- Score: 16.231171704561714
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Over the last decade, several regulatory bodies have started requiring the
disclosure of non-financial information from publicly listed companies, in
light of the investors' increasing attention to Environmental, Social, and
Governance (ESG) issues. Publicly released information on sustainability
practices is often disclosed in diverse, unstructured, and multi-modal
documentation. This poses a challenge in efficiently gathering and aligning the
data into a unified framework to derive insights related to Corporate Social
Responsibility (CSR). Thus, using Information Extraction (IE) methods becomes
an intuitive choice for delivering insightful and actionable data to
stakeholders. In this study, we employ Large Language Models (LLMs), In-Context
Learning, and the Retrieval-Augmented Generation (RAG) paradigm to extract
structured insights related to ESG aspects from companies' sustainability
reports. We then leverage graph-based representations to conduct statistical
analyses concerning the extracted insights. These analyses revealed that ESG
criteria cover a wide range of topics, exceeding 500, often beyond those
considered in existing categorizations, and are addressed by companies through
a variety of initiatives. Moreover, disclosure similarities emerged among
companies from the same region or sector, validating ongoing hypotheses in the
ESG literature. Lastly, by incorporating additional company attributes into our
analyses, we investigated which factors most strongly affect companies' ESG
ratings, showing that ESG disclosure affects the obtained ratings more than
other financial or company data.
Related papers
- Trustworthiness in Retrieval-Augmented Generation Systems: A Survey [59.26328612791924]
Retrieval-Augmented Generation (RAG) has quickly grown into a pivotal paradigm in the development of Large Language Models (LLMs).
We propose a unified framework that assesses the trustworthiness of RAG systems across six key dimensions: factuality, robustness, fairness, transparency, accountability, and privacy.
arXiv Detail & Related papers (2024-09-16T09:06:44Z)
- On the Societal Impact of Open Foundation Models [93.67389739906561]
We focus on open foundation models, defined here as those with broadly available model weights.
We identify five distinctive properties of open foundation models that lead to both their benefits and risks.
arXiv Detail & Related papers (2024-02-27T16:49:53Z)
- CRUD-RAG: A Comprehensive Chinese Benchmark for Retrieval-Augmented Generation of Large Language Models [49.16989035566899]
Retrieval-Augmented Generation (RAG) is a technique that enhances the capabilities of large language models (LLMs) by incorporating external knowledge sources.
This paper constructs a large-scale and more comprehensive benchmark, and evaluates all the components of RAG systems in various RAG application scenarios.
arXiv Detail & Related papers (2024-01-30T14:25:32Z)
- ESGReveal: An LLM-based approach for extracting structured data from ESG reports [5.467389155759699]
ESGReveal is an innovative method proposed for efficiently extracting and analyzing Environmental, Social, and Governance (ESG) data from corporate reports.
This approach utilizes Large Language Models (LLM) enhanced with Retrieval Augmented Generation (RAG) techniques.
Its efficacy was appraised using ESG reports from 166 companies across various sectors listed on the Hong Kong Stock Exchange in 2022.
arXiv Detail & Related papers (2023-12-25T06:44:32Z)
- Modeling the Evolutionary Trends in Corporate ESG Reporting: A Study based on Knowledge Management Model [0.08999666725996973]
We analyzed 1114 ESG reports from firms in the technology industry to analyze the evolutionary trends of ESG topics by text mining.
We discovered the homogenization effect towards low environmental, medium governance, and high social features in the evolution.
We found that companies are gradually converging towards the third quadrant, which indicates that firms contribute less to industrial outstanding and professional distinctiveness in ESG reporting.
arXiv Detail & Related papers (2023-09-13T14:54:51Z)
- Paradigm Shift in Sustainability Disclosure Analysis: Empowering Stakeholders with CHATREPORT, a Language Model-Based Tool [10.653984116770234]
This paper introduces a novel approach to enhance Large Language Models (LLMs) with expert knowledge to automate the analysis of corporate sustainability reports.
We christen our tool CHATREPORT, and apply it in a first use case to assess corporate climate risk disclosures.
arXiv Detail & Related papers (2023-06-27T14:46:47Z)
- Sentiment Analysis in the Era of Large Language Models: A Reality Check [69.97942065617664]
This paper investigates the capabilities of large language models (LLMs) in performing various sentiment analysis tasks.
We evaluate performance across 13 tasks on 26 datasets and compare the results against small language models (SLMs) trained on domain-specific datasets.
arXiv Detail & Related papers (2023-05-24T10:45:25Z)
- Predicting Companies' ESG Ratings from News Articles Using Multivariate Timeseries Analysis [17.332692582748408]
We build a model to predict ESG ratings from news articles using the combination of multivariate timeseries construction and deep learning techniques.
A news dataset for about 3,000 US companies together with their ratings is also created and released for training.
Our approach provides accurate results outperforming the state-of-the-art, and can be used in practice to support a manual determination or analysis of ESG ratings.
arXiv Detail & Related papers (2022-11-13T11:23:02Z)
- GENEVA: Benchmarking Generalizability for Event Argument Extraction with Hundreds of Event Types and Argument Roles [77.05288144035056]
Event Argument Extraction (EAE) has focused on improving model generalizability to cater to new events and domains.
Standard benchmarking datasets like ACE and ERE cover fewer than 40 event types and 25 entity-centric argument roles.
arXiv Detail & Related papers (2022-05-25T05:46:28Z)
- Survey of Aspect-based Sentiment Analysis Datasets [55.61047894397937]
Aspect-based sentiment analysis (ABSA) is a natural language processing problem that requires analyzing user-generated reviews.
Numerous yet scattered corpora for ABSA make it difficult for researchers to identify corpora best suited for a specific ABSA subtask quickly.
This study aims to present a database of corpora that can be used to train and assess autonomous ABSA systems.
arXiv Detail & Related papers (2022-04-11T16:23:36Z)
- Heterogeneous Ensemble for ESG Ratings Prediction [1.9659095632676094]
Investors rely on specialized rating agencies that issue ratings along the environmental, social and governance dimensions.
Rating agencies base their analysis on subjective assessment of sustainability reports, not provided by every company.
We propose a heterogeneous ensemble model to predict ESG ratings using fundamental data.
arXiv Detail & Related papers (2021-09-21T10:42:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.