ESGBERT: Language Model to Help with Classification Tasks Related to
Companies Environmental, Social, and Governance Practices
- URL: http://arxiv.org/abs/2203.16788v1
- Date: Thu, 31 Mar 2022 04:22:44 GMT
- Title: ESGBERT: Language Model to Help with Classification Tasks Related to
Companies Environmental, Social, and Governance Practices
- Authors: Srishti Mehra, Robert Louka, Yixun Zhang
- Abstract summary: Non-financial factors such as environmental, social, and governance (ESG) practices are attracting attention from investors.
We see a need for sophisticated NLP techniques for classification tasks on ESG text.
We explore doing this by fine-tuning BERT's pre-trained weights using ESG-specific text and then further fine-tuning the model for a classification task.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Environmental, Social, and Governance (ESG) are non-financial factors that
are garnering attention from investors as they increasingly look to apply these
as part of their analysis to identify material risks and growth opportunities.
Some of this attention is also driven by clients who, now more aware than ever,
are demanding that their money be managed and invested responsibly. As the
interest in ESG grows, so does the need for investors to have access to
consumable ESG information. Since most of it is in text form in reports,
disclosures, press releases, and 10-Q filings, we see a need for sophisticated
NLP techniques for classification tasks on ESG text. We hypothesize that an
ESG domain-specific pre-trained model will help with such tasks, and we study
how to build one in this paper. We explored doing this by fine-tuning BERT's
pre-trained weights using ESG-specific text and then further fine-tuning the
model for a classification task. We were able to achieve better accuracy than
the original BERT and baseline models on environment-specific classification
tasks.
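To make the two-stage recipe above concrete, here is a minimal sketch of one plausible implementation using the Hugging Face Transformers and Datasets libraries: continue BERT's masked-language-model pre-training on an ESG corpus, then fine-tune the adapted encoder for a classification task. The file names (esg_corpus.txt, esg_train.csv), hyperparameters, and two-label setup are illustrative assumptions, not the authors' exact configuration.

```python
# A minimal sketch, assuming Hugging Face Transformers/Datasets, of the
# two-stage approach described in the abstract:
#   (1) continue masked-language-model pre-training of BERT on ESG text,
#   (2) fine-tune the adapted encoder for a classification task.
# File names, hyperparameters, and the two-label setup are illustrative
# assumptions, not the authors' exact configuration.
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoModelForSequenceClassification,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    DataCollatorWithPadding,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Stage 1: domain-adaptive MLM pre-training on an ESG corpus (one passage per line).
esg_corpus = load_dataset("text", data_files={"train": "esg_corpus.txt"})  # hypothetical file
mlm_data = esg_corpus.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=256),
    batched=True,
    remove_columns=["text"],
)
mlm_model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
Trainer(
    model=mlm_model,
    args=TrainingArguments(output_dir="esgbert-mlm", num_train_epochs=1),
    train_dataset=mlm_data["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15),
).train()
mlm_model.save_pretrained("esgbert-mlm")
tokenizer.save_pretrained("esgbert-mlm")

# Stage 2: fine-tune the ESG-adapted encoder for classification.
# Assumes a CSV with "text" and integer "label" columns (e.g. environmental vs. not).
clf_corpus = load_dataset("csv", data_files={"train": "esg_train.csv"})  # hypothetical file
clf_data = clf_corpus.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=256),
    batched=True,
)
clf_model = AutoModelForSequenceClassification.from_pretrained("esgbert-mlm", num_labels=2)
Trainer(
    model=clf_model,
    args=TrainingArguments(output_dir="esgbert-clf", num_train_epochs=3),
    train_dataset=clf_data["train"],
    data_collator=DataCollatorWithPadding(tokenizer),
).train()
```

The point of Stage 1 is that the Stage 2 classifier starts from an encoder already adapted to ESG vocabulary and phrasing, which is the hypothesized source of the accuracy gain over plain BERT.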
Related papers
- Evaluating the performance of state-of-the-art esg domain-specific pre-trained large language models in text classification against existing models and traditional machine learning techniques [0.0]
This research investigates the classification of Environmental, Social, and Governance (ESG) information within textual disclosures.
The aim is to develop and evaluate binary classification models capable of accurately identifying and categorizing E-, S-, and G-related content, respectively.
The motivation for this research stems from the growing importance of ESG considerations in investment decisions and corporate accountability.
arXiv Detail & Related papers (2024-09-30T20:08:32Z) - Measuring Sustainability Intention of ESG Fund Disclosure using Few-Shot Learning [1.1957520154275776]
This paper proposes a unique method and system to classify and score the fund prospectuses in the sustainable universe.
We employ few-shot learners to identify specific, ambiguous, and generic sustainable investment-related language.
We construct a ratio metric to determine language score and rating to rank products and quantify sustainability claims.
arXiv Detail & Related papers (2024-07-09T14:25:23Z) - ESG-FTSE: A corpus of news articles with ESG relevance labels and use cases [1.3937696730884712]
We present ESG-FTSE, the first corpus comprised of news articles with Environmental, Social and Governance (ESG) relevance annotations.
Growing interest in responsible investing has led to the rise of ESG scores that evaluate an investment's credentials as socially responsible.
Quantitative techniques can be applied to improve ESG scores and, thus, responsible investing.
arXiv Detail & Related papers (2024-05-30T16:19:02Z) - Exploiting Contextual Target Attributes for Target Sentiment
Classification [53.30511968323911]
Existing models for target sentiment classification (TSC) based on pre-trained language models (PTLMs) can be categorized into two groups: 1) fine-tuning-based models that adopt the PTLM as the context encoder; 2) prompting-based models that transfer the classification task to the text/word generation task.
We present a new perspective of leveraging PTLM for TSC: simultaneously leveraging the merits of both language modeling and explicit target-context interactions via contextual target attributes.
arXiv Detail & Related papers (2023-12-21T11:45:28Z) - Towards Verifiable Generation: A Benchmark for Knowledge-aware Language Model Attribution [48.86322922826514]
This paper defines a new task of Knowledge-aware Language Model Attribution (KaLMA)
First, we extend attribution source from unstructured texts to Knowledge Graph (KG), whose rich structures benefit both the attribution performance and working scenarios.
Second, we propose a new "Conscious Incompetence" setting considering the incomplete knowledge repository.
Third, we propose a comprehensive automatic evaluation metric encompassing text quality, citation quality, and text-citation alignment.
arXiv Detail & Related papers (2023-10-09T11:45:59Z) - Leveraging BERT Language Models for Multi-Lingual ESG Issue
Identification [0.30254881201174333]
Investors have increasingly recognized the significance of ESG criteria in their investment choices.
The Multi-Lingual ESG Issue Identification (ML-ESG) task encompasses the classification of news documents into 35 distinct ESG issue labels.
In this study, we explored multiple strategies harnessing BERT language models to achieve accurate classification of news documents across these labels.
arXiv Detail & Related papers (2023-09-05T12:48:21Z) - GPT-NER: Named Entity Recognition via Large Language Models [58.609582116612934]
GPT-NER transforms the sequence labeling task to a generation task that can be easily adapted by Language Models.
We find that GPT-NER exhibits a greater ability in the low-resource and few-shot setups, when the amount of training data is extremely scarce.
This demonstrates the capabilities of GPT-NER in real-world NER applications where the number of labeled examples is limited.
arXiv Detail & Related papers (2023-04-20T16:17:26Z) - Predicting Companies' ESG Ratings from News Articles Using Multivariate
Timeseries Analysis [17.332692582748408]
We build a model to predict ESG ratings from news articles using the combination of multivariate timeseries construction and deep learning techniques.
A news dataset for about 3,000 US companies together with their ratings is also created and released for training.
Our approach provides accurate results outperforming the state-of-the-art, and can be used in practice to support a manual determination or analysis of ESG ratings.
arXiv Detail & Related papers (2022-11-13T11:23:02Z) - Guiding Generative Language Models for Data Augmentation in Few-Shot
Text Classification [59.698811329287174]
We leverage GPT-2 for generating artificial training instances in order to improve classification performance.
Our results show that fine-tuning GPT-2 on a handful of labeled instances leads to consistent classification improvements; a sketch of this label-conditioned augmentation idea appears after this list.
arXiv Detail & Related papers (2021-11-17T12:10:03Z) - DAGA: Data Augmentation with a Generation Approach for Low-resource
Tagging Tasks [88.62288327934499]
We propose a novel augmentation method that uses language models trained on linearized labeled sentences.
Our method is applicable to both supervised and semi-supervised settings.
arXiv Detail & Related papers (2020-11-03T07:49:15Z) - Common Sense or World Knowledge? Investigating Adapter-Based Knowledge
Injection into Pretrained Transformers [54.417299589288184]
We investigate models for complementing the distributional knowledge of BERT with conceptual knowledge from ConceptNet and its corresponding Open Mind Common Sense (OMCS) corpus.
Our adapter-based models substantially outperform BERT on inference tasks that require the type of conceptual knowledge explicitly present in ConceptNet and OMCS.
arXiv Detail & Related papers (2020-05-24T15:49:57Z)
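As referenced in the few-shot data augmentation entry above, the following is a rough sketch of label-conditioned GPT-2 augmentation in that general spirit: fine-tune the model on a handful of linearized "label [SEP] text" examples, then sample new instances per label to enlarge a small training set. The prompt format, seed sentences, and decoding settings are assumptions for illustration, not the exact recipe of any paper listed.

```python
# A rough sketch, assuming Hugging Face Transformers/Datasets, of label-conditioned
# GPT-2 data augmentation in the spirit of the few-shot augmentation papers above.
# The "label [SEP] text" format, seed sentences, and decoding settings are
# assumptions, not those papers' exact recipes.
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

# A handful of labeled seed examples (hypothetical ESG sentences).
seed = [
    ("environmental", "The company cut scope 1 emissions by 12% last year."),
    ("social", "New supplier audits target labor conditions across the supply chain."),
]
# Linearize each example as "label [SEP] text" so the label conditions generation.
texts = [f"{label} [SEP] {text}{tokenizer.eos_token}" for label, text in seed]
train_ds = Dataset.from_dict({"text": texts}).map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True,
    remove_columns=["text"],
)

Trainer(
    model=model,
    args=TrainingArguments(output_dir="gpt2-augment", num_train_epochs=5),
    train_dataset=train_ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()

# Sample synthetic instances for one label by prompting with its prefix.
prompt = tokenizer("environmental [SEP]", return_tensors="pt").to(model.device)
outputs = model.generate(
    **prompt,
    max_new_tokens=40,
    do_sample=True,
    top_p=0.95,
    num_return_sequences=3,
    pad_token_id=tokenizer.eos_token_id,
)
for seq in outputs:
    print(tokenizer.decode(seq, skip_special_tokens=True))
```

The generated sentences, filtered for quality, can then be added to the original labeled set before training a classifier such as the ESGBERT model sketched earlier.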