Removing Non-Stationary Knowledge From Pre-Trained Language Models for
Entity-Level Sentiment Classification in Finance
- URL: http://arxiv.org/abs/2301.03136v1
- Date: Mon, 9 Jan 2023 01:26:55 GMT
- Authors: Guijin Son, Hanwool Lee, Nahyeon Kang, Moonjeong Hahm
- Abstract summary: We build KorFinASC, a Korean aspect-level sentiment classification dataset for finance consisting of 12,613 human-annotated samples.
We use the term "non-stationary knowledge" to refer to information that was previously correct but is likely to change, and present "TGT-Masking", a novel masking pattern.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Extraction of sentiment signals from news text, stock message boards, and
business reports, for stock movement prediction, has been a rising field of
interest in finance. Building upon past literature, the most recent works
attempt to better capture sentiment from sentences with complex syntactic
structures by introducing aspect-level sentiment classification (ASC). Despite
the growing interest, however, fine-grained sentiment analysis has not been
fully explored in non-English literature due to the shortage of annotated
finance-specific data. Accordingly, research in non-English languages must
leverage datasets and pre-trained language models (PLMs) from other domains,
languages, and tasks to maximize performance. To facilitate
finance-specific ASC research in the Korean language, we build KorFinASC, a
Korean aspect-level sentiment classification dataset for finance consisting of
12,613 human-annotated samples, and explore methods of intermediate transfer
learning. Our experiments indicate that past research has overlooked the
potentially incorrect knowledge of financial entities encoded during the
training phase, which has led to overestimates of the predictive power of PLMs.
In our work, we use the term "non-stationary knowledge" to refer to information
that was previously correct but is likely to change, and present "TGT-Masking",
a novel masking pattern that restricts PLMs from speculating on knowledge of this kind.
Finally, through a series of transfer-learning steps with TGT-Masking applied,
we improve classification accuracy by 22.63% over standalone models on
KorFinASC.
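The abstract does not give implementation details of TGT-Masking, but the core idea, replacing the target entity's tokens with a mask token so the classifier cannot fall back on memorized (possibly outdated) knowledge about that entity, can be sketched as below. All names here are hypothetical illustrations, not the authors' code:

```python
def tgt_mask(tokens, target_entity, mask_token="[MASK]"):
    """Replace every occurrence of the target entity's tokens with mask_token.

    A minimal sketch of a TGT-style masking step: the sentiment model sees
    the sentence context but not the entity's surface form, so it cannot
    lean on non-stationary knowledge encoded during pre-training.
    """
    entity_tokens = target_entity.split()
    n = len(entity_tokens)
    out = list(tokens)
    i = 0
    while i <= len(out) - n:
        if out[i:i + n] == entity_tokens:
            out[i:i + n] = [mask_token] * n
            i += n
        else:
            i += 1
    return out

tokens = "Samsung shares rallied after Samsung posted record profits".split()
masked = tgt_mask(tokens, "Samsung")
# masked → ['[MASK]', 'shares', 'rallied', 'after', '[MASK]', 'posted',
#           'record', 'profits']
```

In practice this step would operate on subword tokens from the PLM's own tokenizer rather than whitespace-split words, but the masking logic is the same.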
Related papers
- Enhancing Text Classification through LLM-Driven Active Learning and Human Annotation [2.0411082897313984]
This study introduces a novel methodology that integrates human annotators and Large Language Models.
The proposed framework integrates human annotation with the output of LLMs, depending on the model uncertainty levels.
The empirical results show a substantial decrease in the costs associated with data annotation while either maintaining or improving model accuracy.
arXiv Detail & Related papers (2024-06-17T21:45:48Z)
- Advancing Anomaly Detection: Non-Semantic Financial Data Encoding with LLMs [49.57641083688934]
We introduce a novel approach to anomaly detection in financial data using Large Language Models (LLMs) embeddings.
Our experiments demonstrate that LLMs contribute valuable information to anomaly detection as our models outperform the baselines.
arXiv Detail & Related papers (2024-06-05T20:19:09Z)
- Prefix Text as a Yarn: Eliciting Non-English Alignment in Foundation Language Model [50.339632513018934]
Supervised fine-tuning (SFT) has been a straightforward approach for tailoring the output of foundation large language models (LLMs) to specific preferences.
We critically examine this hypothesis within the scope of cross-lingual generation tasks.
We introduce a novel training-free alignment method named PreTTY, which employs minimal task-related prior tokens.
arXiv Detail & Related papers (2024-04-25T17:19:36Z)
- Ukrainian Texts Classification: Exploration of Cross-lingual Knowledge Transfer Approaches [11.508759658889382]
There is a tremendous lack of Ukrainian corpora for typical text classification tasks.
We explore cross-lingual knowledge transfer methods avoiding manual data curation.
We test the approaches on three text classification tasks.
arXiv Detail & Related papers (2024-04-02T15:37:09Z)
- ESG Classification by Implicit Rule Learning via GPT-4 [1.9702372005978506]
This paper investigates whether state-of-the-art language models like GPT-4 can be guided to align with unknown ESG evaluation criteria.
We demonstrate the efficacy of these approaches by ranking 2nd in the Shared-Task ML-ESG-3 Impact Type track for Korean without updating the model on the provided training data.
arXiv Detail & Related papers (2024-03-22T08:45:30Z)
- Natural Language Processing for Dialects of a Language: A Survey [56.93337350526933]
State-of-the-art natural language processing (NLP) models are trained on massive training corpora, and report a superlative performance on evaluation datasets.
This survey delves into an important attribute of these datasets: the dialect of a language.
Motivated by the performance degradation of NLP models for dialectic datasets and its implications for the equity of language technologies, we survey past research in NLP for dialects in terms of datasets, and approaches.
arXiv Detail & Related papers (2024-01-11T03:04:38Z)
- Pre-trained Large Language Models for Financial Sentiment Analysis [10.683185786541596]
We adapt the open-source Llama2-7B model (2023) with the supervised fine-tuning (SFT) technique.
Our approach significantly outperforms the previous state-of-the-art algorithms.
arXiv Detail & Related papers (2024-01-10T15:27:41Z)
- LLaMA Beyond English: An Empirical Study on Language Capability Transfer [49.298360366468934]
We focus on how to effectively transfer the capabilities of language generation and following instructions to a non-English language.
We analyze the impact of key factors such as vocabulary extension, further pretraining, and instruction tuning on transfer.
We employ four widely used standardized testing benchmarks: C-Eval, MMLU, AGI-Eval, and GAOKAO-Bench.
arXiv Detail & Related papers (2024-01-02T06:29:02Z)
- Cross-Lingual NER for Financial Transaction Data in Low-Resource Languages [70.25418443146435]
We propose an efficient modeling framework for cross-lingual named entity recognition in semi-structured text data.
We employ two independent datasets of SMSs in English and Arabic, each carrying semi-structured banking transaction information.
With access to only 30 labeled samples, our model can generalize the recognition of merchants, amounts, and other fields from English to Arabic.
arXiv Detail & Related papers (2023-07-16T00:45:42Z)
- Understanding Translationese in Cross-Lingual Summarization [106.69566000567598]
Cross-lingual summarization (CLS) aims to generate a concise summary in a different target language.
To collect large-scale CLS data, existing datasets typically involve translation in their creation.
In this paper, we first confirm that different approaches of constructing CLS datasets will lead to different degrees of translationese.
arXiv Detail & Related papers (2022-12-14T13:41:49Z)
- Detecting ESG topics using domain-specific language models and data augmentation approaches [3.3332986505989446]
Natural language processing tasks in the financial domain remain challenging due to paucity of appropriately labelled data.
Here, we investigate two approaches that may help to mitigate these issues.
Firstly, we experiment with further language model pre-training using large amounts of in-domain data from business and financial news.
We then apply augmentation approaches to increase the size of our dataset for model fine-tuning.
arXiv Detail & Related papers (2020-10-16T11:20:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.