Model-Agnostic Meta-Learning for Natural Language Understanding Tasks in
Finance
- URL: http://arxiv.org/abs/2303.02841v2
- Date: Sun, 26 Mar 2023 19:45:25 GMT
- Title: Model-Agnostic Meta-Learning for Natural Language Understanding Tasks in
Finance
- Authors: Bixing Yan, Shaoling Chen, Yuxuan He, Zhihan Li
- Abstract summary: We investigate the model-agnostic meta-learning (MAML) algorithm in low-resource financial NLU tasks.
Our models achieve state-of-the-art performance according to the experimental results.
- Score: 1.863067234952186
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Natural language understanding (NLU) is challenging for finance due to the
lack of annotated data and the specialized language of the domain. As a
result, researchers have proposed using pre-trained language models and
multi-task learning to learn robust representations. However, aggressive
fine-tuning often causes over-fitting, and multi-task learning may favor tasks
with significantly larger amounts of data. To address these problems, in this
paper we investigate the model-agnostic meta-learning (MAML) algorithm on
low-resource financial NLU tasks. Our contributions include: 1. we explore the
performance of the MAML method with multiple types of tasks: GLUE datasets,
SNLI, SciTail and Financial PhraseBank; 2. we study the performance of the
MAML method with multiple single-type tasks: a real-scenario stock price
prediction problem with Twitter text data. Our models achieve state-of-the-art
performance in these experiments, demonstrating that our method can adapt
quickly and well to low-resource situations.
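At its core, MAML learns a weight initialization from which a few gradient
steps on a small support set produce a good task-specific model. Below is a
minimal sketch of the inner/outer loop, assuming a toy linear classifier over
pre-computed sentence embeddings; the names, shapes, and hyperparameters are
illustrative, not the authors' implementation.

```python
# Minimal second-order MAML sketch for low-resource classification.
# Assumptions: tasks arrive as (support, query) batches of pre-computed
# sentence embeddings; a single linear head stands in for a language model.
import torch
import torch.nn as nn
import torch.nn.functional as F

EMB_DIM, N_CLASSES = 64, 3
INNER_LR, OUTER_LR, INNER_STEPS, TASKS_PER_BATCH = 1e-2, 1e-3, 1, 4

model = nn.Linear(EMB_DIM, N_CLASSES)
meta_opt = torch.optim.Adam(model.parameters(), lr=OUTER_LR)

def sample_task():
    """Hypothetical task sampler: random data stands in for a real task."""
    support = (torch.randn(8, EMB_DIM), torch.randint(0, N_CLASSES, (8,)))
    query = (torch.randn(8, EMB_DIM), torch.randint(0, N_CLASSES, (8,)))
    return support, query

for step in range(100):
    meta_opt.zero_grad()
    for _ in range(TASKS_PER_BATCH):
        (xs, ys), (xq, yq) = sample_task()
        # Inner loop: adapt a differentiable copy of the weights on support.
        fast = {k: v.clone() for k, v in model.named_parameters()}
        for _ in range(INNER_STEPS):
            loss = F.cross_entropy(F.linear(xs, fast["weight"], fast["bias"]), ys)
            grads = torch.autograd.grad(loss, list(fast.values()),
                                        create_graph=True)
            fast = {k: v - INNER_LR * g
                    for (k, v), g in zip(fast.items(), grads)}
        # Outer loop: query loss of the adapted weights updates the init.
        q_loss = F.cross_entropy(F.linear(xq, fast["weight"], fast["bias"]), yq)
        (q_loss / TASKS_PER_BATCH).backward()
    meta_opt.step()
```

The `create_graph=True` argument is what makes this second-order MAML: the
outer-loop gradient flows back through the inner-loop updates to the shared
initialization.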
Related papers
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning [70.21358720599821]
Large language models (LLMs) hold the promise of solving diverse tasks when provided with appropriate natural language prompts.
We propose SELF-GUIDE, a multi-stage mechanism in which we synthesize task-specific input-output pairs from the student LLM.
We report an absolute improvement of approximately 15% for classification tasks and 18% for generation tasks in the benchmark's metrics.
arXiv Detail & Related papers (2024-07-16T04:41:58Z)
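As a schematic illustration of the self-synthetic idea, the sketch below
generates candidate input-output pairs with the same ("student") model,
filters weak ones, and finetunes on the survivors. Here `generate` and
`finetune` are hypothetical stubs, and the paper's pipeline uses multi-stage
quality filtering rather than the toy length check shown here.

```python
# Schematic sketch of self-synthetic finetuning in the spirit of SELF-GUIDE.
# `generate` and `finetune` are stand-in stubs, not the paper's API.

def generate(model, prompt: str) -> str:
    """Stub for student-LLM text generation."""
    return f"<completion of: {prompt!r}>"

def finetune(model, pairs):
    """Stub for task-specific finetuning on synthesized pairs."""
    print(f"finetuning on {len(pairs)} synthetic pairs")
    return model

def self_guide(model, task_instruction: str, n: int = 100, min_len: int = 5):
    pairs = []
    for _ in range(n):
        x = generate(model, f"Write one input for the task: {task_instruction}")
        y = generate(model, f"{task_instruction}\nInput: {x}\nOutput:")
        if len(y) >= min_len:            # toy quality filter; the paper uses
            pairs.append((x, y))         # multi-stage filtering instead
    return finetune(model, pairs)

self_guide(model=None, task_instruction="Classify the sentiment of a tweet.")
```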
- MetaGPT: Merging Large Language Models Using Model Exclusive Task Arithmetic [6.46176287368784]
We propose Model Exclusive Task Arithmetic for merging GPT-scale models.
Our proposed MetaGPT is data-agnostic and bypasses the heavy search process, making it cost-effective and easy to implement for LLMs.
arXiv Detail & Related papers (2024-06-17T10:12:45Z)
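The underlying idea of task arithmetic can be sketched in a few lines: each
fine-tuned checkpoint contributes a task vector (its weights minus the base
model's), and merging adds a scaled sum of task vectors back to the base. The
toy example below uses fixed scales; MetaGPT's contribution is deriving the
scaling in closed form, data-free, rather than searching for it.

```python
# Toy sketch of task arithmetic for model merging. State dicts are plain
# tensor dicts; the fixed `scales` below are illustrative only.
import torch

def task_vector(base, tuned):
    return {k: tuned[k] - base[k] for k in base}

def merge(base, tuned_models, scales):
    merged = {k: v.clone() for k, v in base.items()}
    for tuned, lam in zip(tuned_models, scales):
        for k, delta in task_vector(base, tuned).items():
            merged[k] += lam * delta
    return merged

base = {"w": torch.zeros(4)}
tuned_a = {"w": torch.tensor([1., 0., 0., 0.])}   # "task A" checkpoint
tuned_b = {"w": torch.tensor([0., 2., 0., 0.])}   # "task B" checkpoint
print(merge(base, [tuned_a, tuned_b], scales=[0.5, 0.5]))
```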
- Advancing Anomaly Detection: Non-Semantic Financial Data Encoding with LLMs [49.57641083688934]
We introduce a novel approach to anomaly detection in financial data using Large Language Model (LLM) embeddings.
Our experiments demonstrate that LLMs contribute valuable information to anomaly detection as our models outperform the baselines.
arXiv Detail & Related papers (2024-06-05T20:19:09Z)
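The general recipe, hedged as a sketch rather than the paper's exact pipeline,
is to encode financial records as embedding vectors and fit a standard outlier
detector on them; `llm_embed` below is a random stub standing in for a real
LLM encoder.

```python
# Sketch: LLM embeddings + a classical outlier detector for anomaly detection.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

def llm_embed(records):
    """Stub: replace with a real embedding model (e.g., an LLM encoder)."""
    return rng.normal(size=(len(records), 32))

records = [f"txn {i}: amount=..., counterparty=..." for i in range(500)]
X = llm_embed(records)
detector = IsolationForest(contamination=0.01, random_state=0).fit(X)
scores = detector.decision_function(X)      # lower = more anomalous
print("most anomalous record:", records[int(np.argmin(scores))])
```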
- Scalable Language Model with Generalized Continual Learning [58.700439919096155]
Joint Adaptive Re-Parameterization (JARe) is integrated with Dynamic Task-related Knowledge Retrieval (DTKR) to enable adaptive adjustment of language models based on specific downstream tasks.
Our method demonstrates state-of-the-art performance on diverse backbones and benchmarks, achieving effective continual learning in both full-set and few-shot scenarios with minimal forgetting.
arXiv Detail & Related papers (2024-04-11T04:22:15Z)
- Token-Efficient Leverage Learning in Large Language Models [13.830828529873056]
Large Language Models (LLMs) have excelled at various tasks, but they perform best in high-resource scenarios.
Data scarcity and the inherent difficulty of adapting LLMs to specific tasks compound the challenge.
We present a streamlined implementation of this methodology called Token-Efficient Leverage Learning (TELL).
arXiv Detail & Related papers (2024-04-01T04:39:44Z)
- Large Language Model Adaptation for Financial Sentiment Analysis [2.0499240875882]
Generalist language models tend to fall short in tasks specifically tailored for finance.
Two foundation models with less than 1.5B parameters have been adapted using a wide range of strategies.
We show that small LLMs have comparable performance to larger scale models, while being more efficient in terms of parameters and data.
arXiv Detail & Related papers (2024-01-26T11:04:01Z)
- Masked Language Modeling and the Distributional Hypothesis: Order Word Matters Pre-training for Little [74.49773960145681]
A possible explanation for the impressive performance of masked language model (MLM) pre-training is that such models have learned to represent the syntactic structures prevalent in NLP pipelines.
In this paper, we propose a different explanation: MLM pre-training succeeds on downstream tasks almost entirely due to its ability to model higher-order word co-occurrence statistics.
Our results show that purely distributional information largely explains the success of pre-training, and underscore the importance of curating challenging evaluation datasets that require deeper linguistic knowledge.
arXiv Detail & Related papers (2021-04-14T06:30:36Z)
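The paper's central probe can be reproduced in miniature: shuffle the words
inside each training sentence, which destroys word order but preserves
co-occurrence statistics, then pre-train an MLM on the shuffled corpus and
compare downstream performance against an unshuffled run. A toy version of
the shuffling step, assuming simple whitespace tokenization:

```python
# Shuffle words within sentences: order is destroyed, co-occurrence kept.
import random

random.seed(0)

def shuffle_sentence(sentence: str) -> str:
    tokens = sentence.split()
    random.shuffle(tokens)                 # order gone, co-occurrence intact
    return " ".join(tokens)

corpus = ["the market fell sharply on friday",
          "analysts expect rates to rise next quarter"]
shuffled_corpus = [shuffle_sentence(s) for s in corpus]
print(shuffled_corpus)   # feed this to MLM pre-training in place of `corpus`
```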
- Adversarial Meta Sampling for Multilingual Low-Resource Speech Recognition [159.9312272042253]
We develop a novel adversarial meta sampling (AMS) approach to improve multilingual meta-learning ASR (MML-ASR).
AMS adaptively determines the task sampling probability for each source language.
Experimental results on two multilingual datasets show significant performance improvements when applying our AMS to MML-ASR.
arXiv Detail & Related papers (2020-12-22T09:33:14Z)
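As a simplified stand-in for AMS, the sketch below samples source languages
with probability proportional to their recent loss, so harder languages are
drawn more often. The actual method learns this sampling policy adversarially
with a separate network rather than using this softmax-over-losses heuristic.

```python
# Simplified adaptive task sampling over source languages.
import math, random

random.seed(0)
langs = ["sw", "ta", "ky", "vi"]
recent_loss = {l: 1.0 for l in langs}          # running loss per language

def sampling_probs(losses, temp=1.0):
    exps = {l: math.exp(v / temp) for l, v in losses.items()}
    z = sum(exps.values())
    return {l: e / z for l, e in exps.items()}

for step in range(5):
    probs = sampling_probs(recent_loss)
    lang = random.choices(langs, weights=[probs[l] for l in langs])[0]
    loss = random.uniform(0.5, 2.0)            # stand-in for a meta-train step
    recent_loss[lang] = 0.9 * recent_loss[lang] + 0.1 * loss   # EMA update
    print(f"step {step}: sampled {lang}")
```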
- Detecting ESG topics using domain-specific language models and data augmentation approaches [3.3332986505989446]
Natural language processing tasks in the financial domain remain challenging due to the paucity of appropriately labelled data.
Here, we investigate two approaches that may help to mitigate these issues.
Firstly, we experiment with further language model pre-training using large amounts of in-domain data from business and financial news.
We then apply augmentation approaches to increase the size of our dataset for model fine-tuning.
arXiv Detail & Related papers (2020-10-16T11:20:07Z)
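A minimal sketch of the first approach, further in-domain MLM pre-training,
using Hugging Face transformers; the corpus file, base model, and
hyperparameters below are placeholders rather than the paper's setup.

```python
# Continue MLM pre-training on an assumed in-domain corpus, one text per line.
from datasets import load_dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tok = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")

# Placeholder path to local business/financial news text.
ds = load_dataset("text", data_files={"train": "financial_news.txt"})["train"]
ds = ds.map(lambda b: tok(b["text"], truncation=True, max_length=128),
            batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="dapt-finance", num_train_epochs=1,
                           per_device_train_batch_size=16),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tok, mlm_probability=0.15),
)
trainer.train()   # the adapted checkpoint is then fine-tuned on ESG labels
```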
- When does MAML Work the Best? An Empirical Study on Model-Agnostic Meta-Learning in NLP Applications [26.458825286934857]
Many impacting factors, including data quantity, similarity among tasks, and the balance between the general language model and task-specific adaptation, can affect the performance of MAML in NLP.
In this paper, we conduct an empirical study to investigate these impacting factors and conclude when MAML works the best based on the experimental results.
arXiv Detail & Related papers (2020-05-24T09:29:36Z)
- Multi-Task Learning for Dense Prediction Tasks: A Survey [87.66280582034838]
Multi-task learning (MTL) techniques have shown promising results with respect to performance, computation, and memory footprint.
We provide a well-rounded view on state-of-the-art deep learning approaches for MTL in computer vision.
arXiv Detail & Related papers (2020-04-28T09:15:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.