Discovering material information using hierarchical Reformer model on
financial regulatory filings
- URL: http://arxiv.org/abs/2204.05979v1
- Date: Mon, 28 Mar 2022 19:47:34 GMT
- Authors: Francois Mercier, Makesh Narsimhan
- Abstract summary: We build a hierarchical Reformer ([15]) model capable of processing a large document-level dataset, SEDAR, of financial regulatory filings.
Using this model, we show that it is possible to predict trade volume changes using regulatory filings.
Finetuning the model to successfully predict trade volume changes indicates that the model captures a view from financial markets and processing regulatory filings is beneficial.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Most applications of machine learning for finance are related to forecasting
tasks for investment decisions. Instead, we aim to promote a better
understanding of financial markets with machine learning techniques. Leveraging
the tremendous progress in deep learning models for natural language
processing, we construct a hierarchical Reformer ([15]) model capable of
processing a large document-level dataset, SEDAR, from Canadian financial
regulatory filings. Using this model, we show that it is possible to predict
trade volume changes using regulatory filings. We adapt the pretraining task of
HiBERT ([36]) to obtain good sentence level representations using a large
unlabelled document dataset. Finetuning the model to successfully predict trade
volume changes indicates that the model captures a view from financial markets
and processing regulatory filings is beneficial. Analyzing the attention
patterns of our model reveals that it is able to detect some indications of
material information without explicit training, which is highly relevant for
investors and also for the market surveillance mandate of financial regulators.
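The hierarchical setup described in the abstract (sentence-level encoding, document-level pooling, then a finetuning head for trade volume changes) can be sketched roughly as follows. This is an illustrative stand-in only: mean pooling and a single attention vector replace the paper's actual Reformer layers and HiBERT-style pretraining, and all function names and dimensions are assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_sentence(token_embeddings):
    # Stand-in for the sentence-level encoder: mean-pool the
    # token embeddings of one sentence into a single vector.
    return token_embeddings.mean(axis=0)

def encode_document(sentence_vectors, w_attn):
    # Stand-in for the document-level encoder: a single attention
    # vector scores each sentence, and the softmax weights pool
    # sentence vectors into one document vector.
    scores = sentence_vectors @ w_attn            # (num_sentences,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                      # softmax attention weights
    doc_vector = weights @ sentence_vectors       # weighted sum over sentences
    return doc_vector, weights

def predict_volume_change(doc_vector, w_out, b_out):
    # Finetuning head: a sigmoid logit for "trade volume up vs. down".
    logit = doc_vector @ w_out + b_out
    return 1.0 / (1.0 + np.exp(-logit))

dim = 16
# A toy "filing" of three sentences with 7, 12, and 5 tokens.
document = [rng.normal(size=(n_tokens, dim)) for n_tokens in (7, 12, 5)]
sent_vecs = np.stack([encode_sentence(s) for s in document])
doc_vec, attn = encode_document(sent_vecs, rng.normal(size=dim))
prob = predict_volume_change(doc_vec, rng.normal(size=dim), 0.0)
```

The attention weights `attn` are the quantity the paper inspects when looking for indications of material information: sentences receiving high weight are the ones the model relies on for its prediction.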
Related papers
- Advancing Anomaly Detection: Non-Semantic Financial Data Encoding with LLMs [49.57641083688934]
We introduce a novel approach to anomaly detection in financial data using Large Language Models (LLMs) embeddings.
Our experiments demonstrate that LLMs contribute valuable information to anomaly detection as our models outperform the baselines.
arXiv Detail & Related papers (2024-06-05T20:19:09Z) - AlphaFin: Benchmarking Financial Analysis with Retrieval-Augmented Stock-Chain Framework [48.3060010653088]
We release AlphaFin datasets, combining traditional research datasets, real-time financial data, and handwritten chain-of-thought (CoT) data.
We then use AlphaFin datasets to benchmark a state-of-the-art method, called Stock-Chain, for effectively tackling the financial analysis task.
arXiv Detail & Related papers (2024-03-19T09:45:33Z) - Numerical Claim Detection in Finance: A New Financial Dataset,
Weak-Supervision Model, and Market Analysis [4.9524454709622585]
We construct a new financial dataset for the claim detection task in the financial domain.
We propose a novel weak-supervision model that incorporates the knowledge of subject matter experts (SMEs) in the aggregation function.
We demonstrate the practical utility of our proposed model by constructing a novel measure, "optimism".
arXiv Detail & Related papers (2024-02-18T22:55:26Z) - Large Language Model Adaptation for Financial Sentiment Analysis [2.0499240875882]
Generalist language models tend to fall short in tasks specifically tailored for finance.
Two foundation models with less than 1.5B parameters have been adapted using a wide range of strategies.
We show that small LLMs have comparable performance to larger scale models, while being more efficient in terms of parameters and data.
arXiv Detail & Related papers (2024-01-26T11:04:01Z) - Revolutionizing Finance with LLMs: An Overview of Applications and
Insights [47.11391223936608]
Large Language Models (LLMs) like ChatGPT have seen considerable advancements and have been applied in diverse fields.
These models are being utilized for automating financial report generation, forecasting market trends, analyzing investor sentiment, and offering personalized financial advice.
arXiv Detail & Related papers (2024-01-22T01:06:17Z) - Towards a Foundation Purchasing Model: Pretrained Generative
Autoregression on Transaction Sequences [0.0]
We present a generative pretraining method that can be used to obtain contextualised embeddings of financial transactions.
We additionally perform large-scale pretraining of an embedding model using a corpus of data from 180 issuing banks containing 5.1 billion transactions.
arXiv Detail & Related papers (2024-01-03T09:32:48Z) - Multimodal Gen-AI for Fundamental Investment Research [2.559302299676632]
This report outlines a transformative initiative in the financial investment industry, where the conventional decision-making process is being reimagined.
We seek to evaluate the effectiveness of fine-tuning methods on a base model (Llama2) to achieve specific application-level goals.
The project encompasses a diverse corpus dataset, including research reports, investment memos, market news, and extensive time-series market data.
arXiv Detail & Related papers (2023-12-24T03:35:13Z) - Incorporating Pre-trained Model Prompting in Multimodal Stock Volume
Movement Prediction [22.949484374773967]
We propose the Prompt-based MUltimodal Stock volumE prediction model (ProMUSE) to process text and time series modalities.
We use pre-trained language models for better comprehension of financial news.
We also propose a novel cross-modality contrastive alignment, while reserving unimodal heads beside the fusion head, to keep fusion from degrading the unimodal representations.
arXiv Detail & Related papers (2023-09-11T16:47:01Z) - PIXIU: A Large Language Model, Instruction Data and Evaluation Benchmark
for Finance [63.51545277822702]
PIXIU is a comprehensive framework including the first financial large language model (LLM) based on fine-tuning LLaMA with instruction data.
We propose FinMA by fine-tuning LLaMA with the constructed dataset to be able to follow instructions for various financial tasks.
We conduct a detailed analysis of FinMA and several existing LLMs, uncovering their strengths and weaknesses in handling critical financial tasks.
arXiv Detail & Related papers (2023-06-08T14:20:29Z) - Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP)
What further sets these models apart is the massive amounts of world knowledge they internalize during pretraining.
How the model's world knowledge interacts with the factual information presented in the context remains underexplored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z) - Explainable Matrix -- Visualization for Global and Local
Interpretability of Random Forest Classification Ensembles [78.6363825307044]
We propose Explainable Matrix (ExMatrix), a novel visualization method for Random Forest (RF) interpretability.
It employs a simple yet powerful matrix-like visual metaphor, where rows are rules, columns are features, and cells are rule predicates.
ExMatrix's applicability is confirmed via different examples, showing how it can be used in practice to promote RF model interpretability.
arXiv Detail & Related papers (2020-05-08T21:03:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.