Self-Attention Mechanism in Multimodal Context for Banking Transaction Flow
- URL: http://arxiv.org/abs/2410.08243v1
- Date: Thu, 10 Oct 2024 08:13:39 GMT
- Title: Self-Attention Mechanism in Multimodal Context for Banking Transaction Flow
- Authors: Cyrile Delestre, Yoann Sola
- Abstract summary: Banking Transaction Flow (BTF) is sequential data composed of three modalities: a date, a numerical value and a wording.
We trained two general models on a large amount of BTFs in a self-supervised way.
The performance of these two models was evaluated on two banking downstream tasks: a transaction categorization task and a credit risk task.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Banking Transaction Flow (BTF) is sequential data found in a number of banking activities such as marketing, credit risk or banking fraud. It is multimodal data composed of three modalities: a date, a numerical value and a wording. In this work we propose an application of the self-attention mechanism to the processing of BTFs. We trained two general models on a large amount of BTFs in a self-supervised way: one RNN-based model and one Transformer-based model. We proposed a specific tokenization in order to be able to process BTFs. The performance of these two models was evaluated on two banking downstream tasks: a transaction categorization task and a credit risk task. The results show that fine-tuning these two pre-trained models yields better performance than state-of-the-art approaches on both tasks.
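The abstract does not specify the paper's tokenization scheme, but the idea of flattening a (date, amount, wording) record into one token sequence that a self-attention model can consume could be sketched roughly as follows. All token formats, separators and the per-digit amount encoding here are illustrative assumptions, not the authors' actual method.

```python
from datetime import date

# Hypothetical sketch: turn one BTF record (date, amount, wording) into a
# single token sequence. The real paper's tokenization is not described in
# the abstract; every token format below is an assumption.
def tokenize_transaction(d: date, amount: float, wording: str) -> list[str]:
    # Date modality: coarse calendar tokens instead of a raw timestamp.
    date_tokens = [f"<month_{d.month}>", f"<dow_{d.weekday()}>"]
    # Numerical modality: a sign token plus per-character tokens of the amount.
    sign = "<credit>" if amount >= 0 else "<debit>"
    amount_tokens = [sign] + list(f"{abs(amount):.2f}")
    # Text modality: naive whitespace tokens of the transaction wording.
    word_tokens = wording.lower().split()
    # Modality separator tokens let self-attention tell the streams apart.
    return date_tokens + ["<amt>"] + amount_tokens + ["<txt>"] + word_tokens

tokens = tokenize_transaction(date(2024, 10, 10), -42.50, "CARD PAYMENT GROCERY")
```

A sequence like this could then be fed to either the RNN-based or the Transformer-based encoder mentioned above.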
Related papers
- Towards a Foundation Purchasing Model: Pretrained Generative Autoregression on Transaction Sequences
We present a generative pretraining method that can be used to obtain contextualised embeddings of financial transactions.
We additionally perform large-scale pretraining of an embedding model using a corpus of data from 180 issuing banks containing 5.1 billion transactions.
arXiv Detail & Related papers (2024-01-03T09:32:48Z)
- Cross-BERT for Point Cloud Pretraining
We propose a new cross-modal BERT-style self-supervised learning paradigm, called Cross-BERT.
To facilitate pretraining for irregular and sparse point clouds, we design two self-supervised tasks to boost cross-modal interaction.
Our work highlights the effectiveness of leveraging cross-modal 2D knowledge to strengthen 3D point cloud representation and the transferable capability of BERT across modalities.
arXiv Detail & Related papers (2023-12-08T08:18:12Z)
- Generative AI for End-to-End Limit Order Book Modelling: A Token-Level Autoregressive Generative Model of Message Flow Using a Deep State Space Network
We propose an end-to-end autoregressive generative model that generates tokenized limit order book (LOB) messages.
Using NASDAQ equity LOBs, we develop a custom tokenizer for message data, converting groups of successive digits to tokens.
Results show promising performance in approximating the data distribution, as evidenced by low model perplexity.
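The digit-grouping tokenization described above could be sketched as follows; the group size and message format are illustrative assumptions, not the paper's actual tokenizer.

```python
import re

# Minimal sketch of the digit-grouping idea: runs of digits in a raw LOB
# message are split into fixed-width groups, each becoming one token.
# The field layout and group size are hypothetical.
def tokenize_digits(text: str, group: int = 2) -> list[str]:
    tokens = []
    # Alternate between digit runs and everything else.
    for part in re.findall(r"\d+|\D+", text):
        if part.isdigit():
            # Split the digit run into consecutive fixed-width groups.
            tokens += [part[i:i + group] for i in range(0, len(part), group)]
        else:
            tokens.append(part.strip())
    return [t for t in tokens if t]

toks = tokenize_digits("ADD price=10125 size=300")
# "10125" becomes the tokens "10", "12", "5"; "300" becomes "30", "0".
```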
arXiv Detail & Related papers (2023-08-23T09:37:22Z)
- Multimodal Document Analytics for Banking Process Automation
The paper contributes original empirical evidence on the effectiveness and efficiency of multimodal models for document processing in the banking business.
It offers practical guidance on how to unlock this potential in day-to-day operations.
arXiv Detail & Related papers (2023-07-21T18:29:04Z)
- FM-ViT: Flexible Modal Vision Transformers for Face Anti-Spoofing
We present a pure transformer-based framework, dubbed the Flexible Modal Vision Transformer (FM-ViT) for face anti-spoofing.
FM-ViT can flexibly target any single-modal (i.e., RGB) attack scenarios with the help of available multi-modal data.
Experiments demonstrate that the single model trained based on FM-ViT can not only flexibly evaluate different modal samples, but also outperforms existing single-modal frameworks by a large margin.
arXiv Detail & Related papers (2023-05-05T04:28:48Z)
- An Empirical Study of Multimodal Model Merging
Model merging is a technique that fuses multiple models trained on different tasks to generate a multi-task solution.
We conduct our study for a novel goal where we can merge vision, language, and cross-modal transformers of a modality-specific architecture.
We propose two metrics that assess the distance between weights to be merged and can serve as an indicator of the merging outcomes.
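The abstract does not say which distance metrics the paper proposes; one plausible instance of the idea is cosine distance between flattened weight vectors, sketched here purely for illustration.

```python
import math

# Illustrative sketch of a weight-distance metric for model merging:
# cosine distance between the flattened parameter vectors of two models.
# The paper's actual metrics may differ; this only conveys the idea.
def weight_cosine_distance(w_a: list[float], w_b: list[float]) -> float:
    dot = sum(a * b for a, b in zip(w_a, w_b))
    norm = math.sqrt(sum(a * a for a in w_a)) * math.sqrt(sum(b * b for b in w_b))
    # 0.0 means identical direction; larger values mean weights point apart.
    return 1.0 - dot / norm

d_same = weight_cosine_distance([1.0, 2.0], [2.0, 4.0])  # parallel weights
d_far = weight_cosine_distance([1.0, 0.0], [0.0, 1.0])   # orthogonal weights
```

Intuitively, a small distance would predict that averaging the two weight sets loses little, while a large distance would flag a risky merge.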
arXiv Detail & Related papers (2023-04-28T15:43:21Z)
- Adapted Multimodal BERT with Layer-wise Fusion for Sentiment Analysis
We propose Adapted Multimodal BERT, a BERT-based architecture for multimodal tasks.
The adapter adjusts the pretrained language model for the task at hand, while the fusion layers perform task-specific, layer-wise fusion of audio-visual information with textual BERT representations.
In our ablations we see that this approach leads to efficient models that can outperform their fine-tuned counterparts and are robust to input noise.
arXiv Detail & Related papers (2022-12-01T17:31:42Z)
- Flexible categorization for auditing using formal concept analysis and Dempster-Shafer theory
We study different ways to categorize according to different extents of interest in different financial accounts.
The framework developed in this paper provides a formal ground to obtain and study explainable categorizations.
arXiv Detail & Related papers (2022-10-31T13:49:16Z)
- Towards a Better Microcredit Decision
We first define three stages with sequential dependence throughout the loan process: credit granting (AR), withdrawal application (WS) and repayment commitment (GB).
The proposed multi-stage interaction sequence (MSIS) method is simple yet effective, and experimental results on a real data set from a top loan platform in China show its ability to remedy population bias and improve model generalization.
arXiv Detail & Related papers (2022-08-23T12:24:19Z)
- Bi-Bimodal Modality Fusion for Correlation-Controlled Multimodal Sentiment Analysis
Bi-Bimodal Fusion Network (BBFN) is a novel end-to-end network that performs fusion on pairwise modality representations.
The model takes two bimodal pairs as input due to the known information imbalance among modalities.
arXiv Detail & Related papers (2021-07-28T23:33:42Z)
- XtracTree: a Simple and Effective Method for Regulator Validation of Bagging Methods Used in Retail Banking
We propose XtracTree, an algorithm capable of efficiently converting an ML bagging classifier, such as a random forest, into simple "if-then" rules.
Our experiments demonstrate that using XtracTree, one can convert an ML model into a rule-based algorithm.
The proposed approach allowed our banking institution to reduce the delivery time of our AI solutions to end-users by up to 50%.
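The "if-then" flattening idea behind XtracTree can be illustrated on a hand-built decision tree; the actual algorithm operates on a trained bagging classifier such as a random forest, and the feature names and thresholds below are invented for the example.

```python
# Hypothetical sketch of flattening a decision tree into "if-then" rules,
# in the spirit of XtracTree. Each root-to-leaf path becomes one rule.
def tree_to_rules(node, path=()):
    if "label" in node:
        # Leaf: join every condition collected along the path into one rule.
        cond = " and ".join(path) or "True"
        return [f"if {cond} then {node['label']}"]
    test = f"{node['feature']} <= {node['threshold']}"
    neg = f"{node['feature']} > {node['threshold']}"
    return (tree_to_rules(node["left"], path + (test,))
            + tree_to_rules(node["right"], path + (neg,)))

# Toy credit-decision tree; features and thresholds are illustrative only.
tree = {
    "feature": "income", "threshold": 30000,
    "left": {"label": "reject"},
    "right": {
        "feature": "debt_ratio", "threshold": 0.4,
        "left": {"label": "approve"},
        "right": {"label": "reject"},
    },
}
rules = tree_to_rules(tree)
```

A regulator can then audit the resulting rules directly instead of the opaque ensemble.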
arXiv Detail & Related papers (2020-04-05T21:57:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.