Your Spending Needs Attention: Modeling Financial Habits with Transformers
- URL: http://arxiv.org/abs/2507.23267v1
- Date: Thu, 31 Jul 2025 05:56:21 GMT
- Title: Your Spending Needs Attention: Modeling Financial Habits with Transformers
- Authors: D. T. Braithwaite, Misael Cavalcanti, R. Austin McEver, Hiroto Udagawa, Daniel Silva, Rohan Ramanath, Felipe Meneses, Arissa Yoshida, Evan Wingert, Matheus Ramos, Brian Zanfelice, Aman Gupta,
- Abstract summary: This paper investigates using transformer-based representation learning models for transaction data. We propose a new method enabling the use of SSL with transaction data by adapting transformer-based models to handle both textual and structured attributes.
- Score: 2.5960274245156922
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Predictive models play a crucial role in the financial industry, enabling risk prediction, fraud detection, and personalized recommendations, where slight changes in core model performance can result in billions of dollars in revenue or losses. While financial institutions have access to enormous amounts of user data (e.g., bank transactions, in-app events, and customer support logs), leveraging this data effectively remains challenging due to its complexity and scale. Thus, in many financial institutions, most production models follow traditional machine learning (ML) approaches by converting unstructured data into manually engineered tabular features. Conversely, other domains (e.g., natural language processing) have effectively utilized self-supervised learning (SSL) to learn rich representations from raw data, removing the need for manual feature extraction. In this paper, we investigate using transformer-based representation learning models for transaction data, hypothesizing that these models, trained on massive data, can provide a novel and powerful approach to understanding customer behavior. We propose a new method enabling the use of SSL with transaction data by adapting transformer-based models to handle both textual and structured attributes. Our approach, denoted nuFormer, includes an end-to-end fine-tuning method that integrates user embeddings with existing tabular features. Our experiments demonstrate improvements for large-scale recommendation problems at Nubank. Notably, these gains are achieved solely through enhanced representation learning rather than incorporating new data sources.
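To ground the abstract's description, below is a minimal PyTorch sketch of the idea: each transaction's textual and structured attributes are embedded and fused, a transformer encodes the transaction sequence into a user embedding, and a fine-tuning head concatenates that embedding with existing tabular features. The paper publishes no code, so every module name, vocabulary size, and the sum-based fusion scheme below are illustrative assumptions, not nuFormer's actual implementation.

```python
import torch
import torch.nn as nn

class TransactionEncoder(nn.Module):
    """Encodes a sequence of transactions, each carrying textual and
    structured attributes, into a single user embedding (hypothetical sketch)."""

    def __init__(self, text_vocab=30_000, n_categories=400, n_amount_buckets=64,
                 d_model=128, n_heads=4, n_layers=4):
        super().__init__()
        self.text_emb = nn.Embedding(text_vocab, d_model)        # merchant description tokens
        self.cat_emb = nn.Embedding(n_categories, d_model)       # e.g. merchant category code
        self.amount_emb = nn.Embedding(n_amount_buckets, d_model)  # quantile-bucketed amount
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)

    def forward(self, text_ids, cat_ids, amount_ids):
        # Fuse textual and structured attributes per transaction by summing
        # their embeddings; text_ids is mean-pooled over description tokens.
        tx = (self.text_emb(text_ids).mean(dim=2)
              + self.cat_emb(cat_ids)
              + self.amount_emb(amount_ids))      # (batch, seq, d_model)
        h = self.encoder(tx)
        return h.mean(dim=1)                      # user embedding: (batch, d_model)

class FineTuneHead(nn.Module):
    """End-to-end fine-tuning head: concatenates the learned user embedding
    with existing tabular features, as the abstract describes."""

    def __init__(self, encoder, n_tabular, d_model=128):
        super().__init__()
        self.encoder = encoder
        self.head = nn.Sequential(nn.Linear(d_model + n_tabular, 64),
                                  nn.ReLU(), nn.Linear(64, 1))

    def forward(self, text_ids, cat_ids, amount_ids, tabular):
        user_emb = self.encoder(text_ids, cat_ids, amount_ids)
        return self.head(torch.cat([user_emb, tabular], dim=-1))
```

A full pipeline would pretrain `TransactionEncoder` with an SSL objective (e.g., masked-transaction prediction) before fine-tuning `FineTuneHead` end to end on the downstream label, matching the abstract's two-stage description.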
Related papers
- Forgetting: A New Mechanism Towards Better Large Language Model Fine-tuning [53.398270878295754]
Supervised fine-tuning (SFT) plays a critical role for pretrained large language models (LLMs). We suggest categorizing tokens within each corpus into two parts, positive and negative tokens, based on whether they are useful for improving model performance. Experiments on well-established benchmarks show that this forgetting mechanism not only improves overall model performance but also facilitates more diverse model responses.
arXiv Detail & Related papers (2025-08-06T11:22:23Z)
- Meta-Statistical Learning: Supervised Learning of Statistical Inference [59.463430294611626]
This work demonstrates that the tools and principles driving the success of large language models (LLMs) can be repurposed to tackle distribution-level tasks. We propose meta-statistical learning, a framework inspired by multi-instance learning that reformulates statistical inference tasks as supervised learning problems.
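As a toy illustration of that reformulation (an assumption about the framework's spirit, not the paper's code), the sketch below trains a permutation-invariant network to estimate a distribution's standard deviation directly from whole samples:

```python
import torch
import torch.nn as nn

# Per-point feature extractor plus a pooled regression head.
net = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, 64))
head = nn.Linear(64, 1)
opt = torch.optim.Adam(list(net.parameters()) + list(head.parameters()), lr=1e-3)

for step in range(2000):
    sigma = torch.rand(128, 1) * 3                       # ground-truth std per dataset
    x = torch.randn(128, 50, 1) * sigma.unsqueeze(1)     # 128 datasets of 50 points each
    # Permutation-invariant mean pooling over each dataset (multi-instance style),
    # so the statistic is predicted from the sample as a whole.
    pred = head(net(x).mean(dim=1))
    loss = nn.functional.mse_loss(pred, sigma)
    opt.zero_grad(); loss.backward(); opt.step()
```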
arXiv Detail & Related papers (2025-02-17T18:04:39Z)
- Instruction-Based Fine-tuning of Open-Source LLMs for Predicting Customer Purchase Behaviors [1.4440000007050255]
The fine-tuned Mistral Instruct model demonstrates superior predictive capabilities, significantly outperforming traditional sequential models.
arXiv Detail & Related papers (2025-01-28T11:34:22Z)
- Class-Imbalanced-Aware Adaptive Dataset Distillation for Scalable Pretrained Model on Credit Scoring [10.737033782376905]
We present a novel framework for scaling up the application of large pretrained models on financial datasets. We integrate imbalance-aware techniques during dataset distillation, resulting in improved performance on financial datasets.
arXiv Detail & Related papers (2025-01-18T06:59:36Z)
- Beyond Tree Models: A Hybrid Model of KAN and gMLP for Large-Scale Financial Tabular Data [28.34587057844627]
TKGMLP is a hybrid network for tabular data that combines shallow Kolmogorov-Arnold Networks with Gated Multilayer Perceptrons. We validate TKGMLP on a real-world credit scoring dataset, where it achieves state-of-the-art results and outperforms current benchmarks. We also propose a novel feature encoding method for numerical data, specifically designed to address the predominance of numerical features in financial datasets.
arXiv Detail & Related papers (2024-12-03T02:38:07Z)
- Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration [90.41908331897639]
Large language models (LLMs) have significantly benefited from training on diverse, high-quality task-specific data.
We present a novel approach, ReverseGen, designed to automatically generate effective training samples.
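A conceptual sketch of such a failure-inducing loop follows; `proposer_generate`, `target_answer`, and `is_correct` are hypothetical stand-ins for the proposer model, the target model, and the evaluator, since the entry gives no implementation details:

```python
def collect_failure_cases(proposer_generate, target_answer, is_correct, n_rounds=10):
    """Keep generated queries that the target model answers incorrectly;
    these become training samples that target the model's weaknesses."""
    failures = []
    for _ in range(n_rounds):
        for query in proposer_generate(batch_size=32):  # proposer LLM drafts candidates
            if not is_correct(query, target_answer(query)):
                failures.append(query)                  # target failed: keep as training data
    return failures
```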
arXiv Detail & Related papers (2024-10-22T06:43:28Z)
- Advancing Anomaly Detection: Non-Semantic Financial Data Encoding with LLMs [49.57641083688934]
We introduce a novel approach to anomaly detection in financial data using Large Language Model (LLM) embeddings.
Our experiments demonstrate that LLMs contribute valuable information to anomaly detection, with our models outperforming the baselines.
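A minimal sketch of such a pipeline, assuming a sentence-transformers encoder and scikit-learn's IsolationForest as stand-ins for the paper's embedding model and detector:

```python
from sentence_transformers import SentenceTransformer
from sklearn.ensemble import IsolationForest

# Serialize each transaction record as text and embed it with a small
# off-the-shelf encoder (the paper uses LLM embeddings; this is a stand-in).
records = [
    "2024-03-01 | GROCERY STORE #1123 | $54.20",
    "2024-03-02 | COFFEE SHOP | $4.75",
    "2024-03-02 | WIRE TRANSFER OVERSEAS | $9,800.00",
]
model = SentenceTransformer("all-MiniLM-L6-v2")
X = model.encode(records)  # (n_records, embedding_dim)

# Fit an unsupervised detector on the embeddings; predict() marks outliers as -1.
detector = IsolationForest(contamination=0.1, random_state=0).fit(X)
print(detector.predict(X))
```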
arXiv Detail & Related papers (2024-06-05T20:19:09Z)
- Towards a Foundation Purchasing Model: Pretrained Generative Autoregression on Transaction Sequences [0.0]
We present a generative pretraining method that can be used to obtain contextualised embeddings of financial transactions.
We additionally perform large-scale pretraining of an embedding model using a corpus of data from 180 issuing banks containing 5.1 billion transactions.
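A minimal sketch of that generative (next-token) pretraining objective over tokenized transaction sequences; the token scheme, dimensions, and use of a masked encoder are assumptions, not the paper's setup:

```python
import torch
import torch.nn as nn

d_model, vocab = 128, 10_000              # e.g. merchant/category/amount tokens
emb = nn.Embedding(vocab, d_model)
layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=4)
lm_head = nn.Linear(d_model, vocab)

tokens = torch.randint(0, vocab, (8, 32))                     # batch of transaction sequences
causal = nn.Transformer.generate_square_subsequent_mask(31)   # prevent attending to the future

# Predict each next transaction token from its prefix (autoregressive objective).
h = encoder(emb(tokens[:, :-1]), mask=causal)
loss = nn.functional.cross_entropy(
    lm_head(h).reshape(-1, vocab), tokens[:, 1:].reshape(-1))
loss.backward()
```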
arXiv Detail & Related papers (2024-01-03T09:32:48Z)
- Meta Knowledge Condensation for Federated Learning [65.20774786251683]
Existing federated learning paradigms usually exchange distributed models extensively at a central solver to achieve a more powerful model.
This incurs a severe communication burden between the server and multiple clients, especially when data distributions are heterogeneous.
Unlike existing paradigms, we introduce an alternative perspective that significantly decreases the communication cost in federated learning.
arXiv Detail & Related papers (2022-09-29T15:07:37Z)
- Relation-Guided Representation Learning [53.60351496449232]
We propose a new representation learning method that explicitly models and leverages sample relations.
Our framework well preserves the relations between samples.
By seeking to embed samples into a subspace, we show that our method can address the large-scale and out-of-sample problems.
arXiv Detail & Related papers (2020-07-11T10:57:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.