Multimodal Banking Dataset: Understanding Client Needs through Event
Sequences
- URL: http://arxiv.org/abs/2409.17587v1
- Date: Thu, 26 Sep 2024 07:07:08 GMT
- Title: Multimodal Banking Dataset: Understanding Client Needs through Event
Sequences
- Authors: Dzhambulat Mollaev, Alexander Kostin, Maria Postnova, Ivan Karpukhin,
Ivan A. Kireev, Gleb Gusev, Andrey Savchenko
- Abstract summary: We present the industrial-scale publicly available multimodal banking dataset, MBD, that contains more than 1.5M corporate clients.
All entries are properly anonymized from real proprietary bank data.
We provide numerical results that demonstrate the superiority of our multi-modal baselines over single-modal techniques for each task.
- Score: 41.470088044942756
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Financial organizations collect a huge amount of data about clients that
typically has a temporal (sequential) structure and is collected from various
sources (modalities). Due to privacy issues, there are no large-scale
open-source multimodal datasets of event sequences, which significantly limits
the research in this area. In this paper, we present the industrial-scale
publicly available multimodal banking dataset, MBD, that contains more than
1.5M corporate clients with several modalities: 950M bank transactions, 1B geo
position events, 5M embeddings of dialogues with technical support and monthly
aggregated purchases of four of the bank's products. All entries are properly
anonymized from real proprietary bank data. Using this dataset, we introduce a
novel benchmark with two business tasks: campaigning (purchase prediction in
the next month) and matching of clients. We provide numerical results that
demonstrate the superiority of our multi-modal baselines over single-modal
techniques for each task. As a result, the proposed dataset can open new
perspectives and facilitate the future development of practically important
large-scale multimodal algorithms for event sequences.
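To make the benchmark claim concrete, the sketch below shows one simple way a single-modal versus multi-modal comparison can be set up for the campaigning task (multi-label purchase prediction for four products): per-client modality embeddings are concatenated (late fusion) before a shared classifier. The embedding dimensions, pooling, classifier, and synthetic data are illustrative assumptions, not the paper's actual baseline pipeline.

```python
# Late-fusion sketch for the "campaigning" task (next-month purchase prediction
# for four products). All names, dimensions and the fusion scheme are
# illustrative assumptions; synthetic data stands in for MBD.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier

rng = np.random.default_rng(0)
n_clients = 1000

# Hypothetical per-client embeddings, one vector per modality
# (e.g., pooled transaction, geo and dialogue representations).
trx_emb = rng.normal(size=(n_clients, 64))
geo_emb = rng.normal(size=(n_clients, 32))
dialog_emb = rng.normal(size=(n_clients, 16))

# Binary purchase labels for four products in the next month (synthetic here).
y = rng.integers(0, 2, size=(n_clients, 4))

# Single-modal baseline: transactions only.
single = MultiOutputClassifier(LogisticRegression(max_iter=1000))
single.fit(trx_emb, y)

# Multi-modal baseline: simple late fusion by concatenating modality embeddings.
fused = np.concatenate([trx_emb, geo_emb, dialog_emb], axis=1)
multi = MultiOutputClassifier(LogisticRegression(max_iter=1000))
multi.fit(fused, y)
```

In the actual benchmark, a per-product metric (e.g., ROC AUC on a held-out month) would be compared between the single-modal and fused models.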
HuggingFace Link: https://huggingface.co/datasets/ai-lab/MBD
Github Link: https://github.com/Dzhambo/MBD
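A minimal sketch for pulling the dataset from the HuggingFace Hub is given below; the repository id comes from the link above, while the internal file layout and split names are not specified here and should be checked against the dataset card.

```python
# Sketch: download the MBD dataset repository from the HuggingFace Hub.
# The repo_id is taken from the link above; everything about the internal
# file layout is an assumption to verify against the dataset card.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="ai-lab/MBD",
    repo_type="dataset",  # dataset repository, not a model
)
print(local_dir)  # path to the locally cached data files
```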
Related papers
- BabelBench: An Omni Benchmark for Code-Driven Analysis of Multimodal and Multistructured Data [61.936320820180875]
Large language models (LLMs) have become increasingly pivotal across various domains.
BabelBench is an innovative benchmark framework that evaluates the proficiency of LLMs in managing multimodal multistructured data with code execution.
Our experimental findings on BabelBench indicate that even cutting-edge models like ChatGPT 4 exhibit substantial room for improvement.
arXiv Detail & Related papers (2024-10-01T15:11:24Z)
- InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning [58.7966588457529]
InfiMM-WebMath-40B is a high-quality dataset of interleaved image-text documents.
It comprises 24 million web pages, 85 million associated image URLs, and 40 billion text tokens, all meticulously extracted and filtered from CommonCrawl.
Our evaluations on text-only benchmarks show that, despite utilizing only 40 billion tokens, our dataset significantly enhances the performance of our 1.3B model.
Our models set a new state-of-the-art among open-source models on multi-modal math benchmarks such as MathVerse and We-Math.
arXiv Detail & Related papers (2024-09-19T08:41:21Z)
- Adapting Segment Anything Model to Multi-modal Salient Object Detection with Semantic Feature Fusion Guidance [15.435695491233982]
We propose a novel framework to explore and exploit the powerful feature representation and zero-shot generalization ability of the Segment Anything Model (SAM) for multi-modal salient object detection (SOD).
We develop SAM with semantic feature fusion guidance (Sammese).
In the image encoder, a multi-modal adapter is proposed to adapt the single-modal SAM to multi-modal information. Specifically, in the mask decoder, a semantic-geometric
arXiv Detail & Related papers (2024-08-27T13:47:31Z)
- MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens [113.9621845919304]
We release MINT-1T, the most extensive and diverse open-source Multimodal INTerleaved dataset to date.
MINT-1T comprises one trillion text tokens and 3.4 billion images, a 10x scale-up from existing open-source datasets.
Our experiments show that LMMs trained on MINT-1T rival the performance of models trained on the previous leading dataset, OBELICS.
arXiv Detail & Related papers (2024-06-17T07:21:36Z)
- LaDe: The First Comprehensive Last-mile Delivery Dataset from Industry [44.573471568516915]
LaDe is the first publicly available last-mile delivery dataset with millions of packages from the industry.
It involves 10,677k packages of 21k couriers over 6 months of real-world operation.
LaDe has three unique characteristics: it is large-scale, comprehensive, and diverse.
arXiv Detail & Related papers (2023-06-19T02:30:28Z)
- MMSum: A Dataset for Multimodal Summarization and Thumbnail Generation of Videos [106.06278332186106]
Multimodal summarization with multimodal output (MSMO) has emerged as a promising research direction.
Numerous limitations exist within existing public MSMO datasets.
We have meticulously curated the MMSum dataset.
arXiv Detail & Related papers (2023-06-07T07:43:11Z)
- M5Product: A Multi-modal Pretraining Benchmark for E-commercial Product Downstream Tasks [94.80043324367858]
We contribute a large-scale dataset, named M5Product, which consists of over 6 million multimodal pairs.
M5Product contains rich information of multiple modalities including image, text, table, video and audio.
arXiv Detail & Related papers (2021-09-09T13:50:22Z)