Related papers: TTRS: Tinkoff Transactions Recommender System benchmark

URL: http://arxiv.org/abs/2110.05589v1
Date: Mon, 11 Oct 2021 20:04:07 GMT
Title: TTRS: Tinkoff Transactions Recommender System benchmark
Authors: Sergey Kolesnikov, Oleg Lashinin, Michail Pechatov, Alexander Kosov
Abstract summary: We present the TTRS - Tinkoff Transactions Recommender System benchmark. This financial transaction benchmark contains over 2 million interactions between almost 10,000 users and more than 1,000 merchant brands over 14 months. We also present a comprehensive comparison of the current popular RecSys methods on the next-period recommendation task and conduct a detailed analysis of their performance against various metrics and recommendation goals.
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Over the past decade, tremendous progress has been made in inventing new RecSys methods. However, one of the fundamental problems of the RecSys research community remains the lack of applied datasets and benchmarks with well-defined evaluation rules and metrics to test these novel approaches. In this article, we present the TTRS - Tinkoff Transactions Recommender System benchmark. This financial transaction benchmark contains over 2 million interactions between almost 10,000 users and more than 1,000 merchant brands over 14 months. To the best of our knowledge, this is the first publicly available financial transactions dataset. To make it more suitable for possible applications, we provide a complete description of the data collection pipeline, its preprocessing, and the resulting dataset statistics. We also present a comprehensive comparison of the current popular RecSys methods on the next-period recommendation task and conduct a detailed analysis of their performance against various metrics and recommendation goals. Last but not least, we also introduce Personalized Item-Frequencies-based Model (Re)Ranker - PIFMR, a simple yet powerful approach that has proven to be the most effective for the benchmarked tasks.

Related papers

LLM-based IR-system for Bank Supervisors [0.0]
This paper introduces a novel Information Retrieval (IR) System tailored to assist supervisors in drafting both consistent and effective measures. It ingests findings from on-site investigations and retrieves the most relevant historical findings and their associated measures from a comprehensive database. The final model achieves a Mean Average Precision (MAP@100) of 0.83 and a Mean Reciprocal Rank (MRR@100) of 0.92.
arXiv Detail & Related papers (2025-08-04T23:02:01Z)
Benchmarking LLMs in Recommendation Tasks: A Comparative Evaluation with Conventional Recommenders [27.273217543282215]
We introduce RecBench, which evaluates two primary recommendation tasks, i.e., click-through rate prediction (CTR) and sequential recommendation (SeqRec). Our experiments cover up to 17 large models and are conducted across five diverse datasets from fashion, news, video, books, and music domains. Our findings indicate that LLM-based recommenders outperform conventional recommenders, achieving up to a 5% AUC improvement in the CTR scenario and up to a 170% NDCG@10 improvement in the SeqRec scenario.
arXiv Detail & Related papers (2025-03-07T15:05:23Z)
Scenario-Wise Rec: A Multi-Scenario Recommendation Benchmark [54.93461228053298]
We introduce our benchmark, Scenario-Wise Rec, which comprises 6 public datasets and 12 benchmark models, along with a training and evaluation pipeline. We aim for this benchmark to offer researchers valuable insights from prior work, enabling the development of novel models.
arXiv Detail & Related papers (2024-12-23T08:15:34Z)
Revisiting BPR: A Replicability Study of a Common Recommender System Baseline [78.00363373925758]
We study the features of the BPR model, indicating their impact on its performance, and investigate open-source BPR implementations. Our analysis reveals inconsistencies between these implementations and the original BPR paper, leading to a significant decrease in performance of up to 50% for specific implementations. We show that the BPR model can achieve performance levels close to state-of-the-art methods on the top-n recommendation tasks and even outperform them on specific datasets.
arXiv Detail & Related papers (2024-09-21T18:39:53Z)
Cost-Effective Proxy Reward Model Construction with On-Policy and Active Learning [70.22819290458581]
Reinforcement learning with human feedback (RLHF) is a widely adopted approach in current large language model pipelines. Our approach introduces two key innovations: (1) on-policy query to avoid OOD and imbalance issues in seed data, and (2) active learning to select the most informative data for preference queries.
arXiv Detail & Related papers (2024-07-02T10:09:19Z)
Multi-Modal Recommendation Unlearning [10.335361310419826]
This paper introduces MMRecUN, a new framework for multi-modal recommendation unlearning. Given the trained recommendation model and marked forget data, we devise a Reverse Bayesian Personalized Ranking (BPR) objective to force the model to forget it. MMRecUN achieves recall performance improvements of up to 49.85% compared to the baseline methods.
arXiv Detail & Related papers (2024-05-24T08:11:59Z)
Maximizing Success Rate of Payment Routing using Non-stationary Bandits [5.781861264333114]
We propose a Ray-based implementation for optimally scaling bandit-based payment routing to over 10,000 transactions per second. We conducted live experiments on the payment transaction system of the fantasy sports platform Dream11. Our non-stationary bandit-based algorithm consistently improves the success rate of transactions by 0.92% compared to the traditional rule-based methods over one month.
arXiv Detail & Related papers (2023-08-02T09:23:16Z)
Sequential Recommendation Model for Next Purchase Prediction [2.8944480776764308]
We demonstrate and rank the effectiveness of a sequential recommendation system by utilizing a production dataset of over 2.7 million credit card transactions. We also discuss implications for embedding real-time predictions using the sequential RS into Nexus, a scalable, low-latency, event-based digital experience architecture.
arXiv Detail & Related papers (2022-07-06T17:42:58Z)
What are the best systems? New perspectives on NLP Benchmarking [10.27421161397197]
We propose a new procedure to rank systems based on their performance across different tasks. Motivated by the social choice theory, the final system ordering is obtained through aggregating the rankings induced by each task. We show that our method yields different conclusions on state-of-the-art systems than the mean-aggregation procedure.
arXiv Detail & Related papers (2022-02-08T11:44:20Z)
An Informative Tracking Benchmark [133.0931262969931]
We develop a small and informative tracking benchmark (ITB) with 7% out of 1.2 M frames of existing and newly collected datasets. We select the most informative sequences from existing benchmarks taking into account 1) challenging level, 2) discriminative strength, and 3) density of appearance variations. By analyzing the results of 15 state-of-the-art trackers re-trained on the same data, we determine the effective methods for robust tracking under each scenario.
arXiv Detail & Related papers (2021-12-13T07:56:16Z)
Sample-Rank: Weak Multi-Objective Recommendations Using Rejection Sampling [0.5156484100374059]
We introduce a method involving multi-goal sampling followed by ranking for user-relevance (Sample-Rank) to nudge recommendations towards multi-objective goals of the marketplace. The proposed method's novelty is that it reduces the MO recommendation problem to sampling from a desired multi-goal distribution then using it to build a production-friendly learning-to-rank model.
arXiv Detail & Related papers (2020-08-24T09:17:18Z)
Modeling Personalized Item Frequency Information for Next-basket Recommendation [63.94555438898309]
Next-basket recommendation (NBR) is prevalent in e-commerce and retail industry. We argue that existing RNNs cannot directly capture item frequency information in the recommendation scenario. We propose a simple item frequency based k-nearest neighbors (kNN) method to directly utilize these critical signals.
arXiv Detail & Related papers (2020-05-31T16:42:39Z)
Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches is to update the prototype of each class with the mean of the most confident query examples. We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries. We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)

This list is automatically generated from the titles and abstracts of the papers on this site.