Transformer Choice Net: A Transformer Neural Network for Choice
Prediction
- URL: http://arxiv.org/abs/2310.08716v1
- Date: Thu, 12 Oct 2023 20:54:10 GMT
- Title: Transformer Choice Net: A Transformer Neural Network for Choice
Prediction
- Authors: Hanzhao Wang, Xiaocheng Li, Kalyan Talluri
- Abstract summary: We develop a neural network architecture, the Transformer Choice Net, that is suitable for predicting multiple choices.
Transformer networks turn out to be especially suitable for this task as they take into account not only the features of the customer and the items but also the context.
Our architecture shows uniformly superior out-of-sample prediction performance compared to the leading models in the literature.
- Score: 6.6543199581017625
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Discrete-choice models, such as Multinomial Logit, Probit, or Mixed-Logit,
are widely used in Marketing, Economics, and Operations Research: given a set
of alternatives, the customer is modeled as choosing one of the alternatives to
maximize a (latent) utility function. However, extending such models to
situations where the customer chooses more than one item (such as in e-commerce
shopping) has proven problematic. While one can construct reasonable models of
the customer's behavior, estimating such models becomes very challenging
because of the combinatorial explosion in the number of possible subsets of
items. In this paper we develop a transformer neural network architecture, the
Transformer Choice Net, that is suitable for predicting multiple choices.
Transformer networks turn out to be especially suitable for this task as they
take into account not only the features of the customer and the items but also
the context, which in this case could be the assortment as well as the
customer's past choices. On a range of benchmark datasets, our architecture
shows uniformly superior out-of-sample prediction performance compared to the
leading models in the literature, without requiring any custom modeling or
tuning for each instance.
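To make the modeling idea concrete, here is a minimal sketch, assuming a setup in the spirit of the abstract: each item in the offered assortment is embedded as a token, self-attention lets every item's utility depend on the rest of the assortment (customer features or past choices could enter as additional tokens), and a softmax over the assortment yields choice probabilities. This is not the authors' implementation; all names and hyperparameters (ItemChoiceScorer, d_model=64, and so on) are illustrative.

```python
import torch
import torch.nn as nn

class ItemChoiceScorer(nn.Module):
    """Illustrative sketch: per-item utilities that depend on the whole assortment."""

    def __init__(self, n_features: int, d_model: int = 64, n_heads: int = 4, n_layers: int = 2):
        super().__init__()
        self.embed = nn.Linear(n_features, d_model)  # item feature vector -> token
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, 1)  # contextual utility per item

    def forward(self, items: torch.Tensor, pad_mask: torch.Tensor) -> torch.Tensor:
        # items: (batch, assortment_size, n_features); pad_mask: True at padded slots
        h = self.encoder(self.embed(items), src_key_padding_mask=pad_mask)
        utilities = self.head(h).squeeze(-1)  # (batch, assortment_size)
        utilities = utilities.masked_fill(pad_mask, float("-inf"))  # padding is never chosen
        # Softmax over the assortment, as in MNL, but each utility already
        # reflects the surrounding items (the "context").
        return torch.softmax(utilities, dim=-1)

# Toy usage: 8 customers, assortments of 5 items with 10 features each.
model = ItemChoiceScorer(n_features=10)
items = torch.randn(8, 5, 10)
pad_mask = torch.zeros(8, 5, dtype=torch.bool)  # no padding in this toy batch
probs = model(items, pad_mask)  # each row sums to 1 over the assortment
```

Without the encoder, each utility would depend only on the item's own features, which is exactly the MNL-style context independence that the transformer relaxes.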
Related papers
- B-cosification: Transforming Deep Neural Networks to be Inherently Interpretable [53.848005910548565]
'B-cosification' is a novel approach for transforming existing pre-trained models into inherently interpretable ones.
We find that B-cosification can yield models that are on par with B-cos models trained from scratch in terms of interpretability.
arXiv Detail & Related papers (2024-11-01T16:28:11Z)
- Budgeted Online Model Selection and Fine-Tuning via Federated Learning [26.823435733330705]
Online model selection involves selecting a model from a set of candidate models 'on the fly' to perform prediction on a stream of data.
The choice of candidate models hence has a crucial impact on performance.
The present paper proposes an online federated model selection framework where a group of learners (clients) interacts with a server with sufficient memory.
Using the proposed algorithm, clients and the server collaborate to fine-tune models to adapt them to a non-stationary environment.
arXiv Detail & Related papers (2024-01-19T04:02:49Z)
- A Neural Network Based Choice Model for Assortment Optimization [5.173001988341294]
We investigate whether a single neural network architecture can predict purchase probabilities for datasets from various contexts.
Next, we develop an assortment optimization formulation that is solvable by off-the-shelf integer programming solvers.
arXiv Detail & Related papers (2023-08-10T15:01:52Z)
- Fast Adaptation with Bradley-Terry Preference Models in Text-To-Image Classification and Generation [0.0]
We leverage the Bradley-Terry preference model to develop a fast adaptation method that efficiently fine-tunes the original model (see the Bradley-Terry sketch after this list).
Extensive evidence of the capabilities of this framework is provided through experiments in different domains related to multimodal text and image understanding.
arXiv Detail & Related papers (2023-07-15T07:53:12Z)
- Exploring and Evaluating Personalized Models for Code Generation [9.25440316608194]
We evaluate transformer model fine-tuning for personalization.
We consider three key approaches, including (i) custom fine-tuning, which allows all the model parameters to be tuned.
We compare these fine-tuning strategies for code generation and discuss the potential generalization and cost benefits of each in various deployment scenarios.
arXiv Detail & Related papers (2022-08-29T23:28:46Z)
- A Statistical-Modelling Approach to Feedforward Neural Network Model Selection [0.8287206589886881]
Feedforward neural networks (FNNs) can be viewed as non-linear regression models.
A novel model selection method is proposed using the Bayesian information criterion (BIC) for FNNs.
Selecting models by BIC rather than by out-of-sample performance increases the probability of recovering the true model (see the BIC illustration after this list).
arXiv Detail & Related papers (2022-07-09T11:07:04Z)
- Slimmable Domain Adaptation [112.19652651687402]
We introduce a simple framework, Slimmable Domain Adaptation, to improve cross-domain generalization with a weight-sharing model bank.
Our framework surpasses other competing approaches by a very large margin on multiple benchmarks.
arXiv Detail & Related papers (2022-06-14T06:28:04Z)
- MetaQA: Combining Expert Agents for Multi-Skill Question Answering [49.35261724460689]
We argue that despite the promising results of multi-dataset models, some domains or QA formats might require specific architectures.
We propose to combine expert agents with a novel, flexible, and training-efficient architecture that considers questions, answer predictions, and answer-prediction confidence scores.
arXiv Detail & Related papers (2021-12-03T14:05:52Z)
- PreSizE: Predicting Size in E-Commerce using Transformers [76.33790223551074]
PreSizE is a novel deep learning framework which utilizes Transformers for accurate size prediction.
We demonstrate that PreSizE is capable of achieving superior prediction performance compared to previous state-of-the-art baselines.
As a proof of concept, we demonstrate that size predictions made by PreSizE can be effectively integrated into an existing production recommender system.
arXiv Detail & Related papers (2021-05-04T15:23:59Z)
- A linearized framework and a new benchmark for model selection for fine-tuning [112.20527122513668]
Fine-tuning from a collection of models pre-trained on different domains is emerging as a technique to improve test accuracy in the low-data regime.
We introduce two new baselines for model selection -- Label-Gradient and Label-Feature Correlation.
Our benchmark highlights the accuracy gains of selecting from a model zoo over fine-tuning ImageNet models.
arXiv Detail & Related papers (2021-01-29T21:57:15Z)
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
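For reference, the Bradley-Terry preference model mentioned in the fast-adaptation entry above is a logistic function of a latent score difference. A minimal sketch; the function name and example scores are illustrative:

```python
import math

def bradley_terry_prob(score_i: float, score_j: float) -> float:
    """P(item i is preferred over item j) = sigmoid(s_i - s_j) under Bradley-Terry."""
    return 1.0 / (1.0 + math.exp(score_j - score_i))

# With latent scores 2.0 vs. 1.0, item i is preferred about 73% of the time.
print(bradley_terry_prob(2.0, 1.0))  # ~0.731
```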
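Likewise, the BIC used for feedforward-network model selection above is the standard penalized-likelihood criterion. A minimal illustration with made-up numbers:

```python
import math

def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion: k * ln(n) - 2 * ln(L); lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

# A small network with a slightly worse fit beats a larger one on BIC (n = 1000):
print(bic(-520.0, n_params=25, n_obs=1000))  # ~1212.7
print(bic(-515.0, n_params=80, n_obs=1000))  # ~1582.6
```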
This list is automatically generated from the titles and abstracts of the papers on this site.