Approaching sales forecasting using recurrent neural networks and
transformers
- URL: http://arxiv.org/abs/2204.07786v1
- Date: Sat, 16 Apr 2022 12:03:52 GMT
- Title: Approaching sales forecasting using recurrent neural networks and
transformers
- Authors: Iván Vallés-Pérez, Emilio Soria-Olivas, Marcelino
Martínez-Sober, Antonio J. Serrano-López, Juan Gómez-Sanchís,
Fernando Mateo
- Abstract summary: We develop three alternatives for forecasting customer sales at day/store/item level using deep learning techniques.
Our empirical results show that good performance can be achieved with a simple sequence-to-sequence architecture and minimal data preprocessing effort.
The proposed solution achieves an RMSLE of around 0.54, which is competitive with other, more specialized solutions proposed in the Kaggle competition.
- Score: 57.43518732385863
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Accurate and fast demand forecasting is one of the hot topics in supply
chain management, as it enables precise execution of the corresponding downstream
processes (inbound and outbound planning, inventory placement, network planning, etc.). We
develop three alternatives to tackle the problem of forecasting the customer
sales at day/store/item level using deep learning techniques and the
Corporación Favorita data set, published as part of a Kaggle competition. Our
empirical results show how good performance can be achieved by using a simple
sequence to sequence architecture with minimal data preprocessing effort.
Additionally, we describe a training trick for making the model more time
independent and hence improving generalization over time. The proposed solution
achieves an RMSLE of around 0.54, which is competitive with other more specific
solutions to the problem proposed in the Kaggle competition.
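For reference, the RMSLE metric reported above can be computed as in the minimal sketch below; the sample sales figures are illustrative only, not taken from the paper:

```python
import math

def rmsle(y_true, y_pred):
    """Root Mean Squared Logarithmic Error.

    Penalizes relative rather than absolute errors; using log1p keeps
    zero-sales days well defined.
    """
    squared_log_errors = [
        (math.log1p(t) - math.log1p(p)) ** 2
        for t, p in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))

# Hypothetical daily unit sales vs. forecasts
print(round(rmsle([3, 0, 10], [2.5, 0.0, 12.0]), 4))  # → 0.1235
```

Because the error is taken in log space, over-forecasting 12 units against 10 costs about the same as over-forecasting 120 against 100, which suits sales series whose volumes vary widely across items and stores.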
Related papers
- A Scalable Pretraining Framework for Link Prediction with Efficient Adaptation [16.82426251068573]
Link Prediction (LP) is a critical task in graph machine learning. Existing methods face key challenges, including limited supervision from sparse connectivity. We explore pretraining as a solution to address these challenges.
arXiv Detail & Related papers (2025-08-06T17:10:31Z) - TAT: Temporal-Aligned Transformer for Multi-Horizon Peak Demand Forecasting [51.37167759339485]
We propose Temporal-Aligned Transformer (TAT), a multi-horizon forecaster leveraging a priori-known context variables to improve predictive performance. Our model consists of an encoder and decoder, both embedded with a novel Temporal Alignment Attention (TAA) designed to learn context-dependent alignment for peak demand forecasting. We demonstrate that TAT improves peak demand forecasting accuracy by up to 30% while maintaining competitive overall performance compared to other state-of-the-art methods.
arXiv Detail & Related papers (2025-07-14T14:51:24Z) - Comparative Analysis of Modern Machine Learning Models for Retail Sales Forecasting [0.0]
When forecasts underestimate the level of sales, firms experience lost sales, shortages, and damage to the retailer's reputation in its market. This study provides an exhaustive assessment of forecasting models applied to a high-resolution brick-and-mortar retail dataset.
arXiv Detail & Related papers (2025-06-06T10:08:17Z) - Intelligent Routing for Sparse Demand Forecasting: A Comparative Evaluation of Selection Strategies [0.6798775532273751]
Sparse and intermittent demand forecasting in supply chains presents a critical challenge. We propose a routing framework that selects the most suitable forecasting model, spanning classical, ML, and DL methods, for each product. Experiments on the large-scale Favorita dataset show our deep learning (Inception Time) router improves forecasting accuracy by up to 11.8%.
arXiv Detail & Related papers (2025-06-04T03:09:45Z) - RIFLES: Resource-effIcient Federated LEarning via Scheduling [4.358456799125694]
Federated Learning (FL) is a privacy-preserving machine learning technique that allows decentralized collaborative model training across a set of distributed clients. Current selection strategies are myopic in nature in that they are based on past or current interactions. RIFLES builds a novel availability-forecasting layer to support the client selection process.
arXiv Detail & Related papers (2025-05-19T14:26:33Z) - Neural Conformal Control for Time Series Forecasting [54.96087475179419]
We introduce a neural network conformal prediction method for time series that enhances adaptivity in non-stationary environments.
Our approach acts as a neural controller designed to achieve desired target coverage, leveraging auxiliary multi-view data with neural network encoders.
We empirically demonstrate significant improvements in coverage and probabilistic accuracy, and find that our method is the only one that combines good calibration with consistency in prediction intervals.
arXiv Detail & Related papers (2024-12-24T03:56:25Z) - Inter-Series Transformer: Attending to Products in Time Series Forecasting [5.459207333107234]
We develop a new Transformer-based forecasting approach using a shared, multi-task, per-time-series network.
We provide a case study applying our approach to successfully improve demand prediction for a medical device manufacturing company.
arXiv Detail & Related papers (2024-08-07T16:22:21Z) - F-FOMAML: GNN-Enhanced Meta-Learning for Peak Period Demand Forecasting with Proxy Data [65.6499834212641]
We formulate the demand prediction as a meta-learning problem and develop the Feature-based First-Order Model-Agnostic Meta-Learning (F-FOMAML) algorithm.
By considering domain similarities through task-specific metadata, our model improves generalization: the excess risk decreases as the number of training tasks increases.
Compared to existing state-of-the-art models, our method demonstrates a notable improvement in demand prediction accuracy, reducing the Mean Absolute Error by 26.24% on an internal vending machine dataset and by 1.04% on the publicly accessible JD.com dataset.
arXiv Detail & Related papers (2024-06-23T21:28:50Z) - Split-Boost Neural Networks [1.1549572298362787]
We propose an innovative training strategy for feed-forward architectures, called split-boost.
Such a novel approach ultimately allows us to avoid explicitly modeling the regularization term.
The proposed strategy is tested on a real-world (anonymized) dataset within a benchmark medical insurance design problem.
arXiv Detail & Related papers (2023-09-06T17:08:57Z) - Towards Accelerated Model Training via Bayesian Data Selection [45.62338106716745]
Recent work has proposed a more reasonable data selection principle that examines the data's impact on the model's generalization loss.
This work makes that principle practical by leveraging a lightweight Bayesian treatment and incorporating off-the-shelf zero-shot predictors built on large-scale pre-trained models.
arXiv Detail & Related papers (2023-08-21T07:58:15Z) - Adaptive Siamese Tracking with a Compact Latent Network [219.38172719948048]
We present an intuitive view that simplifies Siamese-based trackers by converting the tracking task into a classification task.
Under this view, we perform an in-depth analysis of them through visual simulations and real tracking examples.
We apply it to adjust three classical Siamese-based trackers, namely SiamRPN++, SiamFC, and SiamBAN.
arXiv Detail & Related papers (2023-02-02T08:06:02Z) - Estimating Task Completion Times for Network Rollouts using Statistical
Models within Partitioning-based Regression Methods [0.01841601464419306]
This paper proposes a data and Machine Learning-based forecasting solution for the Telecommunications network-rollout planning problem.
Using historical data of milestone completion times, a model needs to incorporate domain knowledge, handle noise and yet be interpretable to project managers.
This paper proposes partition-based regression models that incorporate data-driven statistical models within each partition, as a solution to the problem.
arXiv Detail & Related papers (2022-11-20T04:28:12Z) - Augmented Bilinear Network for Incremental Multi-Stock Time-Series
Classification [83.23129279407271]
We propose a method to efficiently retain the knowledge available in a neural network pre-trained on a set of securities.
In our method, the prior knowledge encoded in a pre-trained neural network is maintained by keeping existing connections fixed.
This knowledge is adjusted for the new securities by a set of augmented connections, which are optimized using the new data.
arXiv Detail & Related papers (2022-07-23T18:54:10Z) - Beyond Transfer Learning: Co-finetuning for Action Localisation [64.07196901012153]
We propose co-finetuning -- simultaneously training a single model on multiple "upstream" and "downstream" tasks.
We demonstrate that co-finetuning outperforms traditional transfer learning when using the same total amount of data.
We also show how we can easily extend our approach to multiple "upstream" datasets to further improve performance.
arXiv Detail & Related papers (2022-07-08T10:25:47Z) - DANCE: DAta-Network Co-optimization for Efficient Segmentation Model
Training and Inference [85.02494022662505]
DANCE is an automated simultaneous data-network co-optimization for efficient segmentation model training and inference.
It integrates automated data slimming which adaptively downsamples/drops input images and controls their corresponding contribution to the training loss guided by the images' spatial complexity.
Experiments and ablation studies demonstrate that DANCE can achieve "all-win" towards efficient segmentation.
arXiv Detail & Related papers (2021-07-16T04:58:58Z) - A machine learning approach for forecasting hierarchical time series [4.157415305926584]
We propose a machine learning approach for forecasting hierarchical time series.
Forecast reconciliation is the process of adjusting forecasts to make them coherent across the hierarchy.
We exploit the ability of a deep neural network to extract information capturing the structure of the hierarchy.
arXiv Detail & Related papers (2020-05-31T22:26:16Z)
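The forecast reconciliation described in the last entry can be illustrated with a minimal bottom-up sketch; the store names and numbers below are hypothetical, and real reconciliation methods (e.g. trace minimization) are more involved:

```python
def bottom_up(bottom_forecasts):
    """Reconcile a two-level hierarchy (total -> stores) bottom-up.

    The aggregate forecast is replaced by the sum of its children's
    forecasts, so the hierarchy is coherent by construction.
    """
    total = sum(bottom_forecasts.values())
    return {"total": total, **bottom_forecasts}

# Hypothetical per-store sales forecasts
coherent = bottom_up({"store_a": 120.0, "store_b": 80.0, "store_c": 50.0})
print(coherent["total"])  # → 250.0
```

Bottom-up is the simplest reconciliation scheme; top-down and optimal-combination approaches instead distribute or re-weight forecasts across levels, but all share the goal stated above: forecasts that sum consistently across the hierarchy.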
This list is automatically generated from the titles and abstracts of the papers listed on this site.