Multimodal Temporal Fusion Transformers Are Good Product Demand
Forecasters
- URL: http://arxiv.org/abs/2307.02578v1
- Date: Wed, 5 Jul 2023 18:23:13 GMT
- Title: Multimodal Temporal Fusion Transformers Are Good Product Demand
Forecasters
- Authors: Maarten Sukel, Stevan Rudinac, Marcel Worring
- Abstract summary: Multimodal demand forecasting aims at predicting product demand utilizing visual, textual, and contextual information.
This paper proposes a method for multimodal product demand forecasting using convolutional, graph-based, and transformer-based architectures.
- Score: 18.52252059555198
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multimodal demand forecasting aims at predicting product demand utilizing
visual, textual, and contextual information. This paper proposes a method for
multimodal product demand forecasting using convolutional, graph-based, and
transformer-based architectures. Traditional approaches to demand forecasting
rely on historical demand, product categories, and additional contextual
information such as seasonality and events. However, these approaches have
several shortcomings, such as the cold-start problem, which makes it difficult
to predict demand for a product until sufficient historical data is available,
and their inability to properly deal with category
dynamics. By incorporating multimodal information, such as product images and
textual descriptions, our architecture aims to address the shortcomings of
traditional approaches and outperform them. The experiments conducted on a
large real-world dataset show that the proposed approach effectively predicts
demand for a wide range of products. The multimodal pipeline presented in this
work enhances the accuracy and reliability of the predictions, demonstrating
the potential of leveraging multimodal information in product demand
forecasting.
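As a rough illustration of the kind of late-fusion pipeline the abstract describes, the sketch below concatenates historical demand with image and text embeddings plus contextual flags before a single readout. All names, dimensions, and the linear head are hypothetical stand-ins, not the paper's actual convolutional/graph/transformer architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def fuse_modalities(demand_history, image_emb, text_emb, context):
    """Concatenate per-product multimodal features into one vector.

    demand_history: (T,) past demand values; image_emb / text_emb:
    fixed-size embeddings assumed to come from pretrained encoders;
    context: contextual flags such as seasonality or events.
    """
    return np.concatenate([demand_history, image_emb, text_emb, context])

def forecast(fused, weights, bias):
    # A single linear readout as a stand-in for the transformer head.
    return float(fused @ weights + bias)

# Hypothetical product: 8 weeks of history, small toy embeddings.
history = rng.random(8)
img, txt = rng.random(4), rng.random(4)
ctx = np.array([1.0, 0.0])  # e.g. [holiday_week, promo_active]

fused = fuse_modalities(history, img, txt, ctx)
w = rng.random(fused.shape[0]) / fused.shape[0]
pred = forecast(fused, w, 0.1)
print(fused.shape, pred)
```

The point of the fused vector is that a new product with no sales history still contributes image, text, and context features, which is how multimodal inputs address the cold-start shortcoming described above.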
Related papers
- LLMForecaster: Improving Seasonal Event Forecasts with Unstructured Textual Data [63.777637042161544]
This paper introduces a novel forecast post-processor that fine-tunes large language models to incorporate unstructured semantic and contextual information and historical data.
In an industry-scale retail application, we demonstrate that our technique yields statistically significant forecast improvements across several sets of products subject to holiday-driven demand surges.
arXiv Detail & Related papers (2024-12-03T16:18:42Z)
- Context Matters: Leveraging Contextual Features for Time Series Forecasting [2.9687381456164004]
We introduce ContextFormer, a novel plug-and-play method to surgically integrate multimodal contextual information into existing forecasting models.
ContextFormer effectively distills forecast-specific information from rich multimodal contexts, including categorical, continuous, time-varying, and even textual information.
It outperforms SOTA forecasting models by up to 30% on a range of real-world datasets spanning energy, traffic, environmental, and financial domains.
arXiv Detail & Related papers (2024-10-16T15:36:13Z)
- Inter-Series Transformer: Attending to Products in Time Series Forecasting [5.459207333107234]
We develop a new Transformer-based forecasting approach using a shared, multi-task per-time series network.
We provide a case study applying our approach to successfully improve demand prediction for a medical device manufacturing company.
arXiv Detail & Related papers (2024-08-07T16:22:21Z)
- Data-Juicer Sandbox: A Feedback-Driven Suite for Multimodal Data-Model Co-development [67.55944651679864]
We present a new sandbox suite tailored for integrated data-model co-development.
This sandbox provides a feedback-driven experimental platform, enabling cost-effective and guided refinement of both data and models.
arXiv Detail & Related papers (2024-07-16T14:40:07Z)
- F-FOMAML: GNN-Enhanced Meta-Learning for Peak Period Demand Forecasting with Proxy Data [65.6499834212641]
We formulate the demand prediction as a meta-learning problem and develop the Feature-based First-Order Model-Agnostic Meta-Learning (F-FOMAML) algorithm.
By considering domain similarities through task-specific metadata, our model improves generalization: the excess risk decreases as the number of training tasks increases.
Compared to existing state-of-the-art models, our method demonstrates a notable improvement in demand prediction accuracy, reducing the Mean Absolute Error by 26.24% on an internal vending machine dataset and by 1.04% on the publicly accessible JD.com dataset.
arXiv Detail & Related papers (2024-06-23T21:28:50Z)
- Towards Unified Multi-Modal Personalization: Large Vision-Language Models for Generative Recommendation and Beyond [87.1712108247199]
Our goal is to establish a Unified paradigm for Multi-modal Personalization systems (UniMP).
We develop a generic and personalized generative framework that can handle a wide range of personalization needs.
Our methodology enhances the capabilities of foundational language models for personalized tasks.
arXiv Detail & Related papers (2024-03-15T20:21:31Z)
- Incorporating Pre-trained Model Prompting in Multimodal Stock Volume Movement Prediction [22.949484374773967]
We propose the Prompt-based MUltimodal Stock volumE prediction model (ProMUSE) to process text and time series modalities.
We use pre-trained language models for better comprehension of financial news.
We also propose a novel cross-modality contrastive alignment, retaining the unimodal heads alongside the fusion head, to mitigate the loss of modality-specific information during fusion.
arXiv Detail & Related papers (2023-09-11T16:47:01Z)
- Deep Learning based Forecasting: a case study from the online fashion industry [7.694480564850072]
We describe the data and our modelling approach for this forecasting problem in detail and present empirical results.
arXiv Detail & Related papers (2023-05-23T13:30:35Z)
- Multimodal Neural Network For Demand Forecasting [0.8602553195689513]
We propose a multi-modal sales forecasting network that combines real-life events from news articles with traditional data such as historical sales and holiday information.
We show statistically significant improvements in the SMAPE error metric with an average improvement of 7.37% against the existing state-of-the-art sales forecasting techniques.
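The SMAPE metric cited above can be computed as follows; this uses the common definition with the mean of the absolute actual and forecast values in the denominator, though conventions vary across papers.

```python
import numpy as np

def smape(actual, forecast):
    """Symmetric Mean Absolute Percentage Error, in percent.

    Assumes at least one of actual/forecast is nonzero at each step;
    the all-zero case would divide by zero under this convention.
    """
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    denom = (np.abs(actual) + np.abs(forecast)) / 2.0
    return 100.0 * np.mean(np.abs(forecast - actual) / denom)

# Toy sales series: forecasts off by 10 units each week.
print(smape([100, 200, 300], [110, 190, 310]))
```

A perfect forecast gives 0%, and the metric is bounded above by 200%, which is why it is popular for demand series whose scale varies widely across products.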
arXiv Detail & Related papers (2022-10-20T18:06:36Z)
- Product1M: Towards Weakly Supervised Instance-Level Product Retrieval via Cross-modal Pretraining [108.86502855439774]
We investigate a more realistic setting that aims to perform weakly-supervised multi-modal instance-level product retrieval.
We contribute Product1M, one of the largest multi-modal cosmetic datasets for real-world instance-level retrieval.
We propose a novel model named Cross-modal contrAstive Product Transformer for instance-level prodUct REtrieval (CAPTURE).
arXiv Detail & Related papers (2021-07-30T12:11:24Z)
- Pre-training Graph Transformer with Multimodal Side Information for Recommendation [82.4194024706817]
We propose a pre-training strategy to learn item representations by considering both item side information and their relationships.
We develop a novel sampling algorithm named MCNSampling to select contextual neighbors for each item.
The proposed Pre-trained Multimodal Graph Transformer (PMGT) learns item representations with two objectives: 1) graph structure reconstruction, and 2) masked node feature reconstruction.
arXiv Detail & Related papers (2020-10-23T10:30:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.