MQRetNN: Multi-Horizon Time Series Forecasting with Retrieval Augmentation
- URL: http://arxiv.org/abs/2207.10517v1
- Date: Thu, 21 Jul 2022 14:51:58 GMT
- Title: MQRetNN: Multi-Horizon Time Series Forecasting with Retrieval Augmentation
- Authors: Sitan Yang and Carson Eisenach and Dhruv Madeka
- Abstract summary: Multi-horizon probabilistic time series forecasting has wide applicability to real-world tasks such as demand forecasting.
Recent work in neural time-series forecasting mainly focuses on the use of Seq2Seq architectures.
We consider incorporating cross-entity information to enhance model performance by adding a cross-entity attention mechanism along with a retrieval mechanism to select which entities to attend over.
- Score: 1.8692254863855964
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-horizon probabilistic time series forecasting has wide applicability to
real-world tasks such as demand forecasting. Recent work in neural time-series
forecasting mainly focuses on the use of Seq2Seq architectures. For example,
MQTransformer - an improvement of MQCNN - has shown state-of-the-art
performance in probabilistic demand forecasting. In this paper, we consider
incorporating cross-entity information to enhance model performance by adding a
cross-entity attention mechanism along with a retrieval mechanism to select
which entities to attend over. We demonstrate how our new neural architecture,
MQRetNN, leverages the encoded contexts from a pretrained baseline model on the
entire population to improve forecasting accuracy. Using MQCNN as the baseline
model (due to computational constraints, we do not use MQTransformer), we first
show on a small demand forecasting dataset that it is possible to achieve a ~3%
improvement in test loss by adding a cross-entity attention mechanism where
each entity attends to all others in the population. We then evaluate the model
with our proposed retrieval methods - as a means of approximating an attention
over a large population - on a large-scale demand forecasting application with
over 2 million products and observe a ~1% performance gain over the MQCNN
baseline.
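The idea in the abstract, retrieving a small set of entities and attending only over them as an approximation of attention over the whole population, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not say how entities are retrieved or how the attended context is combined with the entity's own context, so dot-product top-k retrieval, scaled dot-product attention, and concatenation are stand-ins, and the function names `retrieve_top_k` and `cross_entity_attention` are hypothetical. The toy vectors stand in for encoded contexts from a pretrained baseline model.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def retrieve_top_k(query, contexts, k, exclude=None):
    """Indices of the k contexts most similar to `query`.

    Assumption: similarity is a plain dot product; the paper's
    retrieval methods may differ.
    """
    scored = [(dot(query, c), i) for i, c in enumerate(contexts) if i != exclude]
    scored.sort(key=lambda t: (-t[0], t[1]))  # highest score first, ties by index
    return [i for _, i in scored[:k]]


def cross_entity_attention(query, contexts, neighbor_ids):
    """Scaled dot-product attention of one entity over retrieved entities."""
    d = len(query)
    scores = [dot(query, contexts[j]) / math.sqrt(d) for j in neighbor_ids]
    weights = softmax(scores)
    attended = [
        sum(w * contexts[j][t] for w, j in zip(weights, neighbor_ids))
        for t in range(d)
    ]
    return attended, weights


# Toy encoded contexts for four entities (hypothetical values).
contexts = [
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [0.8, 0.2],
]

target = 0
neighbors = retrieve_top_k(contexts[target], contexts, k=2, exclude=target)
attended, weights = cross_entity_attention(contexts[target], contexts, neighbors)

# Assumption: the attended context is concatenated with the entity's own
# context before being passed to a decoder.
augmented = contexts[target] + attended
```

With retrieval, each entity attends over k neighbors rather than the full population, which is what makes the approach plausible at the scale of millions of products mentioned in the abstract.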
Related papers
- F-FOMAML: GNN-Enhanced Meta-Learning for Peak Period Demand Forecasting with Proxy Data [65.6499834212641]
We formulate the demand prediction as a meta-learning problem and develop the Feature-based First-Order Model-Agnostic Meta-Learning (F-FOMAML) algorithm.
By considering domain similarities through task-specific metadata, our model improved generalization, where the excess risk decreases as the number of training tasks increases.
Compared to existing state-of-the-art models, our method demonstrates a notable improvement in demand prediction accuracy, reducing the Mean Absolute Error by 26.24% on an internal vending machine dataset and by 1.04% on the publicly accessible JD.com dataset.
arXiv Detail & Related papers (2024-06-23T21:28:50Z)
- MCDFN: Supply Chain Demand Forecasting via an Explainable Multi-Channel Data Fusion Network Model [0.0]
We introduce the Multi-Channel Data Fusion Network (MCDFN), a hybrid architecture that integrates CNN, Long Short-Term Memory networks (LSTM), and Gated Recurrent Units (GRU).
Our comparative benchmarking demonstrates that MCDFN outperforms seven other deep-learning models.
This research advances demand forecasting methodologies and offers practical guidelines for integrating MCDFN into supply chain systems.
arXiv Detail & Related papers (2024-05-24T14:30:00Z)
- Probabilistic MIMO U-Net: Efficient and Accurate Uncertainty Estimation for Pixel-wise Regression [1.4528189330418977]
Uncertainty estimation in machine learning is paramount for enhancing the reliability and interpretability of predictive models.
We present an adaptation of the Multiple-Input Multiple-Output (MIMO) framework for pixel-wise regression tasks.
arXiv Detail & Related papers (2023-08-14T22:08:28Z)
- Systematic Architectural Design of Scale Transformed Attention Condenser DNNs via Multi-Scale Class Representational Response Similarity Analysis [93.0013343535411]
We propose a novel type of analysis called Multi-Scale Class Representational Response Similarity Analysis (ClassRepSim).
We show that adding STAC modules to ResNet style architectures can result in up to a 1.6% increase in top-1 accuracy.
Results from ClassRepSim analysis can be used to select an effective parameterization of the STAC module resulting in competitive performance.
arXiv Detail & Related papers (2023-06-16T18:29:26Z)
- CEP3: Community Event Prediction with Neural Point Process on Graph [59.434777403325604]
We propose a novel model combining Graph Neural Networks and Marked Temporal Point Process (MTPP).
Our experiments demonstrate the superior performance of our model in terms of both model accuracy and training efficiency.
arXiv Detail & Related papers (2022-05-21T15:30:25Z)
- Complex Event Forecasting with Prediction Suffix Trees: Extended Technical Report [70.7321040534471]
Complex Event Recognition (CER) systems have become popular in the past two decades due to their ability to "instantly" detect patterns on real-time streams of events.
There is a lack of methods for forecasting when a pattern might occur before such an occurrence is actually detected by a CER engine.
We present a formal framework that attempts to address the issue of Complex Event Forecasting.
arXiv Detail & Related papers (2021-09-01T09:52:31Z)
- Uncertainty-Aware Learning for Improvements in Image Quality of the Canada-France-Hawaii Telescope [9.963669010212012]
We leverage state-of-the-art machine learning methods to predict observatory image quality (IQ) from environmental conditions and observatory operating parameters.
We develop accurate and interpretable models of the complex dependence between data features and observed IQ for CFHT's wide field camera, MegaCam.
arXiv Detail & Related papers (2021-06-30T18:10:20Z)
- Once Quantization-Aware Training: High Performance Extremely Low-bit Architecture Search [112.05977301976613]
We propose to combine Network Architecture Search methods with quantization to enjoy the merits of the two sides.
We first propose the joint training of architecture and quantization with a shared step size to acquire a large number of quantized models.
Then a bit-inheritance scheme is introduced to transfer the quantized models to the lower bit, which further reduces the time cost and improves the quantization accuracy.
arXiv Detail & Related papers (2020-10-09T03:52:16Z)
- Quantile Surfaces -- Generalizing Quantile Regression to Multivariate Targets [4.979758772307178]
Our approach is based on an extension of single-output quantile regression (QR) to multivariate-targets, called quantile surfaces (QS).
We present a novel two-stage process: In the first stage, we perform a deterministic point forecast (i.e., central tendency estimation).
Subsequently, we model the prediction uncertainty using QS involving neural networks called quantile surface regression neural networks (QSNN).
We evaluate our novel approach on synthetic data and two currently researched real-world challenges in two different domains: First, probabilistic forecasting for renewable energy power generation, second, short-term cyclists trajectory forecasting for
arXiv Detail & Related papers (2020-09-29T16:35:37Z)
- APQ: Joint Search for Network Architecture, Pruning and Quantization Policy [49.3037538647714]
We present APQ for efficient deep learning inference on resource-constrained hardware.
Unlike previous methods that separately search the neural architecture, pruning policy, and quantization policy, we optimize them in a joint manner.
With the same accuracy, APQ reduces the latency/energy by 2x/1.3x over MobileNetV2+HAQ.
arXiv Detail & Related papers (2020-06-15T16:09:17Z)