Modeling Household Online Shopping Demand in the U.S.: A Machine
Learning Approach and Comparative Investigation between 2009 and 2017
- URL: http://arxiv.org/abs/2101.03690v1
- Date: Mon, 11 Jan 2021 03:45:53 GMT
- Title: Modeling Household Online Shopping Demand in the U.S.: A Machine
Learning Approach and Comparative Investigation between 2009 and 2017
- Authors: Limon Barua, Bo Zou, Yan (Joann) Zhou, Yulin Liu
- Abstract summary: This paper leverages two recent releases of the U.S. National Household Travel Survey (NHTS) data for 2009 and 2017 to develop machine learning (ML) models for predicting household-level online shopping purchases.
Two recent advances in machine learning techniques, namely Shapley value-based feature importance and Accumulated Local Effects plots, are adopted to overcome inherent drawbacks of the popular techniques in current ML modeling.
The models developed and insights gained can be used for online shopping-related freight demand generation and may also be considered for evaluating the potential impact of relevant policies on online shopping demand.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the rapid growth of online shopping and research interest in the
relationship between online and in-store shopping, national-level modeling and
investigation of the demand for online shopping with a prediction focus remain
limited in the literature. This paper differs from prior work and leverages two
recent releases of the U.S. National Household Travel Survey (NHTS) data for
2009 and 2017 to develop machine learning (ML) models, specifically gradient
boosting machine (GBM), for predicting household-level online shopping
purchases. The NHTS data allow for investigation not only nationwide but also
at the level of households, which is more appropriate than the
individual level given the connected consumption and shopping needs of members
in a household. We follow a systematic procedure for model development,
including employing the Recursive Feature Elimination algorithm to select input
variables (features) in order to reduce the risk of model overfitting and
increase model explainability. Extensive post-modeling investigation is
conducted in a comparative manner between 2009 and 2017, including quantifying
the importance of each input variable in predicting online shopping demand, and
characterizing value-dependent relationships between demand and the input
variables. In doing so, two recent advances in machine learning techniques,
namely Shapley value-based feature importance and Accumulated Local Effects
plots, are adopted to overcome inherent drawbacks of the popular techniques in
current ML modeling. The modeling and investigation are performed both at the
national level and for three of the largest cities (New York, Los Angeles, and
Houston). The models developed and insights gained can be used for online
shopping-related freight demand generation and may also be considered for
evaluating the potential impact of relevant policies on online shopping demand.
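
The abstract names Shapley value-based feature importance but does not show how an attribution is computed. The following is a minimal sketch of the exact Shapley calculation, not the authors' implementation: the toy model `toy_demand`, its three feature names, and the baseline-replacement value function (features outside a coalition are set to a reference household's values) are all assumptions made for illustration. Practical work on a GBM would typically rely on an approximation rather than full enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) - predict(baseline).

    Features outside a coalition take their baseline value (an assumed,
    commonly used value function; other choices exist).
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                # Marginal contribution of feature i to this coalition
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_demand(features):
    """Hypothetical stand-in for a trained model (not from the paper)."""
    income, household_size, urban = features
    return 2.0 * income + 1.0 * household_size + 0.5 * income * urban


x = [3.0, 4.0, 1.0]         # household being explained (made-up values)
baseline = [1.0, 2.0, 0.0]  # reference household (made-up values)
phi = shapley_values(toy_demand, x, baseline)

for name, value in zip(["income", "household_size", "urban"], phi):
    print(f"{name}: {value:.2f}")
```

The attributions sum to the difference between the prediction for `x` and the prediction for the baseline, and the interaction term between income and urban is split between those two features. Enumeration grows exponentially with the number of features, so this form is only practical for very small inputs.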
Related papers
- MetaTrading: An Immersion-Aware Model Trading Framework for Vehicular Metaverse Services [94.61039892220037]
We present a novel immersion-aware model trading framework that incentivizes metaverse users (MUs) to contribute learning models for augmented reality (AR) services in the vehicular metaverse.
Considering dynamic network conditions and privacy concerns, we formulate the reward decisions of MSPs as a multi-agent Markov decision process.
Experimental results demonstrate that the proposed framework can effectively provide higher-value models for object detection and classification in AR services on real AR-related vehicle datasets.
arXiv Detail & Related papers (2024-10-25T16:20:46Z)
- F-FOMAML: GNN-Enhanced Meta-Learning for Peak Period Demand Forecasting with Proxy Data [65.6499834212641]
We formulate the demand prediction as a meta-learning problem and develop the Feature-based First-Order Model-Agnostic Meta-Learning (F-FOMAML) algorithm.
By considering domain similarities through task-specific metadata, our model improved generalization, where the excess risk decreases as the number of training tasks increases.
Compared to existing state-of-the-art models, our method demonstrates a notable improvement in demand prediction accuracy, reducing the Mean Absolute Error by 26.24% on an internal vending machine dataset and by 1.04% on the publicly accessible JD.com dataset.
arXiv Detail & Related papers (2024-06-23T21:28:50Z)
- Modeling the Telemarketing Process using Genetic Algorithms and Extreme
Boosting: Feature Selection and Cost-Sensitive Analytical Approach [0.06906005491572399]
This research aims at leveraging the power of telemarketing data in modeling the willingness of clients to make a term deposit.
Real-world data from a Portuguese bank and national socio-economic metrics are used to model the telemarketing decision-making process.
arXiv Detail & Related papers (2023-10-30T08:46:55Z)
- Refined Mechanism Design for Approximately Structured Priors via Active
Regression [50.71772232237571]
We consider the problem of a revenue-maximizing seller with a large number of items for sale to $n$ strategic bidders.
It is well-known that optimal and even approximately-optimal mechanisms for this setting are notoriously difficult to characterize or compute.
arXiv Detail & Related papers (2023-10-11T20:34:17Z)
- HireVAE: An Online and Adaptive Factor Model Based on Hierarchical and
Regime-Switch VAE [113.47287249524008]
It is still an open question to build a factor model that can conduct stock prediction in an online and adaptive setting.
We propose the first deep learning based online and adaptive factor model, HireVAE, at the core of which is a hierarchical latent space that embeds the relationship between the market situation and stock-wise latent factors.
Across four commonly used real stock market benchmarks, the proposed HireVAE demonstrates superior performance in terms of active returns over previous methods.
arXiv Detail & Related papers (2023-06-05T12:58:13Z)
- Modelling the Frequency of Home Deliveries: An Induced Travel Demand
Contribution of Aggrandized E-shopping in Toronto during COVID-19 Pandemics [2.5380150390265257]
This study developed models to predict households' weekly home delivery frequencies.
It is found that socioeconomic factors such as having an online grocery membership, household members' average age, the percentage of male household members, the number of workers in the household and various land use factors influence home delivery demand.
arXiv Detail & Related papers (2022-09-21T21:18:25Z)
- Analyzing Machine Learning Models for Credit Scoring with Explainable AI
and Optimizing Investment Decisions [0.0]
This paper examines two different yet related questions about explainable AI (XAI) practices.
The study compares various machine learning models, including single classifiers (logistic regression, decision trees, LDA, QDA), heterogeneous ensembles (AdaBoost, Random Forest), and sequential neural networks.
Two advanced post-hoc model explainability techniques, LIME and SHAP, are utilized to assess ML-based credit scoring models.
arXiv Detail & Related papers (2022-09-19T21:44:42Z)
- Augmented Bilinear Network for Incremental Multi-Stock Time-Series
Classification [83.23129279407271]
We propose a method to efficiently retain the knowledge available in a neural network pre-trained on a set of securities.
In our method, the prior knowledge encoded in a pre-trained neural network is maintained by keeping existing connections fixed.
This knowledge is adjusted for the new securities by a set of augmented connections, which are optimized using the new data.
arXiv Detail & Related papers (2022-07-23T18:54:10Z)
- Machine learning applications for electricity market agent-based models:
A systematic literature review [68.8204255655161]
Agent-based simulations are used to better understand the dynamics of the electricity market.
Agent-based models provide the opportunity to integrate machine learning and artificial intelligence.
We review 55 papers published between 2016 and 2021 which focus on machine learning applied to agent-based electricity market models.
arXiv Detail & Related papers (2022-06-05T14:52:26Z)
- Heterogeneous Network Embedding for Deep Semantic Relevance Match in
E-commerce Search [29.881612817309716]
We design an end-to-end First-and-Second-order Relevance prediction model for e-commerce item relevance.
We introduce external knowledge generated from BERT to refine the network of user behaviors.
Results of offline experiments showed that the new model significantly improved the prediction accuracy in terms of human relevance judgment.
arXiv Detail & Related papers (2021-01-13T03:12:53Z)
- A Probabilistic Simulator of Spatial Demand for Product Allocation [23.430521524442195]
In this paper, we propose a model of spatial demand in physical retail.
We show that the proposed model is more predictive of demand than existing baselines.
We also perform a preliminary study into different automation techniques and show that an optimal product allocation policy can be learned through Deep Q-Learning.
arXiv Detail & Related papers (2020-01-09T20:18:37Z)