Comparing hundreds of machine learning classifiers and discrete choice
models in predicting travel behavior: an empirical benchmark
- URL: http://arxiv.org/abs/2102.01130v1
- Date: Mon, 1 Feb 2021 19:45:47 GMT
- Authors: Shenhao Wang, Baichuan Mo, Stephane Hess, Jinhua Zhao
- Abstract summary: This study seeks to provide a generalizable empirical benchmark by comparing hundreds of machine learning (ML) classifiers and discrete choice models (DCMs).
Experiments evaluate both prediction accuracy and computational cost by spanning four hyper-dimensions.
Deep neural networks achieve the highest predictive performance, but at a relatively high computational cost.
- Score: 3.0969191504482247
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Researchers have compared machine learning (ML) classifiers and discrete
choice models (DCMs) in predicting travel behavior, but the generalizability of
the findings is limited by the specifics of data, contexts, and authors'
expertise. This study seeks to provide a generalizable empirical benchmark by
comparing hundreds of ML and DCM classifiers in a highly structured manner. The
experiments evaluate both prediction accuracy and computational cost by
spanning four hyper-dimensions, including 105 ML and DCM classifiers from 12
model families, 3 datasets, 3 sample sizes, and 3 outputs. This experimental
design leads to an immense number of 6,970 experiments, which are corroborated
with a meta dataset of 136 experiment points from 35 previous studies. This
study is hitherto the most comprehensive and almost exhaustive comparison of
the classifiers for travel behavioral prediction. We found that the ensemble
methods and deep neural networks achieve the highest predictive performance,
but at a relatively high computational cost. Random forests are the most
computationally efficient, balancing between prediction and computation. While
discrete choice models offer accuracy only 3-4 percentage points lower
than the top ML classifiers, they have much longer computational time and
become computationally impossible with large sample size, high input
dimensions, or simulation-based estimation. The relative ranking of the ML and
DCM classifiers is highly stable, while the absolute values of the prediction
accuracy and computational time have large variations. Overall, this paper
suggests using deep neural networks, model ensembles, and random forests as
baseline models for future travel behavior prediction. For choice modeling, the
DCM community should shift more attention from fitting models to improving
computational efficiency, so that the DCMs can be widely adopted in the big
data context.
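The four hyper-dimensions named in the abstract can be pictured as a Cartesian grid. The following is a minimal sketch, assuming one experiment per combination; only the dimension sizes (105, 3, 3, 3) come from the abstract, and every label and the function name `full_grid` are hypothetical placeholders. The full product has 2,835 cells, which is not the 6,970 experiments reported; the abstract does not say how that figure is derived, so the sketch does not try to reproduce it.

```python
from itertools import product

# Sizes of the four hyper-dimensions, as stated in the abstract.
N_CLASSIFIERS = 105
N_DATASETS = 3
N_SAMPLE_SIZES = 3
N_OUTPUTS = 3

# Placeholder labels (hypothetical: the abstract does not name the items).
classifiers = [f"classifier_{i}" for i in range(N_CLASSIFIERS)]
datasets = [f"dataset_{i}" for i in range(N_DATASETS)]
sample_sizes = [f"sample_size_{i}" for i in range(N_SAMPLE_SIZES)]
outputs = [f"output_{i}" for i in range(N_OUTPUTS)]


def full_grid():
    """Return every (classifier, dataset, sample size, output) combination."""
    return list(product(classifiers, datasets, sample_sizes, outputs))


grid = full_grid()
# Number of cells in the full Cartesian grid: 105 * 3 * 3 * 3.
print(len(grid))
```

Each tuple in `grid` would correspond to one candidate experiment for which prediction accuracy and computation time could be recorded.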
Related papers
- Model Provenance Testing for Large Language Models [14.949325775620439]
We develop a framework for testing model provenance: Whether one model is derived from another.
Our approach is based on the key observation that real-world model derivations preserve significant similarities in model outputs.
Using only black-box access to models, we employ multiple hypothesis testing to compare model similarities against a baseline established by unrelated models.
(2025-02-02T07:39:37Z)
- Diffusion posterior sampling for simulation-based inference in tall data settings [53.17563688225137]
Simulation-based inference (SBI) is capable of approximating the posterior distribution that relates input parameters to a given observation.
In this work, we consider a tall data extension in which multiple observations are available to better infer the parameters of the model.
We compare our method to recently proposed competing approaches on various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost.
(2024-04-11T09:23:36Z)
- Comparing Foundation Models using Data Kernels [13.099029073152257]
We present a methodology for directly comparing the embedding space geometry of foundation models.
Our methodology is grounded in random graph theory and enables valid hypothesis testing of embedding similarity.
We show how our framework can induce a manifold of models equipped with a distance function that correlates strongly with several downstream metrics.
(2023-05-09T02:01:07Z)
- A prediction and behavioural analysis of machine learning methods for modelling travel mode choice [0.26249027950824505]
We conduct a systematic comparison of different modelling approaches, across multiple modelling problems, in terms of the key factors likely to affect model choice.
Results indicate that the models with the highest disaggregate predictive performance provide poorer estimates of behavioural indicators and aggregate mode shares.
It is also observed that the MNL model performs robustly in a variety of situations, though ML techniques can improve the estimates of behavioural indices such as Willingness to Pay.
(2023-01-11T11:10:32Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
This creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
(2022-12-19T20:46:43Z)
- Empirical Analysis of Model Selection for Heterogeneous Causal Effect Estimation [24.65301562548798]
We study the problem of model selection in causal inference, specifically for conditional average treatment effect (CATE) estimation.
We conduct an empirical analysis to benchmark the surrogate model selection metrics introduced in the literature, as well as the novel ones introduced in this work.
(2022-11-03T16:26:06Z)
- Investigating Ensemble Methods for Model Robustness Improvement of Text Classifiers [66.36045164286854]
We analyze a set of existing bias features and demonstrate there is no single model that works best for all the cases.
By choosing an appropriate bias model, we can obtain a better robustness result than baselines with a more sophisticated model design.
(2022-10-28T17:52:10Z)
- On the Generalization and Adaption Performance of Causal Models [99.64022680811281]
Differentiable causal discovery has proposed to factorize the data generating process into a set of modules.
We study the generalization and adaption performance of such modular neural causal models.
Our analysis shows that the modular neural causal models outperform other models on both zero and few-shot adaptation in low data regimes.
(2022-06-09T17:12:32Z)
- Deep Learning Models for Knowledge Tracing: Review and Empirical Evaluation [2.423547527175807]
We review and evaluate a body of deep learning knowledge tracing (DLKT) models with openly available and widely-used data sets.
The evaluated DLKT models have been reimplemented to assess the replicability of previously reported results.
(2021-12-30T14:19:27Z)
- A comprehensive comparative evaluation and analysis of Distributional Semantic Models [61.41800660636555]
We perform a comprehensive evaluation of type distributional vectors, either produced by static DSMs or obtained by averaging the contextualized vectors generated by BERT.
The results show that the alleged superiority of predict based models is more apparent than real, and surely not ubiquitous.
We borrow from cognitive neuroscience the methodology of Representational Similarity Analysis (RSA) to inspect the semantic spaces generated by distributional models.
(2021-05-20T15:18:06Z)
- Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
(2020-10-25T18:51:15Z)