Comparing merging behaviors observed in naturalistic data with behaviors generated by a machine learned model
- URL: http://arxiv.org/abs/2104.10496v1
- Date: Wed, 21 Apr 2021 12:31:29 GMT
- Title: Comparing merging behaviors observed in naturalistic data with behaviors generated by a machine learned model
- Authors: Aravinda Ramakrishnan Srinivasan, Mohamed Hasan, Yi-Shin Lin, Matteo Leonetti, Jac Billington, Richard Romano, Gustav Markkula
- Abstract summary: We study highway driving as an example scenario, and introduce metrics to quantitatively demonstrate the presence of two familiar behavioral phenomena.
Applying the exact same metrics to the output of a state-of-the-art machine-learned model, we show that the model is capable of reproducing the former phenomenon, but not the latter.
- Score: 4.879725885276143
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: There is a quickly growing literature on machine-learned models that predict
human driving trajectories in road traffic. These models focus their learning
on low-dimensional error metrics, for example average distance between
model-generated and observed trajectories. Such metrics permit relative
comparison of models, but do not provide clearly interpretable information on
how close to human behavior the models actually come, for example in terms of
higher-level behavior phenomena that are known to be present in human driving.
We study highway driving as an example scenario, and introduce metrics to
quantitatively demonstrate the presence, in a naturalistic dataset, of two
familiar behavioral phenomena: (1) The kinematics-dependent contest, between
on-highway and on-ramp vehicles, of who passes the merging point first. (2)
Courtesy lane changes away from the outermost lane, to leave space for a
merging vehicle. Applying the exact same metrics to the output of a
state-of-the-art machine-learned model, we show that the model is capable of
reproducing the former phenomenon, but not the latter. We argue that this type
of behavioral analysis provides information that is not available from
conventional model-fitting metrics, and that it may be useful to analyze (and
possibly fit) models based on these types of behavioral criteria as well.
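The abstract names the two phenomena but not the exact metric definitions, so the following is only a minimal sketch of how such behavioral metrics might be computed from naturalistic trajectory data; the kinematic-advantage measure, the lane-numbering convention, and all function and variable names are hypothetical.

```python
import numpy as np

def pass_first_probability(advantage, highway_first, n_bins=10):
    """Phenomenon 1 (merge contest): empirical probability that the
    on-highway vehicle passes the merge point first, binned by a
    kinematic-advantage measure (hypothetically: the difference in
    projected arrival times at the merge point, in seconds).

    advantage     : float array, one value per merge event
    highway_first : bool array, True if the highway vehicle passed first
    """
    edges = np.linspace(advantage.min(), advantage.max(), n_bins + 1)
    bins = np.clip(np.digitize(advantage, edges) - 1, 0, n_bins - 1)
    prob = np.full(n_bins, np.nan)
    for b in range(n_bins):
        in_bin = bins == b
        if in_bin.any():
            prob[b] = highway_first[in_bin].mean()
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, prob  # human data should trace a sigmoid-like curve

def is_courtesy_lane_change(lane_before, lane_after, merger_alongside):
    """Phenomenon 2: a lane change away from the outermost lane
    (index 0 by convention here) while an on-ramp vehicle is about
    to merge next to the highway vehicle."""
    return lane_before == 0 and lane_after > lane_before and merger_alongside
```

Applying the same functions to naturalistic data and to model rollouts then yields directly comparable curves and rates, which is the kind of comparison the paper argues conventional error metrics cannot provide.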
Related papers
- Causal Estimation of Memorisation Profiles [58.20086589761273]
Understanding memorisation in language models has practical and societal implications.
Memorisation is the causal effect of training with an instance on the model's ability to predict that instance.
This paper proposes a new, principled, and efficient method to estimate memorisation based on the difference-in-differences design from econometrics (a generic sketch of this estimator appears after this list).
arXiv Detail & Related papers (2024-06-06T17:59:09Z)
- Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop.
We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models.
We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation (a toy self-consuming KDE loop is sketched after this list).
arXiv Detail & Related papers (2024-02-19T02:08:09Z)
- Using Models Based on Cognitive Theory to Predict Human Behavior in Traffic: A Case Study [4.705182901389292]
We investigate the usefulness of a novel cognitively plausible model for predicting human behavior in gap acceptance scenarios.
We show that this model can compete with or even outperform well-established data-driven prediction models.
arXiv Detail & Related papers (2023-05-24T14:27:00Z)
- Benchmark for Models Predicting Human Behavior in Gap Acceptance Scenarios [4.801975818473341]
We develop a framework facilitating the evaluation of any model, by any metric, and in any scenario.
We then apply this framework to state-of-the-art prediction models, all of which prove unreliable in the most safety-critical situations.
arXiv Detail & Related papers (2022-11-10T09:59:38Z)
- Are Neural Topic Models Broken? [81.15470302729638]
We study the relationship between automated and human evaluation of topic models.
We find that neural topic models fare worse in both respects compared to an established classical method.
arXiv Detail & Related papers (2022-10-28T14:38:50Z)
- IDM-Follower: A Model-Informed Deep Learning Method for Long-Sequence Car-Following Trajectory Prediction [24.94160059351764]
Most car-following models are generative and consider only the speed, position, and acceleration of the last time step as inputs.
We implement a novel structure with two independent encoders and a self-attention decoder that sequentially predicts the following trajectories.
Numerical experiments with multiple settings on simulation and NGSIM datasets show that IDM-Follower improves prediction performance (the classic IDM car-following law is sketched after this list).
arXiv Detail & Related papers (2022-10-20T02:24:27Z)
- Beyond RMSE: Do machine-learned models of road user interaction produce human-like behavior? [12.378231329297137]
We introduce quantitative metrics to demonstrate the presence of three different behavioral phenomena in a naturalistic highway driving dataset.
We analyze the behavior of three machine-learned models using the same metrics.
arXiv Detail & Related papers (2022-06-22T14:04:39Z)
- You Mostly Walk Alone: Analyzing Feature Attribution in Trajectory Prediction [52.442129609979794]
Recent deep learning approaches for trajectory prediction show promising performance.
It remains unclear which features such black-box models actually learn to use for making predictions.
This paper proposes a procedure that quantifies the contributions of different cues to model performance.
arXiv Detail & Related papers (2021-10-11T14:24:15Z)
- Why do classifier accuracies show linear trends under distribution shift? [58.40438263312526]
Accuracies of models on one data distribution are approximately linear functions of their accuracies on another distribution.
We assume the probability that two models agree in their predictions is higher than what we can infer from their accuracy levels alone.
We show that a linear trend must occur when evaluating models on two distributions unless the size of the distribution shift is large.
arXiv Detail & Related papers (2020-12-31T07:24:30Z)
- To what extent do human explanations of model behavior align with actual model behavior? [91.67905128825402]
We investigated the extent to which human-generated explanations of models' inference decisions align with how models actually make these decisions.
We defined two alignment metrics that quantify how well natural language human explanations align with model sensitivity to input words.
We find that a model's alignment with human explanations is not predicted by the model's accuracy on NLI.
arXiv Detail & Related papers (2020-12-24T17:40:06Z)
- Evaluation metrics for behaviour modeling [2.616915680939834]
We propose and investigate metrics for evaluating and comparing generative models of behavior learned using imitation learning.
These criteria look at longer temporal relationships in behavior, are relevant if behavior has some properties that are inherently unpredictable, and highlight biases in the overall distribution of behaviors produced by the model.
We show that the proposed metrics correspond with biologists' intuition about behavior, and allow us to evaluate models, understand their biases, and propose new research directions.
arXiv Detail & Related papers (2020-07-23T23:47:24Z)
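The "Causal Estimation of Memorisation Profiles" entry above refers to the difference-in-differences design; the abstract does not give that paper's exact estimator, so the following is only a generic DiD sketch, and the memorisation reading in the comments (per-instance log-likelihood as the outcome) is an assumption.

```python
import numpy as np

def did_estimate(y_treat_pre, y_treat_post, y_ctrl_pre, y_ctrl_post):
    """Generic difference-in-differences estimator: the outcome change in
    the treated group minus the change in the control group, which nets
    out time trends shared by both groups."""
    treated_change = np.mean(y_treat_post) - np.mean(y_treat_pre)
    control_change = np.mean(y_ctrl_post) - np.mean(y_ctrl_pre)
    return treated_change - control_change

# Hypothetical memorisation reading: "treatment" is training on an instance,
# the outcome is the model's log-likelihood of that instance, and pre/post
# refer to before vs. after the training step that includes it.
```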
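The self-consuming-loop entry presents theory for kernel density estimation; as a toy illustration only (not that paper's construction), the loop below repeatedly refits a Gaussian KDE to its own samples. Because each resampling step adds bandwidth noise, the standard deviation of the learned distribution slowly inflates across generations, a simple instance of the error propagation the entry mentions.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
data = rng.normal(0.0, 1.0, size=1000)   # generation 0: "real" data

for gen in range(1, 6):
    kde = gaussian_kde(data)              # fit this generation's model
    data = kde.resample(1000)[0]          # train the next one on its output
    print(f"generation {gen}: std = {data.std():.3f}")  # slowly inflates
```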
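"IDM" in the IDM-Follower entry presumably refers to the Intelligent Driver Model, the classic car-following law mapping the follower's speed, the leader's speed, and the gap to an acceleration. As a reference point, here is the standard IDM formula; the parameter values are typical textbook choices, not anything from that paper.

```python
import math

def idm_acceleration(v, v_lead, gap,
                     v0=33.3,    # desired free-flow speed [m/s]
                     T=1.6,      # desired time headway [s]
                     a_max=1.5,  # maximum acceleration [m/s^2]
                     b=2.0,      # comfortable braking deceleration [m/s^2]
                     s0=2.0,     # minimum standstill gap [m]
                     delta=4.0): # free-acceleration exponent
    """Intelligent Driver Model acceleration for a follower at speed v,
    behind a leader at speed v_lead, with bumper-to-bumper gap [m]."""
    dv = v - v_lead  # closing speed (positive when approaching the leader)
    s_star = s0 + v * T + v * dv / (2.0 * math.sqrt(a_max * b))
    return a_max * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)

# Example: same-speed following with a gap slightly below the desired
# headway, which yields gentle braking.
print(idm_acceleration(v=25.0, v_lead=25.0, gap=40.0))
```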