Enabling CMF Estimation in Data-Constrained Scenarios: A
Semantic-Encoding Knowledge Mining Model
- URL: http://arxiv.org/abs/2311.08690v1
- Date: Wed, 15 Nov 2023 04:37:27 GMT
- Title: Enabling CMF Estimation in Data-Constrained Scenarios: A
Semantic-Encoding Knowledge Mining Model
- Authors: Yanlin Qi, Jia Li, Michael Zhang
- Abstract summary: This study introduces a novel knowledge-mining framework for CMF prediction.
It delves into the connections of existing countermeasures and reduces the reliance of CMF estimation on crash data availability.
It effectively encodes unstructured countermeasure scenarios into machine-readable representations and models the complex relationships between scenarios and CMF values.
- Score: 23.367637547929807
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Precise estimation of Crash Modification Factors (CMFs) is central to
evaluating the effectiveness of various road safety treatments and prioritizing
infrastructure investment accordingly. While a customized study for each
countermeasure scenario is desirable, conventional CMF estimation approaches
rely heavily on the availability of crash data at given sites. This not only
makes the estimation costly, but the results are also less transferable, since
the intrinsic similarities between different safety countermeasure scenarios
are not fully explored. Aiming to fill this gap, this study introduces a novel
knowledge-mining framework for CMF prediction. This framework delves into the
connections of existing countermeasures and reduces the reliance of CMF
estimation on crash data availability and manual data collection. Specifically,
it draws inspiration from human comprehension processes and introduces advanced
Natural Language Processing (NLP) techniques to extract intricate variations
and patterns from existing CMF knowledge. It effectively encodes unstructured
countermeasure scenarios into machine-readable representations and models the
complex relationships between scenarios and CMF values. This new data-driven
framework provides a cost-effective and adaptable solution that complements the
case-specific approaches for CMF estimation, which is particularly beneficial
when availability of crash data or time imposes constraints. Experimental
validation using real-world CMF Clearinghouse data demonstrates the
effectiveness of this new approach, which shows significant accuracy
improvements compared to baseline methods. This approach provides insights into
new possibilities of harnessing accumulated transportation knowledge in various
applications.
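The abstract's core idea, encoding free-text countermeasure scenarios into vectors and relating those vectors to CMF values, can be illustrated with a deliberately simple sketch. This is not the authors' model: the paper relies on advanced NLP techniques, whereas the sketch below swaps in a bag-of-words encoding and a similarity-weighted nearest-neighbor average. The function names (`encode`, `cosine`, `predict_cmf`), the scenario descriptions, and the CMF values in `KNOWN` are all hypothetical and invented for illustration; they are not taken from the CMF Clearinghouse. Recall that a CMF below 1 indicates an expected reduction in crashes.

```python
import math
from collections import Counter


def encode(text):
    """Encode a scenario description as a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_cmf(query, known, k=2):
    """Predict a CMF as the similarity-weighted mean of the k most similar known scenarios."""
    q = encode(query)
    scored = sorted(
        ((cosine(q, encode(text)), cmf) for text, cmf in known),
        reverse=True,
    )[:k]
    total = sum(sim for sim, _ in scored)
    if total == 0:
        # No lexical overlap at all: fall back to the mean of all known CMFs.
        return sum(cmf for _, cmf in known) / len(known)
    return sum(sim * cmf for sim, cmf in scored) / total


# Hypothetical (description, CMF) pairs; the values are made up for this sketch.
KNOWN = [
    ("install roundabout at rural intersection", 0.60),
    ("install roundabout at urban intersection", 0.70),
    ("add rumble strips on rural highway", 0.85),
]

if __name__ == "__main__":
    print(predict_cmf("install roundabout at suburban intersection", KNOWN))
```

The point of the sketch is only that a new scenario with no crash data of its own can borrow an estimate from textually similar scenarios; a learned semantic encoder would replace `encode` in any realistic setting.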
Related papers
- Amortized Bayesian Multilevel Models [9.831471158899644]
Multilevel models (MLMs) are a central building block of the Bayesian workflow.
MLMs pose significant computational challenges, often rendering their estimation and evaluation intractable within reasonable time constraints.
Recent advances in simulation-based inference offer promising solutions for addressing complex probabilistic models using deep generative networks.
We explore a family of neural network architectures that leverage the probabilistic factorization of multilevel models to facilitate efficient neural network training and subsequent near-instant posterior inference on unseen data sets.
arXiv Detail & Related papers (2024-08-23T17:11:04Z)
- Federated Continual Learning Goes Online: Uncertainty-Aware Memory Management for Vision Tasks and Beyond [13.867793835583463]
We propose an uncertainty-aware memory-based approach to solve catastrophic forgetting.
We retrieve samples with specific characteristics, and - by retraining the model on such samples - we demonstrate the potential of this approach.
arXiv Detail & Related papers (2024-05-29T09:29:39Z)
- Enabling Quartile-based Estimated-Mean Gradient Aggregation As Baseline for Federated Image Classifications [5.5099914877576985]
Federated Learning (FL) has revolutionized how we train deep neural networks by enabling decentralized collaboration while safeguarding sensitive data and improving model performance.
This paper introduces an innovative solution named Estimated Mean Aggregation (EMA) that not only addresses these challenges but also provides a fundamental reference point as a $\mathsf{baseline}$ for advanced aggregation techniques in FL systems.
arXiv Detail & Related papers (2023-09-21T17:17:28Z)
- Personalized Federated Learning under Mixture of Distributions [98.25444470990107]
We propose a novel approach to Personalized Federated Learning (PFL), which utilizes Gaussian mixture models (GMM) to fit the input data distributions across diverse clients.
FedGMM possesses an additional advantage of adapting to new clients with minimal overhead, and it also enables uncertainty quantification.
Empirical evaluations on synthetic and benchmark datasets demonstrate the superior performance of our method in both PFL classification and novel sample detection.
arXiv Detail & Related papers (2023-05-01T20:04:46Z)
- Uncertainty Estimation by Fisher Information-based Evidential Deep Learning [61.94125052118442]
Uncertainty estimation is a key factor that makes deep learning reliable in practical applications.
We propose a novel method, Fisher Information-based Evidential Deep Learning ($\mathcal{I}$-EDL).
In particular, we introduce Fisher Information Matrix (FIM) to measure the informativeness of evidence carried by each sample, according to which we can dynamically reweight the objective loss terms to make the network more focused on the representation learning of uncertain classes.
arXiv Detail & Related papers (2023-03-03T16:12:59Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence, predicting accuracy as the fraction of unlabeled examples for which model confidence exceeds that threshold.
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Impact of Channel Variation on One-Class Learning for Spoof Detection [5.549602650463701]
Spoofing detection increases the reliability of the ASV system but degrades significantly due to channel variation.
"Which data-feeding strategy is optimal for MCT?" is not known in the case of spoof detection.
This study highlights the relevance of the deemed-of-low-importance process of data-feeding and mini-batching to raise awareness of the need to refine it for better performance.
arXiv Detail & Related papers (2021-09-30T07:56:16Z)
- Entropy-based adaptive design for contour finding and estimating reliability [0.24466725954625884]
In reliability analysis, methods used to estimate failure probability are often limited by the costs associated with model evaluations.
We introduce an entropy-based Gaussian process (GP) adaptive design that, when paired with multifidelity importance sampling (MFIS), provides more accurate failure probability estimates.
Illustrative examples are provided on benchmark data as well as an application to an impact damage simulator for National Aeronautics and Space Administration (NASA) spacesuits.
arXiv Detail & Related papers (2021-05-24T15:41:15Z)
- Trust but Verify: Assigning Prediction Credibility by Counterfactual Constrained Learning [123.3472310767721]
Prediction credibility measures are fundamental in statistics and machine learning.
These measures should account for the wide variety of models used in practice.
The framework developed in this work expresses the credibility as a risk-fit trade-off.
arXiv Detail & Related papers (2020-11-24T19:52:38Z)
- Accurate and Robust Feature Importance Estimation under Distribution Shifts [49.58991359544005]
PRoFILE is a novel feature importance estimation method.
We show significant improvements over state-of-the-art approaches, both in terms of fidelity and robustness.
arXiv Detail & Related papers (2020-09-30T05:29:01Z)
- Uncertainty Estimation Using a Single Deep Deterministic Neural Network [66.26231423824089]
We propose a method for training a deterministic deep model that can find and reject out of distribution data points at test time with a single forward pass.
We scale training in RBF networks with a novel loss function and centroid updating scheme and match the accuracy of softmax models.
arXiv Detail & Related papers (2020-03-04T12:27:36Z)
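The Average Thresholded Confidence (ATC) entry in the list above describes learning a confidence threshold and reading off accuracy as a fraction of unlabeled examples. A minimal sketch of that idea follows, assuming the threshold is fit on labeled source data so that the share of source examples above it matches the source accuracy. The function names and the toy confidence values are hypothetical, and the tie handling (a midpoint between neighboring confidences) is a simplification rather than anything the listing specifies.

```python
def fit_threshold(confidences, correct):
    """Pick a threshold so the fraction of source examples above it equals source accuracy.

    confidences: model confidence for each labeled source example.
    correct: 1 if the prediction for that example was right, else 0.
    """
    n = len(confidences)
    n_correct = sum(correct)
    ranked = sorted(confidences, reverse=True)
    if n_correct == 0:
        return float("inf")   # nothing should count as confident
    if n_correct == n:
        return float("-inf")  # everything should count as confident
    # Midpoint between the last confidence kept and the first one dropped
    # (exact only when those two confidences differ).
    return (ranked[n_correct - 1] + ranked[n_correct]) / 2


def predict_accuracy(target_confidences, threshold):
    """Estimate target accuracy as the fraction of unlabeled examples above the threshold."""
    above = sum(1 for c in target_confidences if c > threshold)
    return above / len(target_confidences)


if __name__ == "__main__":
    # Toy values, invented for illustration.
    threshold = fit_threshold([0.9, 0.8, 0.6, 0.4], [1, 1, 1, 0])
    print(threshold, predict_accuracy([0.95, 0.7, 0.45, 0.3], threshold))
```

No target labels are used at any point: only the source data is labeled, which is what makes the estimate usable under distribution shift.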
This list is automatically generated from the titles and abstracts of the papers in this site.