Hierarchical Decision Ensembles- An inferential framework for uncertain
Human-AI collaboration in forensic examinations
- URL: http://arxiv.org/abs/2111.01131v1
- Date: Sun, 31 Oct 2021 08:07:43 GMT
- Title: Hierarchical Decision Ensembles- An inferential framework for uncertain
Human-AI collaboration in forensic examinations
- Authors: Ganesh Krishnan, Heike Hofmann
- Abstract summary: We present an inferential framework for assessing the model and its output.
The framework is designed to calibrate trust in forensic experts by bridging the gap between domain-specific knowledge and predictive model results.
- Score: 0.8122270502556371
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Forensic examination of evidence like firearms and toolmarks traditionally
involves a visual, and therefore subjective, assessment of the similarity of two
questioned items. Statistical models are used to overcome this subjectivity and
allow specification of error rates. These models are generally quite complex
and produce abstract results at different levels of the analysis. Presenting
such metrics and complicated results to examiners is challenging, as examiners
generally do not have substantial statistical training to accurately interpret
results. This creates distrust in statistical modelling and lowers the rate of
acceptance of more objective measures that the discipline at large is striving
for. We present an inferential framework for assessing the model and its
output. The framework is designed to calibrate trust in forensic experts by
bridging the gap between domain-specific knowledge and predictive model
results, allowing forensic examiners to validate the claims of the predictive
model while critically assessing results.
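The abstract stays at a high level, so the sketch below only illustrates the general pattern it alludes to: a similarity score for a questioned comparison is placed against reference score distributions from known same-source and different-source pairs, and the error rates of the decision rule are reported alongside the result. The scores, threshold, and distributions are simulated stand-ins, not the paper's actual toolmark-similarity model.

```python
"""Illustrative sketch only: a score-based comparison with empirical error
rates, in the spirit of the objective measures the abstract describes."""
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical similarity scores for reference pairs of known ground truth.
same_source_scores = rng.normal(loc=0.80, scale=0.08, size=5_000)   # known matches
diff_source_scores = rng.normal(loc=0.35, scale=0.12, size=5_000)   # known non-matches

threshold = 0.60  # decision rule: report "identification" above this score

# Error rates the decision rule would have produced on the reference data.
false_negative_rate = np.mean(same_source_scores < threshold)
false_positive_rate = np.mean(diff_source_scores >= threshold)

def assess(observed_score: float) -> dict:
    """Place a new comparison's score in the context of the reference data."""
    return {
        "score": observed_score,
        "decision": "identification" if observed_score >= threshold else "exclusion",
        # How often known non-matches score at least this high (smaller = stronger support).
        "p_nonmatch_at_least": float(np.mean(diff_source_scores >= observed_score)),
        "false_positive_rate": float(false_positive_rate),
        "false_negative_rate": float(false_negative_rate),
    }

print(assess(0.72))
```

Reporting the reference error rates next to each decision, rather than the raw score alone, is one simple way to keep the model's output interpretable to an examiner without statistical training.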
Related papers
- The BRAVO Semantic Segmentation Challenge Results in UNCV2024 [68.20197719071436]
We define two categories of reliability: (1) semantic reliability, which reflects the model's accuracy and calibration when exposed to various perturbations; and (2) OOD reliability, which measures the model's ability to detect object classes that are unknown during training.
The results reveal interesting insights into the importance of large-scale pre-training and minimal architectural design in developing robust and reliable semantic segmentation models.
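As a hedged illustration of the calibration component of "semantic reliability", the sketch below computes a generic expected calibration error (ECE) from predicted confidences and labels; it is a standard metric, not the BRAVO challenge's official evaluation code.

```python
# Illustrative only: a generic expected calibration error (ECE), one common way
# to quantify the calibration part of "semantic reliability".
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Average |accuracy - confidence| over equal-width confidence bins,
    weighted by the fraction of predictions falling in each bin."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap
    return ece

# Toy usage: overconfident predictions yield a clearly non-zero ECE.
rng = np.random.default_rng(1)
conf = rng.uniform(0.6, 1.0, size=10_000)
correct = rng.random(10_000) < (conf - 0.15)   # accuracy lags confidence by 0.15
print(f"ECE = {expected_calibration_error(conf, correct):.3f}")
```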
arXiv Detail & Related papers (2024-09-23T15:17:30Z)
- A View on Out-of-Distribution Identification from a Statistical Testing Theory Perspective [0.24578723416255752]
We study the problem of efficiently detecting Out-of-Distribution (OOD) samples at test time in supervised and unsupervised learning contexts.
We re-formulate the OOD problem under the lenses of statistical testing and then discuss conditions that render the OOD problem identifiable in statistical terms.
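A minimal sketch of this testing view, assuming a generic scalar OOD score: calibrate a threshold on held-out in-distribution scores to a chosen false-alarm rate, then flag test points that exceed it. This is a standard recipe, not the specific test constructed in the paper.

```python
# Illustrative sketch of OOD detection as statistical testing: reject the
# "in-distribution" null hypothesis when the score is extreme relative to a
# threshold calibrated to a tolerated false-alarm rate alpha.
import numpy as np

rng = np.random.default_rng(2)

# Stand-in "OOD scores" (e.g., negative log-likelihood under a fitted model):
# higher means more surprising to the in-distribution model.
id_calibration_scores = rng.normal(0.0, 1.0, size=20_000)

alpha = 0.05                                          # tolerated false-alarm rate
threshold = np.quantile(id_calibration_scores, 1 - alpha)

def is_ood(score: float) -> bool:
    """Flag a test point whose score exceeds the calibrated threshold."""
    return bool(score > threshold)

test_scores = [0.3, 1.2, 4.5]                         # the last one looks out-of-distribution
print([is_ood(s) for s in test_scores], "threshold:", round(float(threshold), 3))
```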
arXiv Detail & Related papers (2024-05-05T21:06:07Z)
- Reliability and Interpretability in Science and Deep Learning [0.0]
This article focuses on the comparison between traditional scientific models and Deep Neural Network (DNN) models.
It argues that the high complexity of DNN models hinders the estimation of their reliability and also their prospects for long-term progress.
It also clarifies how interpretability is a precondition for assessing the reliability of any model, which cannot be based on statistical analysis alone.
arXiv Detail & Related papers (2024-01-14T20:14:07Z)
- A Comprehensive Evaluation and Analysis Study for Chinese Spelling Check [53.152011258252315]
We show that making reasonable use of phonetic and graphic information is effective for Chinese Spelling Check.
Models are sensitive to the error distribution of the test set, which exposes their shortcomings.
The commonly used benchmark, SIGHAN, cannot reliably evaluate models' performance.
arXiv Detail & Related papers (2023-07-25T17:02:38Z)
- Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks.
The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data.
Empirical results on multiple datasets offer compelling support for our theoretical assertions.
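One common way to realize quantile-based counterfactual inference is to abduct the individual's quantile level from the factual outcome and read off the same quantile under the alternative treatment. The sketch below does this with off-the-shelf gradient-boosted quantile regressors on synthetic data; it illustrates the general idea only, not the paper's neural-network estimator.

```python
# Hedged illustration of quantile-based counterfactual inference on synthetic data.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(3)
n = 4_000
x = rng.uniform(-2, 2, size=(n, 1))                  # covariate
t = rng.integers(0, 2, size=n)                       # binary treatment
u = rng.normal(size=n)                               # latent noise shared across arms
y = 1.5 * x[:, 0] + 2.0 * t + (0.5 + 0.3 * t) * u    # observed outcome

taus = np.linspace(0.05, 0.95, 9)                    # grid of quantile levels

def fit_arm(arm):
    """One quantile regressor per tau, fit on the units that received `arm`."""
    mask = t == arm
    return [GradientBoostingRegressor(loss="quantile", alpha=tau, n_estimators=50)
            .fit(x[mask], y[mask]) for tau in taus]

models = {0: fit_arm(0), 1: fit_arm(1)}

def counterfactual(x_i, t_obs, y_obs):
    """Abduct the quantile level that best explains the factual outcome,
    then predict at that same level under the alternative treatment."""
    x_i = np.asarray(x_i, dtype=float).reshape(1, -1)
    factual = np.array([m.predict(x_i)[0] for m in models[t_obs]])
    tau_idx = int(np.argmin(np.abs(factual - y_obs)))
    return models[1 - t_obs][tau_idx].predict(x_i)[0]

# Toy query: an untreated unit with covariate 0.5 and observed outcome 1.4.
print("counterfactual outcome under treatment:", round(counterfactual([0.5], 0, 1.4), 2))
```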
arXiv Detail & Related papers (2023-06-09T08:30:51Z)
- Fairness Increases Adversarial Vulnerability [50.90773979394264]
This paper shows the existence of a dichotomy between fairness and robustness, and analyzes when achieving fairness decreases the model robustness to adversarial samples.
Experiments on non-linear models and different architectures validate the theoretical findings in multiple vision domains.
The paper proposes a simple, yet effective, solution to construct models achieving good tradeoffs between fairness and robustness.
arXiv Detail & Related papers (2022-11-21T19:55:35Z)
- A Comparative Study of Faithfulness Metrics for Model Interpretability Methods [3.7200349581269996]
We introduce two assessment dimensions, namely diagnosticity and time complexity.
According to the experimental results, we find that the sufficiency and comprehensiveness metrics have higher diagnosticity and lower time complexity than the other faithfulness metrics.
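For reference, sufficiency and comprehensiveness are usually computed as the change in the model's prediction when keeping only, or removing, the rationale tokens. The sketch below spells out those two standard definitions with a toy bag-of-words scorer standing in for a real classifier.

```python
# Illustrative definitions of sufficiency and comprehensiveness, using a toy
# bag-of-words "model" so the example runs on its own; a real evaluation would
# call an actual classifier's predicted probability instead.
import numpy as np

WEIGHTS = {"great": 2.0, "good": 1.0, "boring": -1.5, "awful": -2.5}   # toy lexicon

def predict_prob(tokens):
    """Toy positive-sentiment probability from summed word weights."""
    score = sum(WEIGHTS.get(tok, 0.0) for tok in tokens)
    return 1.0 / (1.0 + np.exp(-score))

def comprehensiveness(tokens, rationale):
    """Prediction drop when the rationale tokens are removed.
    Higher is better: the explanation really carried the prediction."""
    kept = [tok for tok in tokens if tok not in rationale]
    return predict_prob(tokens) - predict_prob(kept)

def sufficiency(tokens, rationale):
    """Prediction drop when only the rationale tokens remain.
    Lower is better: the explanation alone is nearly enough."""
    return predict_prob(tokens) - predict_prob([tok for tok in tokens if tok in rationale])

tokens = "the plot was great but the ending was boring".split()
rationale = {"great", "boring"}                      # tokens an explanation method flagged
print("comprehensiveness:", round(comprehensiveness(tokens, rationale), 3))
print("sufficiency:", round(sufficiency(tokens, rationale), 3))
```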
arXiv Detail & Related papers (2022-04-12T04:02:17Z)
- Unveiling Project-Specific Bias in Neural Code Models [20.131797671630963]
Neural code models based on Large Language Models (LLMs) often struggle to generalize effectively to real-world inter-project out-of-distribution (OOD) data.
We show that this phenomenon is caused by the heavy reliance on project-specific shortcuts for prediction instead of ground-truth evidence.
We propose a novel bias mitigation mechanism that regularizes the model's learning behavior by leveraging latent logic relations among samples.
arXiv Detail & Related papers (2022-01-19T02:09:48Z)
- Causality and Generalizability: Identifiability and Learning Methods [0.0]
This thesis contributes to the research areas concerning the estimation of causal effects, causal structure learning, and distributionally robust prediction methods.
We present novel and consistent linear and non-linear causal effects estimators in instrumental variable settings that employ data-dependent mean squared prediction error regularization.
We propose a general framework for distributional robustness with respect to intervention-induced distributions.
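For context on the instrumental-variable setting mentioned here, the sketch below shows a textbook two-stage least squares (2SLS) estimate recovering a causal effect under hidden confounding on simulated data; the thesis's regularized linear and non-linear estimators are not reproduced.

```python
# Context sketch only: textbook two-stage least squares (2SLS) in a simulated
# instrumental-variable setting with a hidden confounder.
import numpy as np

rng = np.random.default_rng(4)
n = 50_000
h = rng.normal(size=n)                        # hidden confounder
z = rng.normal(size=n)                        # instrument: affects x, not y directly
x = 0.8 * z + h + rng.normal(size=n)          # endogenous regressor
y = 1.5 * x + 2.0 * h + rng.normal(size=n)    # true causal effect of x on y is 1.5

# Naive OLS is biased because h drives both x and y.
ols_slope = np.polyfit(x, y, 1)[0]

# 2SLS: regress x on z, then y on the fitted x-hat.
slope_zx, intercept_zx = np.polyfit(z, x, 1)
x_hat = slope_zx * z + intercept_zx
two_sls_slope = np.polyfit(x_hat, y, 1)[0]

print(f"OLS estimate  = {ols_slope:.2f} (biased upward)")
print(f"2SLS estimate = {two_sls_slope:.2f} (close to the true 1.5)")
```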
arXiv Detail & Related papers (2021-10-04T13:12:11Z)
- A comprehensive comparative evaluation and analysis of Distributional Semantic Models [61.41800660636555]
We perform a comprehensive evaluation of type distributional vectors, either produced by static DSMs or obtained by averaging the contextualized vectors generated by BERT.
The results show that the alleged superiority of predict based models is more apparent than real, and surely not ubiquitous.
We borrow from cognitive neuroscience the methodology of Representational Similarity Analysis (RSA) to inspect the semantic spaces generated by distributional models.
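A minimal sketch of RSA applied to semantic spaces: build a pairwise cosine-similarity matrix per model over the same word list, then correlate the matrices' upper triangles. Random vectors below stand in for real DSM or BERT-derived embeddings, so the reported agreement will be near zero.

```python
# Hedged illustration of Representational Similarity Analysis (RSA) between two
# embedding spaces; random vectors stand in for real distributional models.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(5)
n_words, dim_a, dim_b = 50, 300, 768
space_a = rng.normal(size=(n_words, dim_a))          # e.g., a static DSM
space_b = rng.normal(size=(n_words, dim_b))          # e.g., averaged contextual vectors

def similarity_matrix(vectors):
    """Cosine similarity between every pair of word vectors."""
    normed = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    return normed @ normed.T

iu = np.triu_indices(n_words, k=1)                   # unique word pairs only
rho, _ = spearmanr(similarity_matrix(space_a)[iu], similarity_matrix(space_b)[iu])
print(f"second-order (RSA) agreement between the two spaces: rho = {rho:.3f}")
```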
arXiv Detail & Related papers (2021-05-20T15:18:06Z)
- Characterizing Fairness Over the Set of Good Models Under Selective Labels [69.64662540443162]
We develop a framework for characterizing predictive fairness properties over the set of models that deliver similar overall performance.
We provide tractable algorithms to compute the range of attainable group-level predictive disparities.
We extend our framework to address the empirically relevant challenge of selectively labelled data.
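As a toy illustration of the "set of good models" idea, the sketch below sweeps a one-parameter family of threshold classifiers on synthetic data, keeps those within epsilon of the best accuracy, and reports the range of group-level positive-rate gaps they attain; the paper's framework is considerably more general.

```python
# Illustrative sketch: how much can a group-level disparity vary across models
# whose accuracy is within epsilon of the best? Synthetic data and a simple
# threshold-classifier family stand in for the paper's general framework.
import numpy as np

rng = np.random.default_rng(6)
n = 20_000
group = rng.integers(0, 2, size=n)                        # protected attribute
score = rng.normal(loc=0.2 * group, scale=1.0, size=n)    # model input score
label = (score + rng.normal(scale=0.8, size=n) > 0).astype(int)

thresholds = np.linspace(-1.0, 1.0, 201)
acc, gap = [], []
for thr in thresholds:
    pred = (score > thr).astype(int)
    acc.append(np.mean(pred == label))
    gap.append(abs(pred[group == 0].mean() - pred[group == 1].mean()))
acc, gap = np.array(acc), np.array(gap)

epsilon = 0.01                                            # tolerated accuracy loss
good = acc >= acc.max() - epsilon                         # the "set of good models"
print(f"{good.sum()} near-optimal thresholds; "
      f"positive-rate gap ranges from {gap[good].min():.3f} to {gap[good].max():.3f}")
```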
arXiv Detail & Related papers (2021-01-02T02:11:37Z)