Statistical Modeling of Data Breach Risks: Time to Identification and
Notification
- URL: http://arxiv.org/abs/2209.07306v2
- Date: Sat, 24 Sep 2022 15:19:04 GMT
- Title: Statistical Modeling of Data Breach Risks: Time to Identification and
Notification
- Authors: Maochao Xu and Quynh Nhu Nguyen
- Abstract summary: We propose a novel approach to imputing the missing data, and further develop a dependence model to capture the complex pattern exhibited by those two metrics.
The empirical study shows that the proposed approach has a satisfactory predictive performance and is superior to other commonly used models.
- Score: 2.132096006921048
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Predicting the cost of a cyber incident is very challenging owing
to the complex nature of cyber risk, yet it is unavoidable for insurance
companies that offer cyber insurance policies. The time to identify an
incident and the time to notify the affected individuals are two important
components in determining the cost of a cyber incident. In this work, we
initiate the study of these two metrics via statistical modeling approaches.
In particular, we propose a novel approach to imputing the missing data, and
further develop a dependence model to capture the complex pattern exhibited
by these two metrics. The empirical study shows that the proposed approach has
satisfactory predictive performance and is superior to other commonly used
models.
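The abstract does not specify the form of the dependence model. As an illustration only, the following sketch couples the two time metrics (time to identification, time to notification) through a Gaussian copula with log-normal marginals; the copula family, the marginals, the correlation value, and the parameter choices are all assumptions for demonstration, not taken from the paper.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Assumed rank dependence between the two metrics (illustrative value only)
rho = 0.6
cov = np.array([[1.0, rho], [rho, 1.0]])

# Gaussian copula: correlated normals -> uniform margins -> chosen marginals
z = rng.multivariate_normal([0.0, 0.0], cov, size=10_000)
u = stats.norm.cdf(z)

# Hypothetical log-normal marginals, in days (not the paper's fitted models)
t_identify = stats.lognorm.ppf(u[:, 0], s=1.0, scale=90.0)  # median ~90 days
t_notify = stats.lognorm.ppf(u[:, 1], s=0.8, scale=30.0)    # median ~30 days

# The induced Spearman correlation should sit near the assumed dependence
print(round(float(stats.spearmanr(t_identify, t_notify)[0]), 2))
```

Separating the marginal distributions from the dependence structure in this way is the standard motivation for copula models: each time metric can be fitted on its own, and the joint behavior is layered on afterwards.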
Related papers
- Cyber Risk Taxonomies: Statistical Analysis of Cybersecurity Risk Classifications [0.0]
We argue in favour of shifting attention from goodness-of-fit and in-sample performance to out-of-sample forecasting performance.
Our results indicate that business motivated cyber risk classifications appear to be too restrictive and not flexible enough to capture the heterogeneity of cyber risk events.
arXiv Detail & Related papers (2024-10-04T04:12:34Z) - Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models [112.48136829374741]
In this paper, we unveil a new vulnerability: the privacy backdoor attack.
When a victim fine-tunes a backdoored model, their training data will be leaked at a significantly higher rate than if they had fine-tuned a typical model.
Our findings highlight a critical privacy concern within the machine learning community and call for a reevaluation of safety protocols in the use of open-source pre-trained models.
arXiv Detail & Related papers (2024-04-01T16:50:54Z) - A Data-Driven Predictive Analysis on Cyber Security Threats with Key Risk Factors [1.715270928578365]
This paper presents a Machine Learning (ML) based model for predicting individuals who may be victims of cyber attacks by analyzing socioeconomic factors.
We propose a novel Pertinent Features Random Forest (RF) model, which achieved maximum accuracy (95.95%) with 20 features.
We generated 10 important association rules and present a framework that is rigorously evaluated on real-world datasets.
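The paper's dataset and its 20 selected features are not reproduced in this summary; the sketch below shows the general pattern of a random-forest classifier with feature-importance ranking on synthetic stand-in data, so the feature counts, sample sizes, and scores here are illustrative only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for socioeconomic survey data (not the paper's dataset)
X, y = make_classification(n_samples=2000, n_features=20,
                           n_informative=8, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          random_state=42)

rf = RandomForestClassifier(n_estimators=200, random_state=42).fit(X_tr, y_tr)

# Impurity-based feature importances rank which inputs drive the prediction
top5 = np.argsort(rf.feature_importances_)[::-1][:5]
print(round(rf.score(X_te, y_te), 3), top5)
```

In a feature-selection workflow like the one the paper describes, this importance ranking would be used to prune the input set before refitting and reporting accuracy on the reduced features.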
arXiv Detail & Related papers (2024-03-28T09:41:24Z) - Reconciling AI Performance and Data Reconstruction Resilience for
Medical Imaging [52.578054703818125]
Artificial Intelligence (AI) models are vulnerable to information leakage of their training data, which can be highly sensitive.
Differential Privacy (DP) aims to circumvent these susceptibilities by setting a quantifiable privacy budget.
We show that using very large privacy budgets can render reconstruction attacks impossible, while drops in performance are negligible.
arXiv Detail & Related papers (2023-12-05T12:21:30Z) - MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data
Augmentation [58.93221876843639]
This paper introduces a novel, end-to-end diffusion-based risk prediction model, named MedDiffusion.
It enhances risk prediction performance by creating synthetic patient data during training to enlarge the sample space.
It discerns hidden relationships between patient visits using a step-wise attention mechanism, enabling the model to automatically retain the most vital information for generating high-quality data.
arXiv Detail & Related papers (2023-10-04T01:36:30Z) - Designing an attack-defense game: how to increase robustness of
financial transaction models via a competition [69.08339915577206]
Given the escalating risks of malicious attacks in the finance sector, understanding adversarial strategies and robust defense mechanisms for machine learning models is critical.
We aim to investigate the current state and dynamics of adversarial attacks and defenses for neural network models that use sequential financial data as the input.
We have designed a competition that allows realistic and detailed investigation of problems in modern financial transaction data.
The participants compete directly against each other, so possible attacks and defenses are examined in close-to-real-life conditions.
arXiv Detail & Related papers (2023-08-22T12:53:09Z) - An engine to simulate insurance fraud network data [1.3812010983144802]
We develop a simulation machine that is engineered to create synthetic data with a network structure.
We can specify the total number of policyholders and parties, the desired level of imbalance and the (effect size of the) features in the fraud generating model.
The simulation engine enables researchers and practitioners to examine several methodological challenges as well as to test their (development strategy of) insurance fraud detection models.
arXiv Detail & Related papers (2023-08-21T13:14:00Z) - Safe AI for health and beyond -- Monitoring to transform a health
service [51.8524501805308]
We will assess the infrastructure required to monitor the outputs of a machine learning algorithm.
We will present two scenarios with examples of monitoring and updates of models.
arXiv Detail & Related papers (2023-03-02T17:27:45Z) - SurvivalGAN: Generating Time-to-Event Data for Survival Analysis [121.84429525403694]
Imbalances in censoring and time horizons cause generative models to experience three new failure modes specific to survival analysis.
We propose SurvivalGAN, a generative model that handles survival data by addressing the imbalance in the censoring and event horizons.
We evaluate this method via extensive experiments on medical datasets.
arXiv Detail & Related papers (2023-02-24T17:03:51Z) - A robust statistical framework for cyber-vulnerability prioritisation under partial information in threat intelligence [0.0]
This work introduces a robust statistical framework for quantitative and qualitative reasoning under uncertainty about cyber-vulnerabilities.
We identify a novel accuracy measure suited for rank invariance under partial knowledge of the whole set of existing vulnerabilities.
We discuss the implications of partial knowledge about cyber-vulnerabilities on threat intelligence and decision-making in operational scenarios.
arXiv Detail & Related papers (2023-02-16T15:05:43Z) - Modeling Multivariate Cyber Risks: Deep Learning Dating Extreme Value
Theory [6.451038884092264]
The proposed model enjoys highly accurate point predictions via deep learning and high-quantile predictions via extreme value theory.
The empirical evidence based on real honeypot attack data also shows that the proposed model has very satisfactory prediction performances.
arXiv Detail & Related papers (2021-03-15T15:18:53Z)
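The deep-learning component of the model above is not reproducible from the abstract, but the extreme-value side follows a standard peaks-over-threshold recipe: fit a Generalized Pareto distribution to exceedances above a high threshold and read off extreme quantiles from the fitted tail. The sketch below applies that recipe to simulated heavy-tailed data; the data, threshold level, and quantile are illustrative assumptions, not the paper's honeypot setup.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Simulated heavy-tailed "attack intensity" data standing in for honeypot logs
data = stats.pareto.rvs(b=2.5, size=5000, random_state=rng)

# Peaks-over-threshold: fit a Generalized Pareto to exceedances over u0
u0 = np.quantile(data, 0.95)
exceed = data[data > u0] - u0
shape, loc, scale = stats.genpareto.fit(exceed, floc=0.0)

# Extreme quantile (here 99.9%) from the fitted tail:
# P(X > x) ~= (n_u / n) * GPD_survival(x - u0)
p, n, n_u = 0.999, len(data), len(exceed)
q = u0 + stats.genpareto.ppf(1 - (1 - p) * n / n_u, shape, loc=0.0, scale=scale)
print(round(shape, 2), round(q, 1))
```

A positive fitted shape parameter indicates a heavy tail, which is exactly the regime where empirical quantiles undershoot and the EVT-based estimate earns its keep.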
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.