Information Leakage Detection through Approximate Bayes-optimal
Prediction
- URL: http://arxiv.org/abs/2401.14283v1
- Date: Thu, 25 Jan 2024 16:15:27 GMT
- Title: Information Leakage Detection through Approximate Bayes-optimal
Prediction
- Authors: Pritha Gupta, Marcel Wever, and Eyke Hüllermeier
- Abstract summary: Information leakage (IL) raises security concerns in today's data-driven world.
Conventional statistical approaches, which estimate mutual information (MI) between observable and secret information for detecting IL, face challenges such as the curse of dimensionality, convergence, computational complexity, and MI misestimation.
We establish a theoretical framework using statistical learning theory and information theory to accurately quantify and detect IL.
- Score: 5.23890938002044
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In today's data-driven world, the proliferation of publicly available
information intensifies the challenge of information leakage (IL), raising
security concerns. IL involves unintentionally exposing secret (sensitive)
information to unauthorized parties via systems' observable information.
Conventional statistical approaches, which estimate mutual information (MI)
between observable and secret information for detecting IL, face challenges
such as the curse of dimensionality, convergence, computational complexity, and
MI misestimation. Furthermore, emerging supervised machine learning (ML)
methods, though effective, are limited to binary system-sensitive information
and lack a comprehensive theoretical framework. To address these limitations,
we establish a theoretical framework using statistical learning theory and
information theory to accurately quantify and detect IL. We demonstrate that MI
can be accurately estimated by approximating the log-loss and accuracy of the
Bayes predictor. As the Bayes predictor is typically unknown in practice, we
propose to approximate it with the help of automated machine learning (AutoML).
First, we compare our MI estimation approaches against current baselines, using
synthetic data sets generated using the multivariate normal (MVN) distribution
with known MI. Second, we introduce a cut-off technique using one-sided
statistical tests to detect IL, employing the Holm-Bonferroni correction to
increase confidence in detection decisions. Our study evaluates IL detection
performance on real-world data sets, highlighting the effectiveness of the
Bayes predictor's log-loss estimation, and finds our proposed method to
effectively estimate MI on synthetic data sets and thus detect ILs accurately.
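The two ideas in the abstract, estimating MI from a predictor's log-loss and correcting multiple one-sided tests with Holm-Bonferroni, can be illustrated with a minimal sketch. It is not the authors' implementation. It relies on the standard identity MI(X;Y) = H(Y) - H(Y|X), where H(Y|X) equals the expected log-loss of the Bayes predictor. As assumptions made purely for illustration, a Laplace-smoothed counting estimator stands in for the AutoML-learned predictor, and a binary symmetric channel (whose MI is known in closed form) stands in for the MVN data sets. All function names are hypothetical.

```python
import math
import random
from collections import Counter


def entropy_nats(counts):
    """Shannon entropy (in nats) of the empirical distribution given by counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values() if c > 0)


def fit_plugin_posterior(pairs, classes, smoothing=1.0):
    """Estimate p(y | x) by Laplace-smoothed counting.

    Assumption: a simple stand-in for a learned predictor; only works for discrete x.
    """
    joint = Counter(pairs)
    x_totals = Counter(x for x, _ in pairs)

    def posterior(x, y):
        return (joint[(x, y)] + smoothing) / (x_totals[x] + smoothing * len(classes))

    return posterior


def mi_from_log_loss(train, test, classes):
    """MI estimate in nats: H(Y) minus the held-out log-loss of the fitted predictor."""
    posterior = fit_plugin_posterior(train, classes)
    h_y = entropy_nats(Counter(y for _, y in test))
    log_loss = -sum(math.log(posterior(x, y)) for x, y in test) / len(test)
    return max(0.0, h_y - log_loss)  # MI is non-negative, so clip at zero


def holm_bonferroni(p_values, alpha=0.05):
    """Holm-Bonferroni step-down procedure; True marks a rejected null hypothesis."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] > alpha / (m - rank):
            break  # stop at the first hypothesis that cannot be rejected
        reject[idx] = True
    return reject


def sample_channel(n, flip_prob, rng):
    """Secret bit y ~ Bernoulli(0.5); observable x is y flipped with prob. flip_prob."""
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = y ^ (rng.random() < flip_prob)
        data.append((x, y))
    return data


if __name__ == "__main__":
    rng = random.Random(0)
    leaky_train = sample_channel(20000, 0.1, rng)
    leaky_test = sample_channel(20000, 0.1, rng)
    print("estimated MI (nats):", mi_from_log_loss(leaky_train, leaky_test, [0, 1]))
    print("Holm decisions:", holm_bonferroni([0.01, 0.04, 0.03, 0.005]))
```

For a flip probability of 0.1 the true MI is ln 2 - H_b(0.1), so the estimate can be checked against a known value. With a flip probability of 0.5 the observable carries no information and the estimate should be close to zero.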
Related papers
- Uncertainty, Calibration, and Membership Inference Attacks: An
Information-Theoretic Perspective [46.08491133624608]
We analyze the performance of the state-of-the-art likelihood ratio attack (LiRA) within an information-theoretical framework.
We derive bounds on the advantage of an MIA adversary with the aim of offering insights into the impact of uncertainty and calibration on the effectiveness of MIAs.
arXiv Detail & Related papers (2024-02-16T13:41:18Z) - Dynamic Model Agnostic Reliability Evaluation of Machine-Learning
Methods Integrated in Instrumentation & Control Systems [1.8978726202765634]
Trustworthiness of data-driven, neural network-based machine learning algorithms is not adequately assessed.
In recent reports by the National Institute for Standards and Technology, trustworthiness in ML is a critical barrier to adoption.
We demonstrate a real-time model-agnostic method to evaluate the relative reliability of ML predictions by incorporating out-of-distribution detection on the training dataset.
arXiv Detail & Related papers (2023-08-08T18:25:42Z) - Applied Bayesian Structural Health Monitoring: inclinometer data anomaly
detection and forecasting [0.0]
Inclinometer probes are devices that can be used to measure deformations within earthwork slopes.
This paper demonstrates a novel application of Bayesian techniques to real-world inclinometer data.
arXiv Detail & Related papers (2023-07-01T11:28:43Z) - LMD: Light-weight Prediction Quality Estimation for Object Detection in
Lidar Point Clouds [3.927702899922668]
Object detection on Lidar point cloud data is a promising technology for autonomous driving and robotics.
Uncertainty estimation is a crucial component for down-stream tasks and deep neural networks remain error-prone even for predictions with high confidence.
We propose LidarMetaDetect, a light-weight post-processing scheme for prediction quality estimation.
Our experiments show a significant increase of statistical reliability in separating true from false predictions.
arXiv Detail & Related papers (2023-06-13T15:13:29Z) - Uncertainty Estimation by Fisher Information-based Evidential Deep
Learning [61.94125052118442]
Uncertainty estimation is a key factor that makes deep learning reliable in practical applications.
We propose a novel method, Fisher Information-based Evidential Deep Learning ($\mathcal{I}$-EDL).
In particular, we introduce Fisher Information Matrix (FIM) to measure the informativeness of evidence carried by each sample, according to which we can dynamically reweight the objective loss terms to make the network more focused on the representation learning of uncertain classes.
arXiv Detail & Related papers (2023-03-03T16:12:59Z) - Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence, predicting accuracy as the fraction of unlabeled examples for which model confidence exceeds that threshold.
arXiv Detail & Related papers (2022-01-11T23:01:12Z) - Multiple Imputation via Generative Adversarial Network for
High-dimensional Blockwise Missing Value Problems [6.123324869194195]
We propose Multiple Imputation via Generative Adversarial Network (MI-GAN), a deep learning-based (specifically, GAN-based) multiple imputation method.
MI-GAN shows strong performance matching existing state-of-the-art imputation methods on high-dimensional datasets.
In particular, MI-GAN significantly outperforms other imputation methods in the sense of statistical inference and computational speed.
arXiv Detail & Related papers (2021-12-21T20:19:37Z) - Tight Mutual Information Estimation With Contrastive Fenchel-Legendre
Optimization [69.07420650261649]
We introduce a novel, simple, and powerful contrastive MI estimator named FLO.
Empirically, our FLO estimator overcomes the limitations of its predecessors and learns more efficiently.
The utility of FLO is verified using an extensive set of benchmarks, which also reveals the trade-offs in practical MI estimation.
arXiv Detail & Related papers (2021-07-02T15:20:41Z) - Statistical control for spatio-temporal MEG/EEG source imaging with
desparsified multi-task Lasso [102.84915019938413]
Non-invasive techniques like magnetoencephalography (MEG) or electroencephalography (EEG) promise to reveal where and when brain regions activate.
The problem of source localization, or source imaging, however, poses a high-dimensional statistical inference challenge.
We propose an ensemble of desparsified multi-task Lasso (ecd-MTLasso) to deal with this problem.
arXiv Detail & Related papers (2020-09-29T21:17:16Z) - Estimating Structural Target Functions using Machine Learning and
Influence Functions [103.47897241856603]
We propose a new framework for statistical machine learning of target functions arising as identifiable functionals from statistical models.
This framework is problem- and model-agnostic and can be used to estimate a broad variety of target parameters of interest in applied statistics.
We put particular focus on so-called coarsening at random/doubly robust problems with partially unobserved information.
arXiv Detail & Related papers (2020-08-14T16:48:29Z) - Bayesian Optimization with Machine Learning Algorithms Towards Anomaly
Detection [66.05992706105224]
In this paper, an effective anomaly detection framework is proposed utilizing the Bayesian Optimization technique.
The performance of the considered algorithms is evaluated using the ISCX 2012 dataset.
Experimental results show the effectiveness of the proposed framework in terms of accuracy, precision, low false-alarm rate, and recall.
arXiv Detail & Related papers (2020-08-05T19:29:35Z)