Anomaly Detection Under Uncertainty Using Distributionally Robust
Optimization Approach
- URL: http://arxiv.org/abs/2312.01296v1
- Date: Sun, 3 Dec 2023 06:13:22 GMT
- Title: Anomaly Detection Under Uncertainty Using Distributionally Robust
Optimization Approach
- Authors: Amir Hossein Noormohammadi, Seyed Ali MirHassani, Farnaz Hooshmand
Khaligh
- Abstract summary: Anomaly detection is defined as the problem of finding data points that do not follow the patterns of the majority.
The one-class Support Vector Machines (SVM) method aims to find a decision boundary to distinguish between normal data points and anomalies.
A distributionally robust chance-constrained model is proposed in which the probability of misclassification is constrained to be low.
- Score: 0.9217021281095907
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Anomaly detection is defined as the problem of finding data points that do
not follow the patterns of the majority. Among the various proposed methods for
solving this problem, classification-based methods, including one-class Support
Vector Machines (SVM) are considered effective and state-of-the-art. The
one-class SVM method aims to find a decision boundary to distinguish between
normal data points and anomalies using only the normal data. On the other hand,
most real-world problems involve some degree of uncertainty, where the true
probability distribution of each data point is unknown, and estimating it is
often difficult and costly. Assuming partial distribution information such as
the first and second-order moments is known, a distributionally robust
chance-constrained model is proposed in which the probability of
misclassification is low. By utilizing a mapping function to a higher
dimensional space, the proposed model will be capable of classifying
origin-inseparable datasets. Also, by adopting the kernel idea, the need for
explicitly knowing the mapping is eliminated, computations can be performed in
the input space, and computational complexity is reduced. Computational results
validate the robustness of the proposed model under different probability
distributions and also the superiority of the proposed model compared to the
standard one-class SVM in terms of various evaluation metrics.
Related papers
- Bayesian Estimation and Tuning-Free Rank Detection for Probability Mass Function Tensors [17.640500920466984]
This paper presents a novel framework for estimating the joint PMF and automatically inferring its rank from observed data.
We derive a deterministic solution based on variational inference (VI) to approximate the posterior distributions of various model parameters. Additionally, we develop a scalable version of the VI-based approach by leveraging stochastic variational inference (SVI).
Experiments involving both synthetic data and real movie recommendation data illustrate the advantages of our VI and SVI-based methods in terms of estimation accuracy, automatic rank detection, and computational efficiency.
arXiv Detail & Related papers (2024-10-08T20:07:49Z)
- A Mallows-like Criterion for Anomaly Detection with Random Forest Implementation [7.569443648362081]
This paper proposes a novel criterion to select the weights on aggregation of multiple models, wherein the focal loss function accounts for the classification of extremely imbalanced data.
We have evaluated the proposed method on benchmark datasets across various domains, including network intrusion.
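The aggregation criterion itself is not given in this summary, but the focal loss it builds on is standard. The sketch below implements the binary focal loss (Lin et al.), which down-weights easy, confidently classified examples so that extreme class imbalance does not dominate the loss; the `alpha` and `gamma` values are common defaults, assumed here for illustration.

```python
# Hedged sketch: standard binary focal loss, not the paper's full criterion.
import numpy as np

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss; p = predicted P(y=1), y in {0, 1}."""
    p = np.clip(p, 1e-7, 1 - 1e-7)
    pt = np.where(y == 1, p, 1 - p)          # probability assigned to the true class
    at = np.where(y == 1, alpha, 1 - alpha)  # class-balancing weight
    return -at * (1 - pt) ** gamma * np.log(pt)

# A confident correct prediction is down-weighted far more than a confident error.
easy = focal_loss(np.array([0.95]), np.array([1]))
hard = focal_loss(np.array([0.05]), np.array([1]))
```

The `(1 - pt) ** gamma` modulating factor is what makes the loss suitable for the extremely imbalanced data the summary mentions: well-classified majority-class points contribute almost nothing.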
arXiv Detail & Related papers (2024-05-29T09:36:57Z)
- Cost-sensitive probabilistic predictions for support vector machines [1.743685428161914]
Support vector machines (SVMs) are among the most widely used and thoroughly studied machine learning models.
We propose a novel approach to generate probabilistic outputs for the SVM.
arXiv Detail & Related papers (2023-10-09T11:00:17Z)
- Learning to Bound Counterfactual Inference in Structural Causal Models from Observational and Randomised Data [64.96984404868411]
We derive a likelihood characterisation for the overall data that leads us to extend a previous EM-based algorithm.
The new algorithm learns to approximate the (unidentifiability) region of model parameters from such mixed data sources.
It delivers interval approximations to counterfactual results, which collapse to points in the identifiable case.
arXiv Detail & Related papers (2022-12-06T12:42:11Z)
- Marginalization in Bayesian Networks: Integrating Exact and Approximate Inference [0.0]
Missing data and hidden variables require calculating the marginal probability distribution of a subset of the variables.
We develop a divide-and-conquer approach using the graphical properties of Bayesian networks.
We present an efficient and scalable algorithm for estimating the marginal probability distribution for categorical variables.
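The paper's divide-and-conquer algorithm over the network structure is not reproduced here; the sketch below shows only the underlying operation it scales up, namely summing a categorical joint distribution over the unobserved axes to obtain a marginal. The joint table is a toy assumption for illustration.

```python
# Hedged sketch: brute-force marginalization of a categorical joint,
# the operation the paper's divide-and-conquer algorithm makes scalable.
import numpy as np

# Toy joint P(A, B, C) over three binary variables (axes 0, 1, 2).
joint = np.array([[[0.10, 0.05], [0.05, 0.10]],
                  [[0.20, 0.10], [0.10, 0.30]]])
assert np.isclose(joint.sum(), 1.0)  # a valid distribution sums to 1

# Marginal P(A): sum out B (axis 1) and C (axis 2).
p_a = joint.sum(axis=(1, 2))
```

Summing over all unobserved axes costs time exponential in the number of hidden variables, which is why exploiting the Bayesian network's graphical structure matters.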
arXiv Detail & Related papers (2021-12-16T21:49:52Z)
- Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning [78.83598532168256]
Marginal-likelihood based model-selection is rarely used in deep learning due to estimation difficulties.
Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable.
arXiv Detail & Related papers (2021-04-11T09:50:24Z)
- Model-based clustering of partial records [11.193504036335503]
We develop clustering methodology through a model-based approach using the marginal density for the observed values.
We compare our algorithm to the corresponding full expectation-maximization (EM) approach that considers the missing values in the incomplete data set.
Simulation studies demonstrate that our approach has favorable recovery of the true cluster partition compared to case deletion and imputation.
arXiv Detail & Related papers (2021-03-30T13:30:59Z)
- Amortized Conditional Normalized Maximum Likelihood: Reliable Out of Distribution Uncertainty Estimation [99.92568326314667]
We propose the amortized conditional normalized maximum likelihood (ACNML) method as a scalable general-purpose approach for uncertainty estimation.
Our algorithm builds on the conditional normalized maximum likelihood (CNML) coding scheme, which has minimax optimal properties according to the minimum description length principle.
We demonstrate that ACNML compares favorably to a number of prior techniques for uncertainty estimation in terms of calibration on out-of-distribution inputs.
arXiv Detail & Related papers (2020-11-05T08:04:34Z)
- Identification of Probability weighted ARX models with arbitrary domains [75.91002178647165]
PieceWise Affine models guarantee universal approximation, local linearity, and equivalence to other classes of hybrid systems.
In this work, we focus on the identification of PieceWise Auto Regressive with eXogenous input models with arbitrary regions (NPWARX).
The architecture is conceived following the Mixture of Expert concept, developed within the machine learning field.
arXiv Detail & Related papers (2020-07-07T18:25:25Z)
- Learning while Respecting Privacy and Robustness to Distributional Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
Proposed algorithms offer robustness with little overhead.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
- Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.