Tradeoffs in Streaming Binary Classification under Limited Inspection Resources
- URL: http://arxiv.org/abs/2110.02403v1
- Date: Tue, 5 Oct 2021 23:23:11 GMT
- Title: Tradeoffs in Streaming Binary Classification under Limited Inspection Resources
- Authors: Parisa Hassanzadeh, Danial Dervovic, Samuel Assefa, Prashant Reddy,
Manuela Veloso
- Abstract summary: We consider an imbalanced binary classification problem, where events arrive sequentially and only a limited number of suspicious events can be inspected.
We analytically characterize the tradeoff between the minority-class detection rate and the inspection capacity.
We implement the selection methods on a real public fraud detection dataset and compare the empirical results with analytical bounds.
- Score: 14.178224954581069
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Institutions are increasingly relying on machine learning models to identify
and alert on abnormal events, such as fraud, cyber attacks and system failures.
These alerts often need to be manually investigated by specialists. Given the
operational cost of manual inspections, the suspicious events are selected by
alerting systems with carefully designed thresholds. In this paper, we consider
an imbalanced binary classification problem, where events arrive sequentially
and only a limited number of suspicious events can be inspected. We model the
event arrivals as a non-homogeneous Poisson process, and compare various
suspicious event selection methods including those based on static and adaptive
thresholds. For each method, we analytically characterize the tradeoff between
the minority-class detection rate and the inspection capacity as a function of
the data class imbalance and the classifier confidence score densities. We
implement the selection methods on a real public fraud detection dataset and
compare the empirical results with analytical bounds. Finally, we investigate
how class imbalance and the choice of classifier impact the tradeoff.
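As a rough illustration of this setup, the sketch below simulates a stream of scored events arriving as a non-homogeneous Poisson process and compares a static score threshold with a simple budget-aware adaptive threshold under a fixed inspection capacity. The sinusoidal intensity, the Beta score densities, and the specific adaptive rule are illustrative assumptions for the sketch, not the constructions analyzed in the paper.

```python
# A minimal simulation sketch of the setting above; the intensity function,
# score densities, and adaptive rule are assumptions, not the paper's choices.
import numpy as np

rng = np.random.default_rng(0)

def simulate_arrivals(T=200.0, lam_max=5.0):
    """Non-homogeneous Poisson arrivals on [0, T] generated by thinning,
    with assumed intensity lam(t) = lam_max * (1 + sin(t)) / 2."""
    t, times = 0.0, []
    while True:
        t += rng.exponential(1.0 / lam_max)
        if t > T:
            break
        if rng.random() < (1.0 + np.sin(t)) / 2.0:  # accept w.p. lam(t)/lam_max
            times.append(t)
    return np.array(times)

def draw_labels_and_scores(n, p_minority=0.02):
    """Assumed class imbalance and classifier score densities:
    minority (e.g., fraud) events tend to receive higher scores."""
    y = rng.random(n) < p_minority
    scores = np.where(y, rng.beta(5, 2, n), rng.beta(2, 5, n))
    return y, scores

def detection_rate(y, scores, capacity, threshold_fn):
    """Process events in arrival order, inspect while budget remains,
    and return the fraction of minority events that get inspected."""
    budget, detected = capacity, 0
    for s, label in zip(scores, y):
        if budget > 0 and s >= threshold_fn(budget, capacity):
            budget -= 1
            detected += int(label)
    return detected / max(int(y.sum()), 1)

static = lambda budget, capacity: 0.7                                     # fixed threshold
adaptive = lambda budget, capacity: 0.7 + 0.3 * (1 - budget / capacity)   # tightens as budget shrinks

times = simulate_arrivals()
y, scores = draw_labels_and_scores(len(times))
for name, rule in [("static", static), ("adaptive", adaptive)]:
    print(name, detection_rate(y, scores, capacity=20, threshold_fn=rule))
```

Under these assumptions the adaptive rule becomes more selective as the budget depletes, saving inspections for later high-scoring events; this is the kind of detection-rate versus inspection-capacity tradeoff the paper characterizes analytically.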
Related papers
- An Adversarial Approach to Evaluating the Robustness of Event Identification Models [12.862865254507179]
This paper considers a physics-based modal decomposition method to extract features for event classification.
The resulting classifiers are tested against an adversarial algorithm to evaluate their robustness.
arXiv Detail & Related papers (2024-02-19T18:11:37Z) - Explainable Fraud Detection with Deep Symbolic Classification [4.1205832766381985]
We present Deep Symbolic Classification, an extension of the Deep Symbolic Regression framework to classification problems.
Because the learned functions are concise, closed-form mathematical expressions, the model is inherently explainable both at the level of a single classification decision and at the level of its overall decision process.
An evaluation on the PaySim data set demonstrates competitive predictive performance with state-of-the-art models, while surpassing them in terms of explainability.
arXiv Detail & Related papers (2023-12-01T13:50:55Z) - Probabilistic Safety Regions Via Finite Families of Scalable Classifiers [2.431537995108158]
Supervised classification recognizes patterns in the data to separate classes of behaviours.
Canonical solutions contain misclassification errors that are intrinsic to the numerical approximating nature of machine learning.
We introduce the concept of probabilistic safety region to describe a subset of the input space in which the number of misclassified instances is probabilistically controlled.
arXiv Detail & Related papers (2023-09-08T22:40:19Z) - How adversarial attacks can disrupt seemingly stable accurate classifiers [76.95145661711514]
Adversarial attacks dramatically change the output of an otherwise accurate learning system using a seemingly inconsequential modification to a piece of input data.
Here, we show that this may be seen as a fundamental feature of classifiers working with high dimensional input data.
We introduce a simple generic and generalisable framework for which key behaviours observed in practical systems arise with high probability.
arXiv Detail & Related papers (2023-09-07T12:02:00Z) - On the Universal Adversarial Perturbations for Efficient Data-free Adversarial Detection [55.73320979733527]
We propose a data-agnostic adversarial detection framework that exploits the fact that normal and adversarial samples respond differently to UAPs.
Experimental results show that our method achieves competitive detection performance on various text classification tasks.
arXiv Detail & Related papers (2023-06-27T02:54:07Z) - Anomaly Detection using Ensemble Classification and Evidence Theory [62.997667081978825]
We present a novel approach for anomaly detection using ensemble classification and evidence theory.
A pool selection strategy is presented to build a solid ensemble classifier.
The resulting uncertainty estimates are then used for anomaly detection.
arXiv Detail & Related papers (2022-12-23T00:50:41Z) - Credit card fraud detection - Classifier selection strategy [0.0]
Using a sample of annotated transactions, a machine learning classification algorithm learns to detect frauds.
Fraud data sets are diverse and exhibit inconsistent characteristics.
We propose a data-driven classifier selection strategy for such highly imbalanced fraud detection data sets.
arXiv Detail & Related papers (2022-08-25T07:13:42Z) - Abuse and Fraud Detection in Streaming Services Using Heuristic-Aware Machine Learning [0.45880283710344055]
This work presents a fraud and abuse detection framework for streaming services by modeling user streaming behavior.
We study the use of semi-supervised as well as supervised approaches for anomaly detection.
To the best of our knowledge, this is the first paper to use machine learning methods for fraud and abuse detection in real-world scale streaming services.
arXiv Detail & Related papers (2022-03-04T03:57:58Z) - Tracking the risk of a deployed model and detecting harmful distribution shifts [105.27463615756733]
In practice, it may make sense to ignore benign shifts, under which the performance of a deployed model does not degrade substantially.
We argue that a sensible method for firing off a warning has to both (a) detect harmful shifts while ignoring benign ones, and (b) allow continuous monitoring of model performance without increasing the false alarm rate.
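As a loose illustration of requirement (a), the sketch below fires a warning only when the running error rate exceeds a tolerated risk level by a statistically credible margin. The pointwise Hoeffding-style bound and the tolerance are assumptions made for this sketch; a genuinely continuous monitoring scheme, as discussed in that paper, would need a time-uniform bound to keep the false alarm rate controlled.

```python
# Hypothetical sketch: warn only on credible (harmful) excess risk.
# The pointwise Hoeffding bound and tolerated_risk value are assumptions.
import math

def should_warn(errors, tolerated_risk=0.05, alpha=0.01):
    """errors: 0/1 losses of the deployed model on recently labelled samples."""
    n = len(errors)
    if n == 0:
        return False
    empirical_risk = sum(errors) / n
    margin = math.sqrt(math.log(1.0 / alpha) / (2.0 * n))  # one-sided Hoeffding margin
    # Benign shifts that leave risk within the tolerated level never fire.
    return empirical_risk - margin > tolerated_risk
```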
arXiv Detail & Related papers (2021-10-12T17:21:41Z) - Understanding Classifier Mistakes with Generative Models [88.20470690631372]
Deep neural networks are effective on supervised learning tasks, but have been shown to be brittle.
In this paper, we leverage generative models to identify and characterize instances where classifiers fail to generalize.
Our approach is agnostic to class labels from the training set which makes it applicable to models trained in a semi-supervised way.
arXiv Detail & Related papers (2020-10-05T22:13:21Z) - Certified Robustness to Label-Flipping Attacks via Randomized Smoothing [105.91827623768724]
Machine learning algorithms are susceptible to data poisoning attacks.
We present a unifying view of randomized smoothing over arbitrary functions.
We propose a new strategy for building classifiers that are pointwise-certifiably robust to general data poisoning attacks.
arXiv Detail & Related papers (2020-02-07T21:28:30Z)
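To make the smoothing idea in the last entry concrete, here is a simplified, hypothetical sketch of randomized smoothing against label flips: several copies of a base classifier are trained on randomly label-flipped data and their predictions are aggregated by majority vote. The flip probability, the number of samples, and the `train_and_predict` interface are assumptions; the paper's actual contribution is the pointwise certification of such smoothed classifiers, which this sketch does not perform.

```python
# Simplified illustration of randomized smoothing over training labels
# (binary 0/1 labels assumed); not the paper's certification procedure.
import numpy as np

def smoothed_predict(train_and_predict, X_train, y_train, x_test,
                     flip_prob=0.1, n_samples=50, seed=0):
    """Majority vote over classifiers trained on randomly label-flipped copies
    of the training set. `train_and_predict(X, y, x)` is an assumed callable
    that fits a base classifier on (X, y) and returns its 0/1 prediction at x.
    y_train is assumed to be a 0/1 NumPy array."""
    rng = np.random.default_rng(seed)
    votes = []
    for _ in range(n_samples):
        flips = rng.random(len(y_train)) < flip_prob
        y_noisy = np.where(flips, 1 - y_train, y_train)  # flip each label w.p. flip_prob
        votes.append(train_and_predict(X_train, y_noisy, x_test))
    return int(np.mean(votes) >= 0.5)
```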