Related papers: Multi-class Classifier based Failure Prediction with Artificial and Anonymous Training for Data Privacy

Multi-class Classifier based Failure Prediction with Artificial and Anonymous Training for Data Privacy

URL: http://arxiv.org/abs/2209.02275v1
Date: Tue, 6 Sep 2022 07:53:33 GMT
Title: Multi-class Classifier based Failure Prediction with Artificial and Anonymous Training for Data Privacy
Authors: Dibakar Das, Vikram Seshasai, Vineet Sudhir Bhat, Pushkal Juneja, Jyotsna Bapat, Debabrata Das
Abstract summary: A neural network based multi-class classifier is developed for failure prediction. The proposed mechanism completely decouples the data set used for training process from the actual data which is kept private. Results show high accuracy in failure prediction under different parameter configurations.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper proposes a novel non-intrusive system failure prediction technique using available information from developers and minimal information from raw logs (rather than mining entire logs) but keeping the data entirely private with the data owners. A neural network based multi-class classifier is developed for failure prediction, using artificially generated anonymous data set, applying a combination of techniques, viz., genetic algorithm (steps), pattern repetition, etc., to train and test the network. The proposed mechanism completely decouples the data set used for training process from the actual data which is kept private. Moreover, multi-criteria decision making (MCDM) schemes are used to prioritize failures meeting business requirements. Results show high accuracy in failure prediction under different parameter configurations. On a broader context, any classification problem, beyond failure prediction, can be performed using the proposed mechanism with artificially generated data set without looking into the actual data as long as the input features can be translated to binary values (e.g. output from private binary classifiers) and can provide classification-as-a-service.

Related papers

Benchmarking Fraud Detectors on Private Graph Data [70.4654745317714]
Currently, many types of fraud are managed in part by automated detection algorithms that operate over graphs.<n>We consider the scenario where a data holder wishes to outsource development of fraud detectors to third parties.<n>Third parties submit their fraud detectors to the data holder, who evaluates these algorithms on a private dataset and then publicly communicates the results.<n>We propose a realistic privacy attack on this system that allows an adversary to de-anonymize individuals' data based only on the evaluation results.
arXiv Detail & Related papers (2025-07-30T03:20:15Z)
Private Training & Data Generation by Clustering Embeddings [74.00687214400021]
Differential privacy (DP) provides a robust framework for protecting individual data.<n>We introduce a novel principled method for DP synthetic image embedding generation.<n> Empirically, a simple two-layer neural network trained on synthetically generated embeddings achieves state-of-the-art (SOTA) classification accuracy.
arXiv Detail & Related papers (2025-06-20T00:17:14Z)
Approaching Metaheuristic Deep Learning Combos for Automated Data Mining [0.5419570023862531]
This work proposes a means of combining meta-heuristic methods with conventional classifiers and neural networks in order to perform automated data mining. Experiments on the MNIST dataset for handwritten digit recognition were performed. It was empirically observed that using a ground truth labeled dataset's validation accuracy is inadequate for correcting labels of other previously unseen data instances.
arXiv Detail & Related papers (2024-10-16T10:28:22Z)
XAL: EXplainable Active Learning Makes Classifiers Better Low-resource Learners [71.8257151788923]
We propose a novel Explainable Active Learning framework (XAL) for low-resource text classification. XAL encourages classifiers to justify their inferences and delve into unlabeled data for which they cannot provide reasonable explanations. Experiments on six datasets show that XAL achieves consistent improvement over 9 strong baselines.
arXiv Detail & Related papers (2023-10-09T08:07:04Z)
Differentially Private Linear Regression with Linked Data [3.9325957466009203]
Differential privacy, a mathematical notion from computer science, is a rising tool offering robust privacy guarantees. Recent work focuses on developing differentially private versions of individual statistical and machine learning tasks. We present two differentially private algorithms for linear regression with linked data.
arXiv Detail & Related papers (2023-08-01T21:00:19Z)
PEOPL: Characterizing Privately Encoded Open Datasets with Public Labels [59.66777287810985]
We introduce information-theoretic scores for privacy and utility, which quantify the average performance of an unfaithful user. We then theoretically characterize primitives in building families of encoding schemes that motivate the use of random deep neural networks.
arXiv Detail & Related papers (2023-03-31T18:03:53Z)
Mutual Information Learned Classifiers: an Information-theoretic Viewpoint of Training Deep Learning Classification Systems [9.660129425150926]
Cross entropy loss can easily lead us to find models which demonstrate severe overfitting behavior. In this paper, we prove that the existing cross entropy loss minimization for training DNN classifiers essentially learns the conditional entropy of the underlying data distribution. We propose a mutual information learning framework where we train DNN classifiers via learning the mutual information between the label and input.
arXiv Detail & Related papers (2022-10-03T15:09:19Z)
Few-Shot Non-Parametric Learning with Deep Latent Variable Model [50.746273235463754]
We propose Non-Parametric learning by Compression with Latent Variables (NPC-LV) NPC-LV is a learning framework for any dataset with abundant unlabeled data but very few labeled ones. We show that NPC-LV outperforms supervised methods on all three datasets on image classification in low data regime.
arXiv Detail & Related papers (2022-06-23T09:35:03Z)
Uncertainty-Autoencoder-Based Privacy and Utility Preserving Data Type Conscious Transformation [3.7315964084413173]
We propose an adversarial learning framework that deals with the privacy-utility tradeoff problem under two conditions. Under data-type ignorant conditions, the privacy mechanism provides a one-hot encoding of categorical features, representing exactly one class. Under data-type aware conditions, the categorical variables are represented by a collection of scores, one for each class.
arXiv Detail & Related papers (2022-05-04T08:40:15Z)
Out-distribution aware Self-training in an Open World Setting [62.19882458285749]
We leverage unlabeled data in an open world setting to further improve prediction performance. We introduce out-distribution aware self-training, which includes a careful sample selection strategy. Our classifiers are by design out-distribution aware and can thus distinguish task-related inputs from unrelated ones.
arXiv Detail & Related papers (2020-12-21T12:25:04Z)
Graph-Homomorphic Perturbations for Private Decentralized Learning [64.26238893241322]
Local exchange of estimates allows inference of data based on private data. perturbations chosen independently at every agent, resulting in a significant performance loss. We propose an alternative scheme, which constructs perturbations according to a particular nullspace condition, allowing them to be invisible.
arXiv Detail & Related papers (2020-10-23T10:35:35Z)
Solving Long-tailed Recognition with Deep Realistic Taxonomic Classifier [68.38233199030908]
Long-tail recognition tackles the natural non-uniformly distributed data in realworld scenarios. While moderns perform well on populated classes, its performance degrades significantly on tail classes. Deep-RTC is proposed as a new solution to the long-tail problem, combining realism with hierarchical predictions.
arXiv Detail & Related papers (2020-07-20T05:57:42Z)

This list is automatically generated from the titles and abstracts of the papers in this site.