Multi-class Classifier based Failure Prediction with Artificial and
Anonymous Training for Data Privacy
- URL: http://arxiv.org/abs/2209.02275v1
- Date: Tue, 6 Sep 2022 07:53:33 GMT
- Title: Multi-class Classifier based Failure Prediction with Artificial and
Anonymous Training for Data Privacy
- Authors: Dibakar Das, Vikram Seshasai, Vineet Sudhir Bhat, Pushkal Juneja,
Jyotsna Bapat, Debabrata Das
- Abstract summary: A neural network based multi-class classifier is developed for failure prediction.
The proposed mechanism completely decouples the data set used for the training process from the actual data, which is kept private.
Results show high accuracy in failure prediction under different parameter configurations.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes a novel non-intrusive system failure prediction technique
that uses available information from developers and minimal information from raw
logs (rather than mining entire logs) while keeping the data entirely private
with the data owners. A neural network based multi-class classifier is
developed for failure prediction, using an artificially generated anonymous data
set and applying a combination of techniques, viz., genetic algorithm (steps),
pattern repetition, etc., to train and test the network. The proposed mechanism
completely decouples the data set used for the training process from the actual
data, which is kept private. Moreover, multi-criteria decision making (MCDM)
schemes are used to prioritize failures meeting business requirements. Results
show high accuracy in failure prediction under different parameter
configurations. In a broader context, any classification problem, beyond
failure prediction, can be performed using the proposed mechanism with an
artificially generated data set without looking into the actual data, as long as
the input features can be translated to binary values (e.g. output from private
binary classifiers), and the mechanism can thus provide classification-as-a-service.
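The abstract does not spell out how the artificial data set is built, so the following is only a minimal sketch of one way binary training samples could be generated without touching real logs. It combines pattern repetition with genetic-algorithm-style mutation and crossover. Every name and parameter here (`generate_dataset`, `mutation_rate`, the number of classes and features) is a hypothetical choice for illustration, not the authors' procedure.

```python
import random

def make_prototypes(n_classes, n_features, rng):
    """Draw one distinct random binary pattern per failure class (assumed setup)."""
    seen = set()
    while len(seen) < n_classes:  # requires 2**n_features >= n_classes
        seen.add(tuple(rng.randint(0, 1) for _ in range(n_features)))
    return sorted(seen)

def mutate(pattern, rate, rng):
    """GA-style mutation: flip each bit independently with probability `rate`."""
    return tuple(bit ^ 1 if rng.random() < rate else bit for bit in pattern)

def crossover(a, b, rng):
    """GA-style single-point crossover of two equal-length patterns."""
    point = rng.randint(1, len(a) - 1)  # needs at least two features
    return a[:point] + b[point:]

def generate_dataset(n_classes=4, n_features=8, per_class=20, repeats=3,
                     mutation_rate=0.05, seed=0):
    """Return (samples, labels) built purely from synthetic binary patterns."""
    rng = random.Random(seed)
    prototypes = make_prototypes(n_classes, n_features, rng)
    samples, labels = [], []
    for label, proto in enumerate(prototypes):
        # Pattern repetition: the clean prototype appears several times.
        for _ in range(repeats):
            samples.append(proto)
            labels.append(label)
        # GA steps: recombine two mutated copies of the prototype.
        for _ in range(per_class - repeats):
            child = crossover(mutate(proto, mutation_rate, rng),
                              mutate(proto, mutation_rate, rng), rng)
            samples.append(child)
            labels.append(label)
    return samples, labels

samples, labels = generate_dataset()
```

The resulting `samples` and `labels` could then be fed to any multi-class classifier. In a deployment of the kind the abstract describes, the data owner would map private observations to the same binary feature layout before querying the trained model.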
Related papers
- Approaching Metaheuristic Deep Learning Combos for Automated Data Mining [0.5419570023862531]
This work proposes a means of combining meta-heuristic methods with conventional classifiers and neural networks in order to perform automated data mining.
Experiments on the MNIST dataset for handwritten digit recognition were performed.
It was empirically observed that using a ground truth labeled dataset's validation accuracy is inadequate for correcting labels of other previously unseen data instances.
arXiv Detail & Related papers (2024-10-16T10:28:22Z)
- XAL: EXplainable Active Learning Makes Classifiers Better Low-resource Learners [71.8257151788923]
We propose a novel Explainable Active Learning framework (XAL) for low-resource text classification.
XAL encourages classifiers to justify their inferences and delve into unlabeled data for which they cannot provide reasonable explanations.
Experiments on six datasets show that XAL achieves consistent improvement over 9 strong baselines.
arXiv Detail & Related papers (2023-10-09T08:07:04Z)
- Differentially Private Linear Regression with Linked Data [3.9325957466009203]
Differential privacy, a mathematical notion from computer science, is a rising tool offering robust privacy guarantees.
Recent work focuses on developing differentially private versions of individual statistical and machine learning tasks.
We present two differentially private algorithms for linear regression with linked data.
arXiv Detail & Related papers (2023-08-01T21:00:19Z)
- PEOPL: Characterizing Privately Encoded Open Datasets with Public Labels [59.66777287810985]
We introduce information-theoretic scores for privacy and utility, which quantify the average performance of an unfaithful user.
We then theoretically characterize primitives in building families of encoding schemes that motivate the use of random deep neural networks.
arXiv Detail & Related papers (2023-03-31T18:03:53Z)
- Mutual Information Learned Classifiers: an Information-theoretic
Viewpoint of Training Deep Learning Classification Systems [9.660129425150926]
Cross entropy loss can easily lead us to find models which demonstrate severe overfitting behavior.
In this paper, we prove that the existing cross entropy loss minimization for training DNN classifiers essentially learns the conditional entropy of the underlying data distribution.
We propose a mutual information learning framework where we train DNN classifiers via learning the mutual information between the label and input.
arXiv Detail & Related papers (2022-10-03T15:09:19Z)
- Few-Shot Non-Parametric Learning with Deep Latent Variable Model [50.746273235463754]
We propose Non-Parametric learning by Compression with Latent Variables (NPC-LV).
NPC-LV is a learning framework for any dataset with abundant unlabeled data but very few labeled ones.
We show that NPC-LV outperforms supervised methods on all three image classification datasets in the low-data regime.
arXiv Detail & Related papers (2022-06-23T09:35:03Z)
- Uncertainty-Autoencoder-Based Privacy and Utility Preserving Data Type
Conscious Transformation [3.7315964084413173]
We propose an adversarial learning framework that deals with the privacy-utility tradeoff problem under two conditions.
Under data-type ignorant conditions, the privacy mechanism provides a one-hot encoding of categorical features, representing exactly one class.
Under data-type aware conditions, the categorical variables are represented by a collection of scores, one for each class.
arXiv Detail & Related papers (2022-05-04T08:40:15Z)
- Out-distribution aware Self-training in an Open World Setting [62.19882458285749]
We leverage unlabeled data in an open world setting to further improve prediction performance.
We introduce out-distribution aware self-training, which includes a careful sample selection strategy.
Our classifiers are by design out-distribution aware and can thus distinguish task-related inputs from unrelated ones.
arXiv Detail & Related papers (2020-12-21T12:25:04Z)
- Graph-Homomorphic Perturbations for Private Decentralized Learning [64.26238893241322]
The exchange of local estimates, which are generated from private data, can allow inference of the data itself.
Perturbations are generally chosen independently at every agent, resulting in a significant performance loss.
We propose an alternative scheme, which constructs perturbations according to a particular nullspace condition, allowing them to be invisible.
arXiv Detail & Related papers (2020-10-23T10:35:35Z)
- Solving Long-tailed Recognition with Deep Realistic Taxonomic Classifier [68.38233199030908]
Long-tail recognition tackles the naturally non-uniformly distributed data in real-world scenarios.
While modern classifiers perform well on populated classes, their performance degrades significantly on tail classes.
Deep-RTC is proposed as a new solution to the long-tail problem, combining realism with hierarchical predictions.
arXiv Detail & Related papers (2020-07-20T05:57:42Z)