Ransomware detection using stacked autoencoder for feature selection
- URL: http://arxiv.org/abs/2402.11342v1
- Date: Sat, 17 Feb 2024 17:31:48 GMT
- Title: Ransomware detection using stacked autoencoder for feature selection
- Authors: Mike Nkongolo and Mahmut Tokmak
- Abstract summary: The study meticulously analyzes the autoencoder's learned weights and activations to identify essential features for distinguishing ransomware families from other malware.
The proposed model achieves an exceptional 99% accuracy in ransomware classification, surpassing the Extreme Gradient Boosting (XGBoost) algorithm.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The aim of this study is to propose and evaluate an advanced ransomware
detection and classification method that combines a Stacked Autoencoder (SAE)
for precise feature selection with a Long Short Term Memory (LSTM) classifier
to enhance ransomware stratification accuracy. The proposed approach involves
thorough pre processing of the UGRansome dataset and training an unsupervised
SAE for optimal feature selection or fine tuning via supervised learning to
elevate the LSTM model's classification capabilities. The study meticulously
analyzes the autoencoder's learned weights and activations to identify
essential features for distinguishing ransomware families from other malware
and creates a streamlined feature set for precise classification. Extensive
experiments, including up to 400 epochs and varying learning rates, are
conducted to optimize the model's performance. The results demonstrate the
outstanding performance of the SAE-LSTM model across all ransomware families,
boasting high precision, recall, and F1 score values that underscore its robust
classification capabilities. Furthermore, balanced average scores affirm the
proposed model's ability to generalize effectively across various malware
types. The proposed model achieves an exceptional 99% accuracy in ransomware
classification, surpassing the Extreme Gradient Boosting (XGBoost) algorithm
primarily due to its effective SAE feature selection mechanism. The model also
demonstrates outstanding performance in identifying signature attacks,
achieving a 98% accuracy rate.
Related papers
- Self-DenseMobileNet: A Robust Framework for Lung Nodule Classification using Self-ONN and Stacking-based Meta-Classifier [1.2300841481611335]
Self-DenseMobileNet is designed to enhance the classification of nodules and non-nodules in chest radiographs (CXRs)
Our framework integrates advanced image standardization and enhancement techniques to optimize the input quality.
When tested on an external dataset, the framework maintained strong generalizability with an accuracy of 89.40%.
arXiv Detail & Related papers (2024-10-16T14:04:06Z) - Confidence-aware Contrastive Learning for Selective Classification [20.573658672018066]
This work provides a generalization bound for selective classification, disclosing that optimizing feature layers helps improve the performance of selective classification.
Inspired by this theory, we propose to explicitly improve the selective classification model at the feature level for the first time, leading to a novel Confidence-aware Contrastive Learning method for Selective Classification, CCL-SC.
arXiv Detail & Related papers (2024-06-07T08:43:53Z) - A comparative study on feature selection for a risk prediction model for
colorectal cancer [0.0]
This work is focused on colorectal cancer, assessing several feature ranking algorithms in terms of performance for a set of risk prediction models.
A visual approach proposed in this work allows to see that the Neural Network-based wrapper ranking is the most unstable while the Random Forest is the most stable.
arXiv Detail & Related papers (2024-02-07T22:14:14Z) - Embedded feature selection in LSTM networks with multi-objective
evolutionary ensemble learning for time series forecasting [49.1574468325115]
We present a novel feature selection method embedded in Long Short-Term Memory networks.
Our approach optimize the weights and biases of the LSTM in a partitioned manner.
Experimental evaluations on air quality time series data from Italy and southeast Spain demonstrate that our method substantially improves the ability generalization of conventional LSTMs.
arXiv Detail & Related papers (2023-12-29T08:42:10Z) - Stacking an autoencoder for feature selection of zero-day threats [0.0]
This study explores the application of stacked autoencoder (SAE), a type of artificial neural network, for feature selection and zero-day threat classification.
The learned weights and activations of the autoencoder are analyzed to identify the most important features for discriminating between zero-day threats and normal system behavior.
The results indicate that the SAE-LSTM performs well across all three attack categories by showcasing high precision, recall, and F1 score values.
arXiv Detail & Related papers (2023-11-01T05:29:42Z) - An Evaluation of Machine Learning Approaches for Early Diagnosis of
Autism Spectrum Disorder [0.0]
Autistic Spectrum Disorder (ASD) is a neurological disease characterized by difficulties with social interaction, communication, and repetitive activities.
This study employs diverse machine learning methods to identify crucial ASD traits, aiming to enhance and automate the diagnostic process.
arXiv Detail & Related papers (2023-09-20T21:23:37Z) - Characterizing the Optimal 0-1 Loss for Multi-class Classification with
a Test-time Attacker [57.49330031751386]
We find achievable information-theoretic lower bounds on loss in the presence of a test-time attacker for multi-class classifiers on any discrete dataset.
We provide a general framework for finding the optimal 0-1 loss that revolves around the construction of a conflict hypergraph from the data and adversarial constraints.
arXiv Detail & Related papers (2023-02-21T15:17:13Z) - Compactness Score: A Fast Filter Method for Unsupervised Feature
Selection [66.84571085643928]
We propose a fast unsupervised feature selection method, named as, Compactness Score (CSUFS) to select desired features.
Our proposed algorithm seems to be more accurate and efficient compared with existing algorithms.
arXiv Detail & Related papers (2022-01-31T13:01:37Z) - Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence, predicting accuracy as the fraction of unlabeled examples.
arXiv Detail & Related papers (2022-01-11T23:01:12Z) - Adversarial Feature Augmentation and Normalization for Visual
Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z) - Meta-learning One-class Classifiers with Eigenvalue Solvers for
Supervised Anomaly Detection [55.888835686183995]
We propose a neural network-based meta-learning method for supervised anomaly detection.
We experimentally demonstrate that the proposed method achieves better performance than existing anomaly detection and few-shot learning methods.
arXiv Detail & Related papers (2021-03-01T01:43:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.