Interpretable Machine Learning for Detection and Classification of
Ransomware Families Based on API Calls
- URL: http://arxiv.org/abs/2210.11235v1
- Date: Sun, 16 Oct 2022 15:54:45 GMT
- Title: Interpretable Machine Learning for Detection and Classification of
Ransomware Families Based on API Calls
- Authors: Rawshan Ara Mowri, Madhuri Siddula, Kaushik Roy
- Abstract summary: This research work utilizes the frequencies of different API calls to detect and classify ransomware families.
A WebCrawler is developed to automate collecting the Windows Portable Executable (PE) files of 15 different ransomware families.
Logistic Regression can efficiently classify ransomware into their corresponding families, securing 99.15% accuracy.
- Score: 5.340730281227837
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Ransomware has appeared as one of the major global threats in recent
days. The alarming increasing rate of ransomware attacks and new ransomware
variants intrigue the researchers to constantly examine the distinguishing
traits of ransomware and refine their detection strategies. Application
Programming Interface (API) is a way for one program to collaborate with
another; API calls are the medium by which they communicate. Ransomware uses
this strategy to interact with the OS and makes a significantly higher number
of calls in different sequences to ask for taking action. This research work
utilizes the frequencies of different API calls to detect and classify
ransomware families. First, a WebCrawler is developed to automate collecting
the Windows Portable Executable (PE) files of 15 different ransomware families.
By extracting different frequencies of 68 API calls, we develop our dataset in
the first phase of the two-phase feature engineering process. After selecting
the most significant features in the second phase of the feature engineering
process, we deploy six Supervised Machine Learning models: Naive Bayes,
Logistic Regression, Random Forest, Stochastic Gradient Descent, K-Nearest
Neighbor, and Support Vector Machine. Then, the performances of all the
classifiers are compared to select the best model. The results reveal that
Logistic Regression can efficiently classify ransomware into their
corresponding families, securing 99.15% accuracy. Finally, instead of relying
on the Black box characteristic of the Machine Learning models, we present the
interpretability of our best-performing model using SHAP values to ascertain
the transparency and trustworthiness of the model's prediction.
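
The first feature engineering phase described in the abstract, turning each sample's API calls into frequency features, can be sketched roughly as follows. This is only an illustration and not the authors' code: the four-entry vocabulary `API_VOCAB`, the helper `api_frequency_vector`, and the example trace are hypothetical stand-ins (the paper uses 68 API calls, which are not listed on this page).

```python
from collections import Counter

# Hypothetical vocabulary for illustration; the paper tracks 68 API calls
# that are not enumerated in this summary.
API_VOCAB = ["CreateFileW", "WriteFile", "CryptEncrypt", "RegSetValueExW"]


def api_frequency_vector(trace, vocab=API_VOCAB):
    """Count how often each vocabulary API call occurs in one sample's trace.

    Calls that are not in the vocabulary are ignored, and vocabulary
    entries that never occur get a count of 0.
    """
    counts = Counter(trace)
    return [counts[name] for name in vocab]


# Made-up API call trace for a single sample (assumption, not real data).
trace = ["CreateFileW", "CryptEncrypt", "WriteFile", "CryptEncrypt", "CloseHandle"]
features = api_frequency_vector(trace)
print(features)  # [1, 1, 2, 0]
```

One such vector per PE file, stacked into a matrix and paired with family labels, would be the kind of input a classifier such as Logistic Regression is trained on.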
Related papers
- Zero Day Ransomware Detection with Pulse: Function Classification with Transformer Models and Assembly Language [1.870031206586792]
Peekaboo, a Dynamic Binary Instrumentation tool defeats evasive malware to capture its genuine behavior.
We propose Pulse, a novel framework for zero day ransomware detection with Transformer models and Assembly language.
arXiv Detail & Related papers (2024-08-15T00:22:32Z)
- Zero-day attack and ransomware detection [0.0]
This study uses the UGRansome dataset to train various Machine Learning models for zero-day and ransomware attacks detection.
The finding demonstrates that Random Forest (RFC), XGBoost, and Ensemble Methods achieved perfect scores in accuracy, precision, recall, and F1-score.
arXiv Detail & Related papers (2024-08-08T02:23:42Z)
- Few-Shot API Attack Detection: Overcoming Data Scarcity with GAN-Inspired Learning [9.035212370386846]
This paper proposes a novel few-shot detection approach motivated by Natural Language Processing (NLP) and advanced Generative Adversarial Network (GAN)-inspired techniques.
Our method enhances the contextual understanding of API requests, leading to improved anomaly detection compared to traditional methods.
arXiv Detail & Related papers (2024-05-18T11:10:45Z)
- Robust Wake-Up Word Detection by Two-stage Multi-resolution Ensembles [48.208214762257136]
It employs two models: a lightweight on-device model for real-time processing of the audio stream and a verification model on the server-side.
To protect privacy, audio features are sent to the cloud instead of raw audio.
arXiv Detail & Related papers (2023-10-17T16:22:18Z)
- Domain Adaptive Synapse Detection with Weak Point Annotations [63.97144211520869]
We present AdaSyn, a framework for domain adaptive synapse detection with weak point annotations.
In the WASPSYN challenge at ISBI 2023, our method ranks 1st.
arXiv Detail & Related papers (2023-08-31T05:05:53Z)
- Behavioural Reports of Multi-Stage Malware [3.64414368529873]
This dataset provides API call sequences for thousands of malware samples executed in Windows 10 virtual machines.
A tutorial on how to create and expand this dataset is provided along with a benchmark demonstrating how to use this dataset to classify malware.
arXiv Detail & Related papers (2023-01-30T11:51:02Z)
- Effective Metaheuristic Based Classifiers for Multiclass Intrusion Detection [0.0]
Intrusion detection plays an important role in the security of information systems and network devices.
Having a large amount of data is one of the key problems in detecting attacks.
A feature selection method plays a key role in selecting the best features to achieve maximum accuracy.
arXiv Detail & Related papers (2022-10-06T04:56:01Z)
- Anomaly Detection in Cybersecurity: Unsupervised, Graph-Based and Supervised Learning Methods in Adversarial Environments [63.942632088208505]
Inherent to today's operating environment is the practice of adversarial machine learning.
In this work, we examine the feasibility of unsupervised learning and graph-based methods for anomaly detection.
We incorporate a realistic adversarial training mechanism when training our supervised models to enable strong classification performance in adversarial environments.
arXiv Detail & Related papers (2021-05-14T10:05:10Z)
- Generalizing Face Forgery Detection with High-frequency Features [63.33397573649408]
Current CNN-based detectors tend to overfit to method-specific color textures and thus fail to generalize.
We propose to utilize the high-frequency noises for face forgery detection.
The first is the multi-scale high-frequency feature extraction module that extracts high-frequency noises at multiple scales.
The second is the residual-guided spatial attention module that guides the low-level RGB feature extractor to concentrate more on forgery traces from a new perspective.
arXiv Detail & Related papers (2021-03-23T08:19:21Z)
- Discriminative Nearest Neighbor Few-Shot Intent Detection by Transferring Natural Language Inference [150.07326223077405]
Few-shot learning is attracting much attention to mitigate data scarcity.
We present a discriminative nearest neighbor classification with deep self-attention.
We propose to boost the discriminative ability by transferring a natural language inference (NLI) model.
arXiv Detail & Related papers (2020-10-25T00:39:32Z)
- Adversarial EXEmples: A Survey and Experimental Evaluation of Practical Attacks on Machine Learning for Windows Malware Detection [67.53296659361598]
Adversarial EXEmples can bypass machine learning-based detection by perturbing relatively few input bytes.
We develop a unifying framework that does not only encompass and generalize previous attacks against machine-learning models, but also includes three novel attacks.
These attacks, named Full DOS, Extend and Shift, inject the adversarial payload by respectively manipulating the DOS header, extending it, and shifting the content of the first section.
arXiv Detail & Related papers (2020-08-17T07:16:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.