Related papers: Interpretable Machine Learning for Detection and Classification of Ransomware Families Based on API Calls

URL: http://arxiv.org/abs/2210.11235v1
Date: Sun, 16 Oct 2022 15:54:45 GMT
Title: Interpretable Machine Learning for Detection and Classification of Ransomware Families Based on API Calls
Authors: Rawshan Ara Mowri, Madhuri Siddula, Kaushik Roy
Abstract summary: This research work utilizes the frequencies of different API calls to detect and classify ransomware families. A WebCrawler is developed to automate collecting the Windows Portable Executable (PE) files of 15 different ransomware families. Logistic Regression can efficiently classify ransomware into their corresponding families, securing 99.15% accuracy.
Score: 5.340730281227837
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Ransomware has appeared as one of the major global threats in recent days. The alarming increasing rate of ransomware attacks and new ransomware variants intrigue the researchers to constantly examine the distinguishing traits of ransomware and refine their detection strategies. Application Programming Interface (API) is a way for one program to collaborate with another; API calls are the medium by which they communicate. Ransomware uses this strategy to interact with the OS and makes a significantly higher number of calls in different sequences to ask for taking action. This research work utilizes the frequencies of different API calls to detect and classify ransomware families. First, a WebCrawler is developed to automate collecting the Windows Portable Executable (PE) files of 15 different ransomware families. By extracting different frequencies of 68 API calls, we develop our dataset in the first phase of the two-phase feature engineering process. After selecting the most significant features in the second phase of the feature engineering process, we deploy six Supervised Machine Learning models: Naive Bayes, Logistic Regression, Random Forest, Stochastic Gradient Descent, K-Nearest Neighbor, and Support Vector Machine. Then, the performances of all the classifiers are compared to select the best model. The results reveal that Logistic Regression can efficiently classify ransomware into their corresponding families, securing 99.15% accuracy. Finally, instead of relying on the "Black box" characteristic of the Machine Learning models, we present the interpretability of our best-performing model using SHAP values to ascertain the transparency and trustworthiness of the model's prediction.
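
The two ideas the abstract leans on, API call frequency features and additive per-feature explanations of a linear classifier, can be sketched in a few lines of plain Python. This is an illustration only, not the authors' code. The five API names, the traces, and the weights are made up for the example (the paper's 68 API calls are not listed here). The attribution uses the closed form w_i * (x_i - mean_i), which matches SHAP values for a linear model only under the assumption that features are independent.

```python
from collections import Counter

# Hypothetical vocabulary for illustration; the paper's 68 API calls are not listed here.
API_VOCAB = ["CreateFileW", "ReadFile", "WriteFile", "CryptEncrypt", "DeleteFileW"]


def api_call_frequencies(trace, vocab=API_VOCAB):
    """Turn one sample's API call trace into a fixed-length count vector."""
    counts = Counter(trace)
    return [counts[name] for name in vocab]


def linear_shap_values(weights, x, background):
    """Per-feature contributions w_i * (x_i - mean_i) for a linear model.

    Assumes independent features; `background` is a list of feature vectors
    used to estimate each feature's mean.
    """
    n = len(background)
    means = [sum(row[i] for row in background) / n for i in range(len(weights))]
    return [w * (xi - m) for w, xi, m in zip(weights, x, means)]


# Made-up traces standing in for calls recorded from two PE samples.
traces = {
    "sample_a": ["CreateFileW", "ReadFile", "CryptEncrypt", "WriteFile", "CryptEncrypt"],
    "sample_b": ["CreateFileW", "ReadFile", "ReadFile"],
}
features = {name: api_call_frequencies(trace) for name, trace in traces.items()}

# Made-up coefficients of a linear score (e.g. the logit for one family).
weights = [0.1, -0.2, 0.3, 1.5, 0.8]
background = list(features.values())
contributions = linear_shap_values(weights, features["sample_a"], background)

for name, value in zip(API_VOCAB, contributions):
    print(f"{name}: {value:+.2f}")
```

The contributions sum to the difference between the sample's linear score and the average score over the background set, which is the additivity property that makes such explanations easy to read per feature.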

Related papers

Every Step Counts: Decoding Trajectories as Authorship Fingerprints of dLLMs [63.82840470917859]
We show that the decoding mechanism of dLLMs can be used as a powerful tool for model attribution. We propose a novel information extraction scheme called the Directed Decoding Map (DDM), which captures structural relationships between decoding steps and better reveals model-specific behaviors.
arXiv Detail & Related papers (2025-10-02T06:25:10Z)
MLRan: A Behavioural Dataset for Ransomware Analysis and Detection [0.7706236363202722]
We introduce MLRan, a behavioural ransomware dataset, comprising over 4,800 samples across 64 ransomware families and a balanced set of goodware samples. The samples span from 2006 to 2024 and encompass the four major types of ransomware: locker, crypto, ransomware-as-a-service, and modern variants. We evaluated the ransomware detection performance of several machine learning (ML) models using MLRan.
arXiv Detail & Related papers (2025-05-24T09:22:53Z)
A Sysmon Incremental Learning System for Ransomware Analysis and Detection [1.495391051525033]
In the face of increasing cyber threats, particularly ransomware attacks, there is a pressing need for advanced detection and analysis systems. Most of these proposals leverage non-incremental learning approaches that require the underlying models to be updated from scratch to detect new ransomware. This approach is problematic because it leaves sensitive data vulnerable to attack during retraining, as newly emerging ransomware strains may go undetected until the model is updated. We present the Sysmon Incremental Learning System for Analysis and Detection (SILRAD), which enables continuous updates to the underlying model and effectively closes the training gap.
arXiv Detail & Related papers (2025-01-02T06:22:58Z)
Zero Day Ransomware Detection with Pulse: Function Classification with Transformer Models and Assembly Language [1.870031206586792]
Peekaboo, a Dynamic Binary Instrumentation tool, defeats evasive malware to capture its genuine behavior. We propose Pulse, a novel framework for zero day ransomware detection with Transformer models and Assembly language.
arXiv Detail & Related papers (2024-08-15T00:22:32Z)
Zero-day attack and ransomware detection [0.0]
This study uses the UGRansome dataset to train various Machine Learning models for zero-day and ransomware attack detection. The findings demonstrate that Random Forest (RFC), XGBoost, and Ensemble Methods achieved perfect scores in accuracy, precision, recall, and F1-score.
arXiv Detail & Related papers (2024-08-08T02:23:42Z)
Few-Shot API Attack Detection: Overcoming Data Scarcity with GAN-Inspired Learning [9.035212370386846]
This paper proposes a novel few-shot detection approach motivated by Natural Language Processing (NLP) and advanced Generative Adversarial Network (GAN)-inspired techniques. Our method enhances the contextual understanding of API requests, leading to improved anomaly detection compared to traditional methods.
arXiv Detail & Related papers (2024-05-18T11:10:45Z)
Robust Wake-Up Word Detection by Two-stage Multi-resolution Ensembles [48.208214762257136]
It employs two models: a lightweight on-device model for real-time processing of the audio stream and a verification model on the server-side. To protect privacy, audio features are sent to the cloud instead of raw audio.
arXiv Detail & Related papers (2023-10-17T16:22:18Z)
Domain Adaptive Synapse Detection with Weak Point Annotations [63.97144211520869]
We present AdaSyn, a framework for domain adaptive synapse detection with weak point annotations. In the WASPSYN challenge at ISBI 2023, our method ranks 1st.
arXiv Detail & Related papers (2023-08-31T05:05:53Z)
Behavioural Reports of Multi-Stage Malware [3.64414368529873]
This dataset provides API call sequences for thousands of malware samples executed in Windows 10 virtual machines. A tutorial on how to create and expand this dataset is provided along with a benchmark demonstrating how to use this dataset to classify malware.
arXiv Detail & Related papers (2023-01-30T11:51:02Z)
Effective Metaheuristic Based Classifiers for Multiclass Intrusion Detection [0.0]
Intrusion detection plays an important role in the security of information systems or network devices. Having a large amount of data is one of the key problems in detecting attacks. A feature selection method plays a key role in selecting the best features to achieve maximum accuracy.
arXiv Detail & Related papers (2022-10-06T04:56:01Z)
Anomaly Detection in Cybersecurity: Unsupervised, Graph-Based and Supervised Learning Methods in Adversarial Environments [63.942632088208505]
Inherent to today's operating environment is the practice of adversarial machine learning. In this work, we examine the feasibility of unsupervised learning and graph-based methods for anomaly detection. We incorporate a realistic adversarial training mechanism when training our supervised models to enable strong classification performance in adversarial environments.
arXiv Detail & Related papers (2021-05-14T10:05:10Z)
Generalizing Face Forgery Detection with High-frequency Features [63.33397573649408]
Current CNN-based detectors tend to overfit to method-specific color textures and thus fail to generalize. We propose to utilize the high-frequency noises for face forgery detection. The first is the multi-scale high-frequency feature extraction module that extracts high-frequency noises at multiple scales. The second is the residual-guided spatial attention module that guides the low-level RGB feature extractor to concentrate more on forgery traces from a new perspective.
arXiv Detail & Related papers (2021-03-23T08:19:21Z)
Discriminative Nearest Neighbor Few-Shot Intent Detection by Transferring Natural Language Inference [150.07326223077405]
Few-shot learning is attracting much attention to mitigate data scarcity. We present a discriminative nearest neighbor classification with deep self-attention. We propose to boost the discriminative ability by transferring a natural language inference (NLI) model.
arXiv Detail & Related papers (2020-10-25T00:39:32Z)
Adversarial EXEmples: A Survey and Experimental Evaluation of Practical Attacks on Machine Learning for Windows Malware Detection [67.53296659361598]
Adversarial EXEmples can bypass machine learning-based detection by perturbing relatively few input bytes. We develop a unifying framework that not only encompasses and generalizes previous attacks against machine-learning models, but also includes three novel attacks. These attacks, named Full DOS, Extend and Shift, inject the adversarial payload by respectively manipulating the DOS header, extending it, and shifting the content of the first section.
arXiv Detail & Related papers (2020-08-17T07:16:57Z)

This list is automatically generated from the titles and abstracts of the papers on this site.