Related papers: Android Malware Detection using Feature Ranking of Permissions

URL: http://arxiv.org/abs/2201.08468v1
Date: Thu, 20 Jan 2022 22:08:20 GMT
Title: Android Malware Detection using Feature Ranking of Permissions
Authors: Muhammad Suleman Saleem, Jelena Mišić, and Vojislav B. Mišić
Abstract summary: We use Android permissions as a vehicle to allow for quick and effective differentiation between benign and malware apps. Our analysis indicates that this approach can result in better accuracy and F-score value than other reported approaches.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We investigate the use of Android permissions as the vehicle to allow for quick and effective differentiation between benign and malware apps. To this end, we extract all Android permissions, eliminating those that have zero impact, and apply two feature ranking algorithms, namely the Chi-Square test and Fisher's Exact test, to rank and additionally filter them, resulting in a comparatively small set of relevant permissions. Then we use Decision Tree, Support Vector Machine, and Random Forest Classifier algorithms to detect malware apps. Our analysis indicates that this approach can result in better accuracy and F-score value than other reported approaches. In particular, when random forest is used as the classifier in combination with Fisher's Exact test, we achieve 99.34% in accuracy and 92.17% in F-score with a false positive rate of 0.56% for the dataset in question, with results improving to 99.82% in accuracy and 95.28% in F-score with a false positive rate as low as 0.05% when only malware from the three most popular malware families is considered.
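
Illustrative sketch (not the authors' code): the ranking step described in the abstract can be approximated with the standard library alone. The example below assumes binary permission features, treats a permission as "zero impact" when it has the same value in every app, and ranks the remaining permissions by the two-sided Fisher's Exact test p-value, reporting the Chi-Square statistic alongside. The permission names, the toy data, and the helper names (`contingency`, `chi_square`, `fisher_exact_p`, `rank_permissions`) are all hypothetical; the classification step (Decision Tree, SVM, Random Forest) is not covered.

```python
from math import comb


def contingency(column, labels):
    """2x2 counts: malware with/without the permission, benign with/without."""
    a = sum(1 for f, y in zip(column, labels) if f and y)
    b = sum(1 for f, y in zip(column, labels) if not f and y)
    c = sum(1 for f, y in zip(column, labels) if f and not y)
    d = sum(1 for f, y in zip(column, labels) if not f and not y)
    return a, b, c, d


def chi_square(a, b, c, d):
    """Pearson Chi-Square statistic for a 2x2 table (no continuity correction)."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:  # a margin is empty, so the table carries no information
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher's Exact p-value: sum the hypergeometric probabilities
    of all tables with the same margins that are no more likely than the
    observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


def rank_permissions(data, labels):
    """Drop constant permissions (assumed 'zero impact'), then rank the rest
    by ascending Fisher p-value."""
    ranked = []
    for name, column in data.items():
        if len(set(column)) < 2:
            continue
        table = contingency(column, labels)
        ranked.append((name, chi_square(*table), fisher_exact_p(*table)))
    ranked.sort(key=lambda row: row[2])
    return ranked


# Hypothetical toy dataset: 4 malware apps (label 1) followed by 4 benign apps (label 0).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
apps = {
    "SEND_SMS":      [1, 1, 1, 1, 0, 0, 0, 0],
    "INTERNET":      [1, 1, 1, 1, 1, 1, 1, 1],  # constant, so filtered out
    "READ_CONTACTS": [1, 1, 1, 0, 1, 0, 0, 0],
    "CAMERA":        [1, 0, 1, 0, 1, 0, 1, 0],
}

ranked = rank_permissions(apps, labels)
for name, chi2, p in ranked:
    print(f"{name:15s} chi2={chi2:.3f} fisher_p={p:.4f}")
```

In a fuller pipeline, the top-ranked permissions (for instance, those below a chosen p-value threshold, which is a design choice not fixed here) would form the feature set passed to a classifier.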

Related papers

DWFS-Obfuscation: Dynamic Weighted Feature Selection for Robust Malware Familial Classification under Obfuscation [1.4742348415878401]
We propose a dynamic weighted feature selection method that analyzes the importance and stability of features. We then utilize graph neural networks for classification, thereby improving the robustness and accuracy of the detection system. Experiments demonstrate that our proposed method achieves an F1-score of 95.56% on the unobfuscated dataset and 92.28% on the obfuscated dataset.
arXiv Detail & Related papers (2025-04-10T09:37:43Z)
Towards a Trustworthy Anomaly Detection for Critical Applications through Approximated Partial AUC Loss [2.09942566943801]
A binary classifier is trained to optimize the specific range of the AUC ROC curve that prevents the True Positive Rate (TPR) from reaching 100% while minimizing the False Positive Rate (FPR). The results show a TPR of 92.52% at a 20.43% FPR on average across 6 datasets, representing a TPR improvement of 4.3% for an FPR cost of 12.2% against other state-of-the-art methods.
arXiv Detail & Related papers (2025-02-17T08:59:59Z)
Leveraging Large Language Models for Cybersecurity: Enhancing SMS Spam Detection with Robust and Context-Aware Text Classification [4.281580125566764]
This study evaluates the effectiveness of different feature extraction techniques and classification algorithms in detecting spam messages within SMS data. We found that TF-IDF, when paired with Naive Bayes, Support Vector Machines, or Deep Neural Networks, provides the most reliable performance.
arXiv Detail & Related papers (2025-02-16T06:39:36Z)
Certified Robustness Under Bounded Levenshtein Distance [55.54271307451233]
We propose the first method for computing the Lipschitz constant of convolutional classifiers with respect to the Levenshtein distance. Our method, LipsLev, is able to obtain 38.80% and 13.93% verified accuracy at distance 1 and 2 respectively.
arXiv Detail & Related papers (2025-01-23T13:58:53Z)
The Effectiveness of Edge Detection Evaluation Metrics for Automated Coastline Detection [2.5311562666866494]
We evaluate RMSE, PSNR, SSIM and FOM for automated coastline detection. We apply Canny edge detection to 95 coastline satellite images across 49 testing locations. FOM was the most reliable metric for selecting the best threshold.
arXiv Detail & Related papers (2024-05-19T09:51:10Z)
Leveraging Large Language Models to Detect npm Malicious Packages [4.479741014073169]
This study empirically evaluates the effectiveness of Large Language Models (LLMs) in detecting malicious code. We present SocketAI, a code review workflow for detecting malicious code.
arXiv Detail & Related papers (2024-03-18T19:10:12Z)
Small Effect Sizes in Malware Detection? Make Harder Train/Test Splits! [51.668411293817464]
Industry practitioners care about small improvements in malware detection accuracy because their models are deployed to hundreds of millions of machines. Academic research is often restrained to public datasets on the order of ten thousand samples. We devise an approach to generate a benchmark of difficulty from a pool of available samples.
arXiv Detail & Related papers (2023-12-25T21:25:55Z)
Towards a Fair Comparison and Realistic Design and Evaluation Framework of Android Malware Detectors [63.75363908696257]
We analyze 10 influential research works on Android malware detection using a common evaluation framework. We identify five factors that, if not taken into account when creating datasets and designing detectors, significantly affect the trained ML models. We conclude that the studied ML-based detectors have been evaluated optimistically, which justifies the good published results.
arXiv Detail & Related papers (2022-05-25T08:28:08Z)
Mate! Are You Really Aware? An Explainability-Guided Testing Framework for Robustness of Malware Detectors [49.34155921877441]
We propose an explainability-guided and model-agnostic testing framework for robustness of malware detectors. We then use this framework to test several state-of-the-art malware detectors' abilities to detect manipulated malware. Our findings shed light on the limitations of current malware detectors, as well as how they can be improved.
arXiv Detail & Related papers (2021-11-19T08:02:38Z)
Leveraging Uncertainty for Improved Static Malware Detection Under Extreme False Positive Constraints [21.241478970181912]
We show how ensembling and Bayesian treatments of machine learning methods for static malware detection allow for improved identification of model errors. In particular, we improve the true positive rate (TPR) at an actual realized FPR of 1e-5 from an expected 0.69 for previous methods to 0.80 on the best performing model class on the Sophos industry scale dataset.
arXiv Detail & Related papers (2021-08-09T14:30:23Z)
Dynamic detection of mobile malware using smartphone data and machine learning [0.0]
Mobile malware are malicious programs that target mobile devices. The number of active smartphone users is expected to grow, stressing the importance of research on the detection of mobile malware. In this paper, we provide an overview of the performance of machine learning (ML) techniques to detect malware on Android, without using privileged access.
arXiv Detail & Related papers (2021-07-23T12:33:14Z)
Identification of Significant Permissions for Efficient Android Malware Detection [2.179313476241343]
One out of every five business/industry mobile applications leaks sensitive personal data. Traditional signature/heuristic-based malware detection systems are unable to cope with current malware challenges. We propose an efficient Android malware detection system using machine learning and deep neural networks.
arXiv Detail & Related papers (2021-02-28T22:07:08Z)
Real-Time Anomaly Detection in Edge Streams [49.26098240310257]
We propose MIDAS, which focuses on detecting microcluster anomalies, or suddenly arriving groups of suspiciously similar edges. We further propose MIDAS-F, to solve the problem by which anomalies are incorporated into the algorithm's internal states. Experiments show that MIDAS-F has significantly higher accuracy than MIDAS.
arXiv Detail & Related papers (2020-09-17T17:59:27Z)
Maat: Automatically Analyzing VirusTotal for Accurate Labeling and Effective Malware Detection [71.84087757644708]
The malware analysis and detection research community relies on the online platform VirusTotal to label Android apps based on the scan results of around 60 scanners. There are no standards on how to best interpret the scan results acquired from VirusTotal, which leads to the utilization of different threshold-based labeling strategies. We implemented a method, Maat, that tackles these issues of standardization and sustainability by automatically generating a Machine Learning (ML)-based labeling scheme.
arXiv Detail & Related papers (2020-07-01T14:15:03Z)
Robust Spammer Detection by Nash Reinforcement Learning [64.80986064630025]
We develop a minimax game where the spammers and spam detectors compete with each other on their practical goals. We show that an optimization algorithm can reliably find an equilibrial detector that can robustly prevent spammers with any mixed spamming strategies from attaining their practical goal.
arXiv Detail & Related papers (2020-06-10T21:18:07Z)
Phishing URL Detection Through Top-level Domain Analysis: A Descriptive Approach [3.494620587853103]
This study aims to develop a machine-learning model to detect fraudulent URLs which can be used within the Splunk platform. Inspired by similar approaches in the literature, we trained the SVM and Random Forests algorithms using malicious and benign datasets. We evaluated the algorithms' performance with precision and recall, reaching up to 85% precision and 87% recall in the case of Random Forests.
arXiv Detail & Related papers (2020-05-13T21:41:29Z)

This list is automatically generated from the titles and abstracts of the papers in this site.