An explainable Recursive Feature Elimination to detect Advanced Persistent Threats using Random Forest classifier
- URL: http://arxiv.org/abs/2511.09603v1
- Date: Fri, 14 Nov 2025 01:01:17 GMT
- Title: An explainable Recursive Feature Elimination to detect Advanced Persistent Threats using Random Forest classifier
- Authors: Noor Hazlina Abdul Mutalib, Aznul Qalid Md Sabri, Ainuddin Wahid Abdul Wahab, Erma Rahayu Mohd Faizal Abdullah, Nouar AlDahoul,
- Abstract summary: We propose an explainable intrusion detection framework that integrates Recursive Feature Elimination (RFE) with Random Forest (RF) to enhance detection of Advanced Persistent Threats (APTs)<n>Our experiment demonstrates that the explainable RF-RFE achieved a detection accuracy of 99.9%, reducing false positive and computational cost in comparison to traditional classifiers.
- Score: 1.565870461096057
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Intrusion Detection Systems (IDS) play a vital role in modern cybersecurity frameworks by providing a primary defense mechanism against sophisticated threat actors. In this paper, we propose an explainable intrusion detection framework that integrates Recursive Feature Elimination (RFE) with Random Forest (RF) to enhance detection of Advanced Persistent Threats (APTs). By using CICIDS2017 dataset, the approach begins with comprehensive data preprocessing and narrows down the most significant features via RFE. A Random Forest (RF) model was trained on the refined feature set, with SHapley Additive exPlanations (SHAP) used to interpret the contribution of each selected feature. Our experiment demonstrates that the explainable RF-RFE achieved a detection accuracy of 99.9%, reducing false positive and computational cost in comparison to traditional classifiers. The findings underscore the effectiveness of integrating explainable AI and feature selection to develop a robust, transparent, and deployable IDS solution.
Related papers
- Source-Free Object Detection with Detection Transformer [59.33653163035064]
Source-Free Object Detection (SFOD) enables knowledge transfer from a source domain to an unsupervised target domain for object detection without access to source data.<n>Most existing SFOD approaches are either confined to conventional object detection (OD) models like Faster R-CNN or designed as general solutions without tailored adaptations for novel OD architectures, especially Detection Transformer (DETR)<n>In this paper, we introduce Feature Reweighting ANd Contrastive Learning NetworK (FRANCK), a novel SFOD framework specifically designed to perform query-centric feature enhancement for DETRs.
arXiv Detail & Related papers (2025-10-13T07:35:04Z) - Ensemble of Precision-Recall Curve (PRC) Classification Trees with Autoencoders [3.060720241524644]
Anomaly detection underpins critical applications from network security to intrusion detection to fraud prevention.<n>To combat the former, we previously introduced Precision-Recall Curve (PRC) classification trees and their ensemble extension, the PRC Random Forest (PRC-RF)<n>Building on that foundation, we now propose a hybrid framework that integrates PRC-RF with autoencoders, unsupervised machine learning methods that learn compact latent representations.
arXiv Detail & Related papers (2025-09-06T16:39:22Z) - An Uncertainty-aware DETR Enhancement Framework for Object Detection [10.102900613370817]
We propose an uncertainty-aware enhancement framework for DETR-based object detectors.<n>We derive a Bayes Risk formulation to filter high-risk information and improve detection reliability.<n> Experiments on the COCO benchmark show that our method can be effectively integrated into existing DETR variants.
arXiv Detail & Related papers (2025-07-20T07:53:04Z) - Random Forest Autoencoders for Guided Representation Learning [9.97014316910538]
RF-AE is a kernel-based framework for out-of-sample unsupervised visualization.<n>It combines the flexibility of neuralencoders with the supervised strengths of random forests and the geometry by captured RF-PHATE data.<n>RF-AE is robust to the choice of kernel parameters and generalizes to any hyper-based dimensionality reduction method.
arXiv Detail & Related papers (2025-02-18T20:02:29Z) - A Hybrid Framework for Statistical Feature Selection and Image-Based Noise-Defect Detection [55.2480439325792]
This paper presents a hybrid framework that integrates both statistical feature selection and classification techniques to improve defect detection accuracy.<n>We present around 55 distinguished features that are extracted from industrial images, which are then analyzed using statistical methods.<n>By integrating these methods with flexible machine learning applications, the proposed framework improves detection accuracy and reduces false positives and misclassifications.
arXiv Detail & Related papers (2024-12-11T22:12:21Z) - Open-Set Deepfake Detection: A Parameter-Efficient Adaptation Method with Forgery Style Mixture [81.93945602120453]
We introduce an approach that is both general and parameter-efficient for face forgery detection.<n>We design a forgery-style mixture formulation that augments the diversity of forgery source domains.<n>We show that the designed model achieves state-of-the-art generalizability with significantly reduced trainable parameters.
arXiv Detail & Related papers (2024-08-23T01:53:36Z) - Small Object Detection via Coarse-to-fine Proposal Generation and
Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z) - DistriBlock: Identifying adversarial audio samples by leveraging characteristics of the output distribution [16.74051650034954]
Adrial attacks can mislead automatic speech recognition systems into predicting an arbitrary target text.
We propose DistriBlock, an efficient detection strategy applicable to any ASR system that predicts a probability distribution over output tokens in each time step.
We demonstrate the supreme performance of this approach, with a mean area under the receiver operating characteristic curve for distinguishing target adversarial examples against clean and noisy data of 99% and 97%.
arXiv Detail & Related papers (2023-05-26T14:59:28Z) - Free Lunch for Generating Effective Outlier Supervision [46.37464572099351]
We propose an ultra-effective method to generate near-realistic outlier supervision.
Our proposed textttBayesAug significantly reduces the false positive rate over 12.50% compared with the previous schemes.
arXiv Detail & Related papers (2023-01-17T01:46:45Z) - Toward Certified Robustness Against Real-World Distribution Shifts [65.66374339500025]
We train a generative model to learn perturbations from data and define specifications with respect to the output of the learned model.
A unique challenge arising from this setting is that existing verifiers cannot tightly approximate sigmoid activations.
We propose a general meta-algorithm for handling sigmoid activations which leverages classical notions of counter-example-guided abstraction refinement.
arXiv Detail & Related papers (2022-06-08T04:09:13Z) - ADASYN-Random Forest Based Intrusion Detection Model [0.0]
Intrusion detection has been a key topic in the field of cyber security, and the common network threats nowadays have the characteristics of varieties and variation.
Considering the serious imbalance of intrusion detection datasets, using ADASYN oversampling method to balance datasets was proposed.
It has better performance, generalization ability and robustness compared with traditional machine learning models.
arXiv Detail & Related papers (2021-05-10T12:22:36Z) - Bayesian Optimization with Machine Learning Algorithms Towards Anomaly
Detection [66.05992706105224]
In this paper, an effective anomaly detection framework is proposed utilizing Bayesian Optimization technique.
The performance of the considered algorithms is evaluated using the ISCX 2012 dataset.
Experimental results show the effectiveness of the proposed framework in term of accuracy rate, precision, low-false alarm rate, and recall.
arXiv Detail & Related papers (2020-08-05T19:29:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.