Decoding the MITRE Engenuity ATT&CK Enterprise Evaluation: An Analysis of EDR Performance in Real-World Environments
- URL: http://arxiv.org/abs/2401.15878v2
- Date: Mon, 22 Apr 2024 20:42:27 GMT
- Title: Decoding the MITRE Engenuity ATT&CK Enterprise Evaluation: An Analysis of EDR Performance in Real-World Environments
- Authors: Xiangmin Shen, Zhenyuan Li, Graham Burleigh, Lingzhi Wang, Yan Chen
- Abstract summary: In light of the growing significance of endpoint detection and response (EDR) systems, many cybersecurity providers have developed their own solutions.
It's crucial for users to assess the capabilities of these detection engines to make informed decisions about which products to choose.
In this paper, we thoroughly analyzed MITRE evaluation results to gain further insights into real-world systems under test.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Endpoint detection and response (EDR) systems have emerged as a critical component of enterprise security solutions, effectively combating endpoint threats such as APT attacks with extended lifecycles. In light of their growing significance, many cybersecurity providers have developed proprietary EDR solutions. It is crucial for users to assess the capabilities of these detection engines in order to make informed decisions about which products to choose. This is especially urgent given the size of the market, which was expected to reach around 3.7 billion dollars by 2023 and is still expanding. MITRE is a leading organization in cyber threat analysis. In 2018, MITRE began conducting annual APT emulations that cover major EDR vendors worldwide; the reported indicators include telemetry, detection and blocking capability, and more. Nevertheless, the evaluation results published by MITRE do not contain any further interpretation or suggestions. In this paper, we thoroughly analyze MITRE evaluation results to gain further insights into the real-world EDR systems under test. Specifically, we design a whole-graph analysis method that utilizes additional control-flow and data-flow information to measure the performance of EDR systems. In addition, we analyze MITRE's evaluation results over multiple years from various aspects, including detection coverage, detection confidence, detection modifiers, data sources, and compatibility. Through these studies, we have compiled a thorough summary of our findings and gained valuable insights from the evaluation results. We believe these summaries and insights can assist researchers, practitioners, and vendors in better understanding the strengths and limitations of mainstream EDR products.
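The flavor of the whole-graph analysis can be illustrated with a minimal sketch. This is a hypothetical scoring scheme, not the paper's exact algorithm: ATT&CK technique IDs are assumed as graph nodes, edges stand in for the control/data-flow links between emulation steps, and a "flow-aware" score credits detections that can be chained forward along those links.

```python
# Illustrative sketch (NOT the paper's algorithm): score an EDR's coverage
# over an attack graph whose edges encode control/data flow between
# emulated techniques. A detected node also "covers" its flow successors,
# reflecting that flow context lets an analyst correlate downstream steps.

def graph_coverage(edges, detected):
    """edges: dict mapping node -> list of flow successors.
    detected: set of nodes the EDR raised a detection on.
    Returns (plain coverage, flow-aware coverage) as fractions."""
    nodes = set(edges) | {v for succs in edges.values() for v in succs}
    plain = len(detected & nodes) / len(nodes)

    # Propagate coverage forward along flow edges to a fixed point.
    covered = set(detected & nodes)
    changed = True
    while changed:
        changed = False
        for u, succs in edges.items():
            if u in covered:
                for v in succs:
                    if v not in covered:
                        covered.add(v)
                        changed = True
    return plain, len(covered) / len(nodes)
```

For example, with a hypothetical chain phishing (T1566) -> scripting (T1059) -> credential dumping (T1003) and only T1059 detected, plain coverage is 1/3 while flow-aware coverage is 2/3, since the T1003 step is reachable from the detected node.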
Related papers
- RCAEval: A Benchmark for Root Cause Analysis of Microservice Systems with Telemetry Data
Root cause analysis (RCA) for microservice systems has gained significant attention in recent years.
However, there is still no standard benchmark that includes large-scale datasets and supports comprehensive evaluation environments.
We introduce RCAEval, an open-source benchmark that provides datasets and an evaluation environment for RCA in microservice systems.
arXiv Detail & Related papers (2024-12-22T13:30:02Z)
- A Scalable Data-Driven Framework for Systematic Analysis of SEC 10-K Filings Using Large Language Models
We propose a novel data-driven approach to analyze and rate the performance of companies based on their SEC 10-K filings.
The proposed scheme is then implemented on an interactive GUI as a no-code solution for running the data pipeline and creating the visualizations.
The application showcases the rating results and provides year-on-year comparisons of company performance.
arXiv Detail & Related papers (2024-09-26T06:57:22Z)
- Trustworthiness in Retrieval-Augmented Generation Systems: A Survey
Retrieval-Augmented Generation (RAG) has quickly grown into a pivotal paradigm in the development of Large Language Models (LLMs).
We propose a unified framework that assesses the trustworthiness of RAG systems across six key dimensions: factuality, robustness, fairness, transparency, accountability, and privacy.
arXiv Detail & Related papers (2024-09-16T09:06:44Z)
- Long-Range Biometric Identification in Real World Scenarios: A Comprehensive Evaluation Framework Based on Missions
This paper evaluates research solutions for identifying individuals at ranges and altitudes.
By fusing face and body features, we propose developing robust biometric systems for effective long-range identification.
arXiv Detail & Related papers (2024-09-03T02:17:36Z)
- EARBench: Towards Evaluating Physical Risk Awareness for Task Planning of Foundation Model-based Embodied AI Agents
Embodied artificial intelligence (EAI) integrates advanced AI models into physical entities for real-world interaction.
Foundation models as the "brain" of EAI agents for high-level task planning have shown promising results.
However, the deployment of these agents in physical environments presents significant safety challenges.
This study introduces EARBench, a novel framework for automated physical risk assessment in EAI scenarios.
arXiv Detail & Related papers (2024-08-08T13:19:37Z)
- Robust Neural Information Retrieval: An Adversarial and Out-of-distribution Perspective
The robustness of neural information retrieval (IR) models has garnered significant attention.
We view the robustness of IR to be a multifaceted concept, emphasizing its necessity against adversarial attacks, out-of-distribution (OOD) scenarios and performance variance.
We provide an in-depth discussion of existing methods, datasets, and evaluation metrics, shedding light on challenges and future directions in the era of large language models.
arXiv Detail & Related papers (2024-07-09T16:07:01Z)
- Unveiling the Achilles' Heel of NLG Evaluators: A Unified Adversarial Framework Driven by Large Language Models
We introduce AdvEval, a novel black-box adversarial framework against NLG evaluators.
AdvEval is specially tailored to generate data that yield strong disagreements between human and victim evaluators.
We conduct experiments on 12 victim evaluators and 11 NLG datasets, spanning tasks including dialogue, summarization, and question evaluation.
arXiv Detail & Related papers (2024-05-23T14:48:15Z)
- The VoicePrivacy 2020 Challenge: Results and findings
The first VoicePrivacy 2020 Challenge focuses on developing anonymization solutions for speech technology.
We provide a systematic overview of the challenge design with an analysis of submitted systems and evaluation results.
arXiv Detail & Related papers (2021-09-01T23:40:38Z)
- RANK: AI-assisted End-to-End Architecture for Detecting Persistent Attacks in Enterprise Networks
We present an end-to-end AI-assisted architecture for detecting Advanced Persistent Threats (APTs).
The architecture is composed of four consecutive steps: 1) alert templating and merging, 2) alert graph construction, 3) alert graph partitioning into incidents, and 4) incident scoring and ordering.
Extensive results are provided, showing a three-order-of-magnitude reduction in the amount of data to be reviewed by the analyst, along with novel extraction of incidents and security-oriented scoring of the extracted incidents.
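The four consecutive steps above can be sketched in miniature. This is a hypothetical data model and set of heuristics, not RANK's implementation: alerts are assumed to carry a template name and severity, links stand in for shared entities (hosts, users) between alerts, incidents are taken to be connected components, and scoring is a simple severity sum.

```python
# Hypothetical sketch of the four-stage pipeline (illustrative only, not
# the paper's implementation): 1) merge alerts into templates, 2) build an
# alert graph, 3) partition into incidents via connected components,
# 4) score and order incidents, worst first.
from collections import defaultdict

def rank_incidents(alerts, links):
    """alerts: list of (template, severity) pairs, possibly duplicated.
    links: list of (template, template) edges, e.g. shared hosts/users."""
    # 1) Alert templating and merging: one node per template,
    #    keeping the highest observed severity.
    severity = {}
    for template, sev in alerts:
        severity[template] = max(sev, severity.get(template, 0))

    # 2) Alert graph construction (undirected adjacency lists).
    adj = defaultdict(set)
    for a, b in links:
        adj[a].add(b)
        adj[b].add(a)

    # 3) Partition the graph into incidents = connected components.
    seen, incidents = set(), []
    for node in severity:
        if node in seen:
            continue
        comp, stack = set(), [node]
        while stack:
            u = stack.pop()
            if u in comp:
                continue
            comp.add(u)
            stack.extend(adj[u] - comp)
        seen |= comp
        incidents.append(comp)

    # 4) Incident scoring (summed severity) and ordering, worst first.
    scored = [(sum(severity[n] for n in comp), sorted(comp))
              for comp in incidents]
    return sorted(scored, reverse=True)
```

For instance, two "brute_force" alerts (severities 3 and 5) linked to an "exfil" alert (severity 8) would merge and group into one incident scored 13, ranked above an isolated "scan" incident scored 1; the point of such a pipeline is that the analyst triages a handful of ranked incidents instead of the raw alert stream.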
arXiv Detail & Related papers (2021-01-06T15:59:51Z)
- Deep Learning for Person Re-identification: A Survey and Outlook
Person re-identification (Re-ID) aims at retrieving a person of interest across multiple non-overlapping cameras.
By dissecting the components involved in developing a person Re-ID system, we categorize existing approaches into closed-world and open-world settings.
arXiv Detail & Related papers (2020-01-13T12:49:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.