Related papers: Evasion-Resilient Detection of DNS-over-HTTPS Data Exfiltration: A Practical Evaluation and Toolkit

Evasion-Resilient Detection of DNS-over-HTTPS Data Exfiltration: A Practical Evaluation and Toolkit

URL: http://arxiv.org/abs/2512.20423v1
Date: Tue, 23 Dec 2025 15:07:17 GMT
Title: Evasion-Resilient Detection of DNS-over-HTTPS Data Exfiltration: A Practical Evaluation and Toolkit
Authors: Adam Elaoumari,
Abstract summary: This project aims to assess how well defenders can detect DNS-over-HTTPS (DoH) file exfiltration, and which evasion strategies can be used by attackers.<n>The originality of this project is the introduction of an end-to-end, containerized pipeline that generates file exfiltration over DoH.<n>The pipeline contains a prediction side, which allows the training of machine learning models based on public labelled datasets.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The purpose of this project is to assess how well defenders can detect DNS-over-HTTPS (DoH) file exfiltration, and which evasion strategies can be used by attackers. While providing a reproducible toolkit to generate, intercept and analyze DoH exfiltration, and comparing Machine Learning vs threshold-based detection under adversarial scenarios. The originality of this project is the introduction of an end-to-end, containerized pipeline that generates configurable file exfiltration over DoH using several parameters (e.g., chunking, encoding, padding, resolver rotation). It allows for file reconstruction at the resolver side, while extracting flow-level features using a fork of DoHLyzer. The pipeline contains a prediction side, which allows the training of machine learning models based on public labelled datasets and then evaluates them side-by-side with threshold-based detection methods against malicious and evasive DNS-Over-HTTPS traffic. We train Random Forest, Gradient Boosting and Logistic Regression classifiers on a public DoH dataset and benchmark them against evasive DoH exfiltration scenarios. The toolkit orchestrates traffic generation, file capture, feature extraction, model training and analysis. The toolkit is then encapsulated into several Docker containers for easy setup and full reproducibility regardless of the platform it is run on. Future research regarding this project is directed at validating the results on mixed enterprise traffic, extending the protocol coverage to HTTP/3/QUIC request, adding a benign traffic generation, and working on real-time traffic evaluation. A key objective is to quantify when stealth constraints make DoH exfiltration uneconomical and unworthy for the attacker.

Related papers

Silent Egress: When Implicit Prompt Injection Makes LLM Agents Leak Without a Trace [0.0]
We show that adversarial instructions embedded in automatically generated URL previews can introduce a system-level risk that we refer to as silent egress.<n>Using a fully local and reproducible testbed, we demonstrate that a malicious web page can induce an agent to issue outbound requests that exfiltrate sensitive runtime context.<n>In 480 experimental runs with a qwen2.5:7b-based agent, the attack succeeds with high probability (P (egress) =0.89), and 95% of successful attacks are not detected by output-based safety checks.
arXiv Detail & Related papers (2026-02-25T22:26:23Z)
Automated and Explainable Denial of Service Analysis for AI-Driven Intrusion Detection Systems [5.975446818626117]
This paper presents an automated framework for detecting and interpreting DDoS attacks using machine learning (ML)<n>By combining TPOT's automated pipeline selection with SHAP interpretability, this approach improves the accuracy and transparency of DDoS detection.
arXiv Detail & Related papers (2025-11-06T07:01:38Z)
FL-PLAS: Federated Learning with Partial Layer Aggregation for Backdoor Defense Against High-Ratio Malicious Clients [7.1383449614815415]
Federated learning (FL) is gaining increasing attention as an emerging collaborative machine learning approach.<n>The fundamental algorithm of FL, Federated Averaging (FedAvg), is susceptible to backdoor attacks.<n>We propose a novel defense algorithm, FL-PLAS, which can effectively protect local models from backdoor attacks.
arXiv Detail & Related papers (2025-05-17T14:16:47Z)
CENSOR: Defense Against Gradient Inversion via Orthogonal Subspace Bayesian Sampling [63.07948989346385]
Federated learning collaboratively trains a neural network on a global server.<n>Each local client receives the current global model weights and sends back parameter updates (gradients) based on its local private data.<n>Existing gradient inversion attacks can exploit this vulnerability to recover private training instances from a client's gradient vectors.<n>We present a novel defense tailored for large neural network models.
arXiv Detail & Related papers (2025-01-27T01:06:23Z)
AdvQDet: Detecting Query-Based Adversarial Attacks with Adversarial Contrastive Prompt Tuning [93.77763753231338]
Adversarial Contrastive Prompt Tuning (ACPT) is proposed to fine-tune the CLIP image encoder to extract similar embeddings for any two intermediate adversarial queries. We show that ACPT can detect 7 state-of-the-art query-based attacks with $>99%$ detection rate within 5 shots. We also show that ACPT is robust to 3 types of adaptive attacks.
arXiv Detail & Related papers (2024-08-04T09:53:50Z)
A Flow is a Stream of Packets: A Stream-Structured Data Approach for DDoS Detection [32.22817720403158]
We propose a new tree-based DDoS detection approach that operates on a flow as a stream structure. Our approach matches or exceeds existing machine learning techniques' accuracy, including state-of-the-art deep learning methods.
arXiv Detail & Related papers (2024-05-12T09:29:59Z)
CrowdGuard: Federated Backdoor Detection in Federated Learning [39.58317527488534]
This paper presents a novel defense mechanism, CrowdGuard, that effectively mitigates backdoor attacks in Federated Learning. CrowdGuard employs a server-located stacked clustering scheme to enhance its resilience to rogue client feedback. The evaluation results demonstrate that CrowdGuard achieves a 100% True-Positive-Rate and True-Negative-Rate across various scenarios.
arXiv Detail & Related papers (2022-10-14T11:27:49Z)
Black-box Dataset Ownership Verification via Backdoor Watermarking [67.69308278379957]
We formulate the protection of released datasets as verifying whether they are adopted for training a (suspicious) third-party model. We propose to embed external patterns via backdoor watermarking for the ownership verification to protect them. Specifically, we exploit poison-only backdoor attacks ($e.g.$, BadNets) for dataset watermarking and design a hypothesis-test-guided method for dataset verification.
arXiv Detail & Related papers (2022-08-04T05:32:20Z)
Finding Phish in a Haystack: A Pipeline for Phishing Classification on Certificate Transparency Logs [0.5512295869673147]
phishing prevention techniques mainly utilize reactive blocklists, which leave a window of opportunity'' for attackers during which victims are unprotected. One possible approach to shorten this window aims to detect phishing attacks earlier, during website preparation, by monitoring Certificate Transparency (CT) logs. Previous attempts to work with CT log data for phishing classification exist, however they lack evaluations on actual CT log data. We present a pipeline that facilitates such evaluations by addressing a number of problems when working with CT log data.
arXiv Detail & Related papers (2021-06-23T12:24:19Z)
Attribution of Gradient Based Adversarial Attacks for Reverse Engineering of Deceptions [16.23543028393521]
We present two techniques that support automated identification and attribution of adversarial ML attack toolchains. To the best of our knowledge, this is the first approach to attribute gradient based adversarial attacks and estimate their parameters.
arXiv Detail & Related papers (2021-03-19T19:55:00Z)
Malware Traffic Classification: Evaluation of Algorithms and an Automated Ground-truth Generation Pipeline [8.779666771357029]
We propose an automated packet data-labeling pipeline to generate ground-truth data. We explore and test different kind of clustering approaches which make use of unique and diverse set of features extracted from this observable meta-data.
arXiv Detail & Related papers (2020-10-22T11:48:51Z)
Adversarial EXEmples: A Survey and Experimental Evaluation of Practical Attacks on Machine Learning for Windows Malware Detection [67.53296659361598]
adversarial EXEmples can bypass machine learning-based detection by perturbing relatively few input bytes. We develop a unifying framework that does not only encompass and generalize previous attacks against machine-learning models, but also includes three novel attacks. These attacks, named Full DOS, Extend and Shift, inject the adversarial payload by respectively manipulating the DOS header, extending it, and shifting the content of the first section.
arXiv Detail & Related papers (2020-08-17T07:16:57Z)
End-to-End Object Detection with Transformers [88.06357745922716]
We present a new method that views object detection as a direct set prediction problem. Our approach streamlines the detection pipeline, effectively removing the need for many hand-designed components. The main ingredients of the new framework, called DEtection TRansformer or DETR, are a set-based global loss.
arXiv Detail & Related papers (2020-05-26T17:06:38Z)
PyODDS: An End-to-end Outlier Detection System with Automated Machine Learning [55.32009000204512]
We present PyODDS, an automated end-to-end Python system for Outlier Detection with Database Support. Specifically, we define the search space in the outlier detection pipeline, and produce a search strategy within the given search space. It also provides unified interfaces and visualizations for users with or without data science or machine learning background.
arXiv Detail & Related papers (2020-03-12T03:30:30Z)

This list is automatically generated from the titles and abstracts of the papers in this site.