Related papers: WebGuard++:Interpretable Malicious URL Detection via Bidirectional Fusion of HTML Subgraphs and Multi-Scale Convolutional BERT

WebGuard++:Interpretable Malicious URL Detection via Bidirectional Fusion of HTML Subgraphs and Multi-Scale Convolutional BERT

URL: http://arxiv.org/abs/2506.19356v1
Date: Tue, 24 Jun 2025 06:36:51 GMT
Title: WebGuard++:Interpretable Malicious URL Detection via Bidirectional Fusion of HTML Subgraphs and Multi-Scale Convolutional BERT
Authors: Ye Tian, Zhang Yumin, Yifan Jia, Jianguo Sun, Yanbin Wang,
Abstract summary: URL+ HTML feature fusion shows promise for robust malicious URL detection, since attacker artifacts persist in DOM structures.<n>We present WebGuard++, a detection framework with 4 novel components.<n> Experiments show WebGuard++ achieves significant improvements over state-of-the-art baselines.
Score: 3.6220178465092503
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: URL+HTML feature fusion shows promise for robust malicious URL detection, since attacker artifacts persist in DOM structures. However, prior work suffers from four critical shortcomings: (1) incomplete URL modeling, failing to jointly capture lexical patterns and semantic context; (2) HTML graph sparsity, where threat-indicative nodes (e.g., obfuscated scripts) are isolated amid benign content, causing signal dilution during graph aggregation; (3) unidirectional analysis, ignoring URL-HTML feature bidirectional interaction; and (4) opaque decisions, lacking attribution to malicious DOM components. To address these challenges, we present WebGuard++, a detection framework with 4 novel components: 1) Cross-scale URL Encoder: Hierarchically learns local-to-global and coarse to fine URL features based on Transformer network with dynamic convolution. 2) Subgraph-aware HTML Encoder: Decomposes DOM graphs into interpretable substructures, amplifying sparse threat signals via Hierarchical feature fusion. 3) Bidirectional Coupling Module: Aligns URL and HTML embeddings through cross-modal contrastive learning, optimizing inter-modal consistency and intra-modal specificity. 4) Voting Module: Localizes malicious regions through consensus voting on malicious subgraph predictions. Experiments show WebGuard++ achieves significant improvements over state-of-the-art baselines, achieving 1.1x-7.9x higher TPR at fixed FPR of 0.001 and 0.0001 across both datasets.

Related papers

Breaking Obfuscation: Cluster-Aware Graph with LLM-Aided Recovery for Malicious JavaScript Detection [9.83040332336481]
Malicious JavaScript code poses significant threats to user privacy, system integrity, and enterprise security.<n>We propose DeCoda, a hybrid defense framework that combines large language model (LLM)-based deobfuscation with code graph learning.
arXiv Detail & Related papers (2025-07-30T07:46:49Z)
SG-Reg: Generalizable and Efficient Scene Graph Registration [23.3853919684438]
We design a scene graph network to encode multiple modalities of semantic nodes.<n>In the back-end, we employ a robust pose estimator to decide transformation according to the correspondences.<n>Our method achieves a slightly higher registration recall while requiring only 52 KB of communication bandwidth for each query frame.
arXiv Detail & Related papers (2025-04-20T01:22:40Z)
SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding [56.079013202051094]
We present SegVG, a novel method transfers the box-level annotation as signals to provide an additional pixel-level supervision for Visual Grounding. This approach allows us to iteratively exploit the annotation as signals for both box-level regression and pixel-level segmentation.
arXiv Detail & Related papers (2024-07-03T15:30:45Z)
Part-aware Unified Representation of Language and Skeleton for Zero-shot Action Recognition [57.97930719585095]
We introduce Part-aware Unified Representation between Language and Skeleton (PURLS) to explore visual-semantic alignment at both local and global scales. Our approach is evaluated on various skeleton/language backbones and three large-scale datasets. The results showcase the universality and superior performance of PURLS, surpassing prior skeleton-based solutions and standard baselines from other domains.
arXiv Detail & Related papers (2024-06-19T08:22:32Z)
DGMamba: Domain Generalization via Generalized State Space Model [80.82253601531164]
Domain generalization(DG) aims at solving distribution shift problems in various scenes. Mamba, as an emerging state space model (SSM), possesses superior linear complexity and global receptive fields. We propose a novel framework for DG, named DGMamba, that excels in strong generalizability toward unseen domains.
arXiv Detail & Related papers (2024-04-11T14:35:59Z)
RAGFormer: Learning Semantic Attributes and Topological Structure for Fraud Detection [8.050935113945428]
We present a novel framework called Relation-Aware GNN with transFormer(RAGFormer)<n>RAGFormer embeds both semantic and topological features into a target node.<n>The simple yet effective network consists of a semantic encoder, a topology encoder, and an attention fusion module.
arXiv Detail & Related papers (2024-02-27T12:53:15Z)
TransURL: Improving malicious URL detection with multi-layer Transformer encoding and multi-scale pyramid features [9.873643699502853]
We propose a novel approach for malicious URL detection, named TransURL.<n>This method is implemented by co-training the character-aware Transformer with three feature modules.<n> Experimental results demonstrate a significant improvement compared to previous methods.
arXiv Detail & Related papers (2023-12-01T11:27:00Z)
PMANet: Malicious URL detection via post-trained language model guided multi-level feature attention network [16.73322002436809]
We propose PMANet, a pre-trained Language Model-Guided multi-level feature attention network.<n>PMANet employs a post-training process with three self-supervised objectives: masked language modeling, noisy language modeling, and domain discrimination.<n> Experiments on diverse scenarios, including small-scale data, class imbalance, and adversarial attacks, demonstrate PMANet's superiority over state-of-the-art models.
arXiv Detail & Related papers (2023-11-21T06:23:08Z)
Object Segmentation by Mining Cross-Modal Semantics [68.88086621181628]
We propose a novel approach by mining the Cross-Modal Semantics to guide the fusion and decoding of multimodal features. Specifically, we propose a novel network, termed XMSNet, consisting of (1) all-round attentive fusion (AF), (2) coarse-to-fine decoder (CFD), and (3) cross-layer self-supervision.
arXiv Detail & Related papers (2023-05-17T14:30:11Z)
Software Vulnerability Detection via Deep Learning over Disaggregated Code Graph Representation [57.92972327649165]
This work explores a deep learning approach to automatically learn the insecure patterns from code corpora. Because code naturally admits graph structures with parsing, we develop a novel graph neural network (GNN) to exploit both the semantic context and structural regularity of a program.
arXiv Detail & Related papers (2021-09-07T21:24:36Z)
Attentive WaveBlock: Complementarity-enhanced Mutual Networks for Unsupervised Domain Adaptation in Person Re-identification and Beyond [97.25179345878443]
This paper proposes a novel light-weight module, the Attentive WaveBlock (AWB) AWB can be integrated into the dual networks of mutual learning to enhance the complementarity and further depress noise in the pseudo-labels. Experiments demonstrate that the proposed method achieves state-of-the-art performance with significant improvements on multiple UDA person re-identification tasks.
arXiv Detail & Related papers (2020-06-11T15:40:40Z)

This list is automatically generated from the titles and abstracts of the papers in this site.