A Lightweight Defense Mechanism against Next Generation of Phishing Emails using Distilled Attention-Augmented BiLSTM
- URL: http://arxiv.org/abs/2602.22250v1
- Date: Tue, 24 Feb 2026 20:06:45 GMT
- Title: A Lightweight Defense Mechanism against Next Generation of Phishing Emails using Distilled Attention-Augmented BiLSTM
- Authors: Morteza Eskandarian, Mahdi Rabbani, Arun Kaniyamattam, Fatemeh Nejati, Mansur Mirani, Gunjan Piya, Igor Opushnyev, Ali A. Ghorbani, Sajjad Dadkhah
- Abstract summary: The MobileBERT teacher is fine-tuned before being distilled into a BiLSTM model with multi-head attention. The system demonstrates excellent performance in terms of accuracy and latency while maintaining a compact size. The paper examines system performance under high-traffic conditions, security measures for privacy protection, and implementation methods for operational deployment.
- Score: 34.0814379994364
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The current generation of large language models produces sophisticated social-engineering content that bypasses standard text screening systems in business communication platforms. Our proposed solution for mail gateway and endpoint deception detection operates in a privacy-protective manner while handling the performance requirements of network and mobile security systems. The MobileBERT teacher is fine-tuned before being distilled into a BiLSTM model with multi-head attention, which maintains semantic discrimination with only 4.5 million parameters. The hybrid dataset contains human-written messages together with LLM-generated paraphrases that use masking techniques and personalization methods to enhance resistance to modern attacks. The evaluation system uses five testing protocols, comprising human-only and LLM-only tests, two cross-distribution transfer tests, and a production-like mixed-traffic test, to assess performance in native environments, across different distribution types, and in combined traffic scenarios. On the mixture split, the distilled model stays within 1-2.5 weighted-F1 points of strong transformer baselines, including ModernBERT, DeBERTaV3-base, T5-base, DeepSeek-R1 Distill Qwen-1.5B, and Phi-4 mini, while achieving 80-95% faster inference times and 95-99% smaller model sizes. The system demonstrates excellent performance in terms of accuracy and latency while maintaining a compact size, which enables real-time filtering without acceleration hardware and supports policy-based management. The paper examines system performance under high-traffic conditions, security measures for privacy protection, and implementation methods for operational deployment.
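The distillation step described above (a fine-tuned MobileBERT teacher compressed into an attention-augmented BiLSTM student) follows the standard soft-target recipe. Below is a minimal PyTorch sketch of such a student and a blended distillation loss; the layer sizes, head count, temperature, and loss weighting are illustrative assumptions, not the paper's reported configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttnBiLSTMStudent(nn.Module):
    """Compact student: embedding -> BiLSTM -> multi-head self-attention -> classifier."""
    def __init__(self, vocab_size, embed_dim=128, hidden=256, heads=4, classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.bilstm = nn.LSTM(embed_dim, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.MultiheadAttention(2 * hidden, heads, batch_first=True)
        self.classifier = nn.Linear(2 * hidden, classes)

    def forward(self, token_ids):
        h, _ = self.bilstm(self.embed(token_ids))  # (batch, seq, 2*hidden)
        ctx, _ = self.attn(h, h, h)                # self-attention over BiLSTM states
        return self.classifier(ctx.mean(dim=1))   # mean-pool, then classify

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend temperature-softened KL against the teacher with hard-label cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # standard T^2 scaling keeps gradient magnitudes comparable
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```

During training, the fine-tuned teacher runs in eval mode to produce teacher_logits for each batch while only the student's parameters are updated.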
Related papers
- ProtoDCS: Towards Robust and Efficient Open-Set Test-Time Adaptation for Vision-Language Models [32.840734752367275]
Prototype-based Double-Check Separation (ProtoDCS) is a robust framework for OSTTA. It separates csID and csOOD samples, enabling safe and efficient adaptation of Vision-Language Models to csID data. ProtoDCS significantly boosts both known-class accuracy and OOD detection metrics.
arXiv Detail & Related papers (2026-02-27T03:39:02Z)
- PRISM: Performer RS-IMLE for Single-pass Multisensory Imitation Learning [51.24484551729328]
We introduce PRISM, a single-pass policy based on a batch-global rejection-sampling variant of IMLE. PRISM couples a temporal multisensory encoder with a linear-attention generator using a Performer architecture. We demonstrate the efficacy of PRISM on a diverse real-world hardware suite, including loco-manipulation using a Unitree Go2 with a 7-DoF D1 arm and tabletop manipulation with a UR5 manipulator.
arXiv Detail & Related papers (2026-02-02T17:57:37Z)
- Privacy-Preserving Offloading for Large Language Models in 6G Vehicular Networks [0.6524460254566904]
This paper presents a novel privacy-preserving offloading framework for 6G vehicular networks. We introduce a hybrid approach combining federated learning (FL) and differential privacy (DP) techniques to protect user data. Experimental results demonstrate that our approach achieves 75% global accuracy with only a 2-3% reduction compared to non-privacy-preserving methods.
arXiv Detail & Related papers (2025-08-30T10:08:28Z)
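The entry above pairs federated learning with differential privacy. One common realization is DP-FedAvg-style aggregation on the server: clip each client's update to a fixed L2 norm, average, then add Gaussian noise scaled to the clipping bound. The sketch below illustrates that generic mechanism; it is an assumption about how FL and DP are typically combined, not that paper's actual protocol, and all names and constants are hypothetical.

```python
import torch

def dp_fedavg_aggregate(client_updates, clip_norm=1.0, noise_multiplier=1.1):
    """Gaussian-mechanism aggregation: clip, average, then add calibrated noise.

    client_updates: list of flat tensors, one parameter-delta vector per client.
    """
    clipped = []
    for update in client_updates:
        # Bound each client's contribution to at most clip_norm in L2 norm.
        scale = torch.clamp(clip_norm / (update.norm() + 1e-12), max=1.0)
        clipped.append(update * scale)
    average = torch.stack(clipped).mean(dim=0)
    sigma = noise_multiplier * clip_norm / len(client_updates)
    return average + torch.randn_like(average) * sigma
```

The clipping bound caps any single client's influence on the global model, and the noise multiplier trades accuracy for the privacy budget, which is consistent with the small accuracy reduction the entry reports.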
- MCP-Guard: A Defense Framework for Model Context Protocol Integrity in Large Language Model Applications [21.70488724213541]
The integration of Large Language Models with external tools introduces critical security vulnerabilities. We propose MCP-Guard, a robust, layered defense architecture designed for LLM-tool interactions. We also introduce MCP-AttackBench, a benchmark of over 70,000 samples.
arXiv Detail & Related papers (2025-08-14T18:00:25Z)
- T2V-OptJail: Discrete Prompt Optimization for Text-to-Video Jailbreak Attacks [67.91652526657599]
We formalize the T2V jailbreak attack as a discrete optimization problem and propose a joint objective-based optimization framework, called T2V-OptJail. We conduct large-scale experiments on several T2V models, covering both open-source models and real commercial closed-source models. The proposed method improves attack success rate by 11.4% and 10.0% over the existing state-of-the-art method.
arXiv Detail & Related papers (2025-05-10T16:04:52Z)
- AegisLLM: Scaling Agentic Systems for Self-Reflective Defense in LLM Security [74.22452069013289]
AegisLLM is a cooperative multi-agent defense against adversarial attacks and information leakage. We show that scaling the agentic reasoning system at test time substantially enhances robustness without compromising model utility. Comprehensive evaluations across key threat scenarios, including unlearning and jailbreaking, demonstrate the effectiveness of AegisLLM.
arXiv Detail & Related papers (2025-04-29T17:36:05Z)
- T2VShield: Model-Agnostic Jailbreak Defense for Text-to-Video Models [88.63040835652902]
Text-to-video models are vulnerable to jailbreak attacks, where specially crafted prompts bypass safety mechanisms and lead to the generation of harmful or unsafe content. We propose T2VShield, a comprehensive and model-agnostic defense framework designed to protect text-to-video models from jailbreak threats. Our method systematically analyzes the input, model, and output stages to identify the limitations of existing defenses.
arXiv Detail & Related papers (2025-04-22T01:18:42Z)
- PLM: Efficient Peripheral Language Models Hardware-Co-Designed for Ubiquitous Computing [48.30406812516552]
We introduce PLM, a Peripheral Language Model, developed through a co-design process that jointly optimizes model architecture and edge system constraints. PLM employs a Multi-head Latent Attention mechanism and the squared ReLU activation function to encourage sparsity, thereby reducing peak memory footprint. Evaluation results demonstrate that PLM outperforms existing small language models trained on publicly available data.
arXiv Detail & Related papers (2025-03-15T15:11:17Z)
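The PLM entry above attributes activation sparsity (and hence a smaller peak memory footprint) to the squared ReLU. Below is a minimal sketch of that activation as it is commonly defined in the literature; it is not taken from that paper's code.

```python
import torch
import torch.nn.functional as F

def squared_relu(x: torch.Tensor) -> torch.Tensor:
    """ReLU(x)^2: negative inputs map to exactly zero, so activations stay
    sparse, while squaring keeps the gradient continuous at the origin."""
    return F.relu(x) ** 2
```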
- An Efficient Security Model for Industrial Internet of Things (IIoT) System Based on Machine Learning Principles [0.0]
This paper presents a security paradigm for edge devices to defend against various internal and external threats. The proposed paradigm is found to be effective against these threats and can be applied to a low-cost single-board computer.
arXiv Detail & Related papers (2025-02-10T14:20:13Z)
- Efficient Federated Intrusion Detection in 5G ecosystem using optimized BERT-based model [0.7100520098029439]
5G offers advanced services, supporting applications such as intelligent transportation, connected healthcare, and smart cities within the Internet of Things (IoT).
These advancements introduce significant security challenges, with increasingly sophisticated cyber-attacks.
This paper proposes a robust intrusion detection system (IDS) using federated learning and large language models (LLMs).
arXiv Detail & Related papers (2024-09-28T15:56:28Z)
- Federated Full-Parameter Tuning of Billion-Sized Language Models with Communication Cost under 18 Kilobytes [53.4856038354195]
Pre-trained large language models (LLMs) need fine-tuning to improve their responsiveness to natural language instructions.
FedKSeed employs zeroth-order optimization with a finite set of random seeds.
It significantly reduces transmission requirements between the server and clients to just a few random seeds.
arXiv Detail & Related papers (2023-12-11T13:03:21Z)
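The FedKSeed entry above reduces communication to a handful of random seeds: each seed deterministically reproduces a perturbation direction, so a client only needs to send (seed, scalar) pairs from a zeroth-order gradient estimate. The sketch below illustrates that general idea with a two-point (SPSA-style) estimator; the function names, step sizes, and replay loop are illustrative assumptions, not that paper's implementation.

```python
import torch

def zo_client_step(params, loss_fn, seeds, eps=1e-3, lr=1e-4):
    """Estimate a directional derivative per seed; only (seed, scalar) pairs
    leave the client -- a few bytes each, instead of full parameter tensors."""
    updates = []
    for seed in seeds:
        gen = torch.Generator().manual_seed(seed)
        z = [torch.randn(p.shape, generator=gen) for p in params]
        # Two-point finite-difference estimate of the loss gradient along z.
        loss_plus = loss_fn([p + eps * zi for p, zi in zip(params, z)])
        loss_minus = loss_fn([p - eps * zi for p, zi in zip(params, z)])
        grad_along_z = (loss_plus - loss_minus).item() / (2 * eps)
        updates.append((seed, lr * grad_along_z))
    return updates

def replay_updates(params, updates):
    """Server (or any client) regenerates each perturbation from its seed and
    applies the scaled step, reconstructing the update without receiving it."""
    with torch.no_grad():
        for seed, step in updates:
            gen = torch.Generator().manual_seed(seed)
            for p in params:
                p -= step * torch.randn(p.shape, generator=gen)
```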
This list is automatically generated from the titles and abstracts of the papers on this site.