Related papers: Different Victims, Same Layout: Email Visual Similarity Detection for Enhanced Email Protection

URL: http://arxiv.org/abs/2408.16945v3
Date: Wed, 4 Sep 2024 14:25:47 GMT
Title: Different Victims, Same Layout: Email Visual Similarity Detection for Enhanced Email Protection
Authors: Sachin Shukla, Omid Mirzaei
Abstract summary: We propose an email visual similarity detection approach, named Pisco, to improve the detection capabilities of an email threat defense system. Our results show that email kits are being reused extensively and visually similar emails are sent to our customers at various time intervals.
Score: 0.3683202928838613
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: In the pursuit of an effective spam detection system, the focus has often been on identifying known spam patterns either through rule-based detection systems or machine learning (ML) solutions that rely on keywords. However, both systems are susceptible to evasion techniques and zero-day attacks that can be achieved at low cost. Therefore, an email that bypassed the defense system once can do it again in the following days, even though rules are updated or the ML models are retrained. The recurrence of failures to detect emails that exhibit layout similarities to previously undetected spam is concerning for customers and can erode their trust in a company. Our observations show that threat actors reuse email kits extensively and can bypass detection with little effort, for example, by making changes to the content of emails. In this work, we propose an email visual similarity detection approach, named Pisco, to improve the detection capabilities of an email threat defense system. We apply our proof of concept to some real-world samples received from different sources. Our results show that email kits are being reused extensively and visually similar emails are sent to our customers at various time intervals. Therefore, this method could be very helpful in situations where detection engines that rely on textual features and keywords are bypassed, an occurrence our observations show happens frequently.
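
The abstract does not say how Pisco compares emails. As a minimal sketch of the general idea only, assuming each email has already been rendered and downscaled to a small grayscale thumbnail, an average hash illustrates why editing the text of a reused kit can leave the layout fingerprint almost unchanged. The function names and the 4x4 sample grids below are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: average hashing is a generic perceptual-hash
# technique, swapped in here as an assumption; it is not the paper's method.

def average_hash(pixels):
    """Return a tuple of bits: 1 where a pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if value > mean else 0 for value in flat)


def hamming_distance(hash_a, hash_b):
    """Count the positions at which two equal-length hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


def similarity(hash_a, hash_b):
    """Fraction of matching bits, from 0.0 (none) to 1.0 (identical)."""
    return 1 - hamming_distance(hash_a, hash_b) / len(hash_a)


# Hypothetical 4x4 grayscale thumbnails (0 = black, 255 = white).
# kit_a: dark header band, light body, dark call-to-action button.
kit_a = [
    [20, 20, 20, 20],
    [230, 230, 230, 230],
    [230, 40, 40, 230],
    [230, 230, 230, 230],
]
# kit_b: same layout with slightly different shades and one changed region.
kit_b = [
    [30, 30, 30, 30],
    [220, 225, 220, 220],
    [220, 50, 50, 220],
    [220, 220, 210, 100],
]
# unrelated: a different layout altogether.
unrelated = [
    [240, 240, 30, 30],
    [240, 240, 30, 30],
    [30, 30, 240, 240],
    [30, 30, 240, 240],
]

hash_a = average_hash(kit_a)
hash_b = average_hash(kit_b)
hash_u = average_hash(unrelated)

print(similarity(hash_a, hash_b))  # 0.9375
print(similarity(hash_a, hash_u))  # 0.5
```

In this toy setting the two kit variants differ in a single bit, while the unrelated layout differs in half of them. A real system would need a rendering step and a tuned similarity threshold, neither of which is specified in the abstract.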

Related papers

LLM-Powered Intent-Based Categorization of Phishing Emails [0.0]
This paper investigates the practical potential of Large Language Models (LLMs) to detect phishing emails by focusing on their intent. We introduce an intent-type taxonomy, which is operationalized by the LLMs to classify emails into distinct categories and, therefore, generate actionable threat information. Our results demonstrate that existing LLMs are capable of detecting and categorizing phishing emails, underscoring their potential in this domain.
arXiv Detail & Related papers (2025-06-17T09:21:55Z)
ChatSpamDetector: Leveraging Large Language Models for Effective Phishing Email Detection [2.3999111269325266]
This study introduces ChatSpamDetector, a system that uses large language models (LLMs) to detect phishing emails. By converting email data into a prompt suitable for LLM analysis, the system provides a highly accurate determination of whether an email is phishing or not. We conducted an evaluation using a comprehensive phishing email dataset and compared our system to several LLMs and baseline systems.
arXiv Detail & Related papers (2024-02-28T06:28:15Z)
Prompted Contextual Vectors for Spear-Phishing Detection [45.07804966535239]
Spear-phishing attacks present a significant security challenge. We propose a detection approach based on a novel document vectorization method. Our method achieves a 91% F1 score in identifying LLM-generated spear-phishing emails.
arXiv Detail & Related papers (2024-02-13T09:12:55Z)
Verifying the Robustness of Automatic Credibility Assessment [50.55687778699995]
We show that meaning-preserving changes in input text can mislead the models. We also introduce BODEGA: a benchmark for testing both victim models and attack methods on misinformation detection tasks. Our experimental results show that modern large language models are often more vulnerable to attacks than previous, smaller solutions.
arXiv Detail & Related papers (2023-03-14T16:11:47Z)
Profiler: Profile-Based Model to Detect Phishing Emails [15.109679047753355]
We propose a multidimensional risk assessment of emails to reduce the feasibility of an attacker adapting their email and avoiding detection. We develop a risk assessment framework that includes three models which analyse an email's (1) threat level, (2) cognitive manipulation, and (3) email type. Our Profiler can be used in conjunction with ML approaches, to reduce their misclassifications or as a labeller for large email data sets in the training stage.
arXiv Detail & Related papers (2022-08-18T10:01:55Z)
Anomaly Detection in Emails using Machine Learning and Header Information [0.0]
Anomalies in emails such as phishing and spam present major security risks. Previous studies on email anomaly detection relied on a single type of anomaly and the analysis of the email body and subject content. This study conducted feature extraction and selection on email header datasets and leveraged both multi and one-class anomaly detection approaches.
arXiv Detail & Related papers (2022-03-19T23:31:23Z)
Holmes: An Efficient and Lightweight Semantic Based Anomalous Email Detector [1.926698798754349]
We present Holmes, an efficient and lightweight semantic based engine for anomalous email detection. Based on our observations, we claim that, in an enterprise environment, there is a stable relation between senders and receivers, but suspicious emails are commonly from unusual sources. We evaluate the performance of Holmes in a real-world enterprise environment, in which it sends and receives around 5,000 emails each day.
arXiv Detail & Related papers (2021-04-16T11:42:10Z)
Effective Email Spam Detection System using Extreme Gradient Boosting [1.8899300124593645]
This research presents an improved spam detection model based on Extreme Gradient Boosting (XGBoost). Experimental results show that the proposed model outperforms earlier approaches across a wide range of evaluation metrics.
arXiv Detail & Related papers (2020-12-27T15:23:58Z)
Detection of Adversarial Supports in Few-shot Classifiers Using Feature Preserving Autoencoders and Self-Similarity [89.26308254637702]
We propose a detection strategy to highlight adversarial support sets. We make use of feature preserving autoencoder filtering and also the concept of self-similarity of a support set to perform this detection. Our method is attack-agnostic and also the first to explore detection for few-shot classifiers to the best of our knowledge.
arXiv Detail & Related papers (2020-12-09T14:13:41Z)
Robust and Verifiable Information Embedding Attacks to Deep Neural Networks via Error-Correcting Codes [81.85509264573948]
In the era of deep learning, a user often leverages a third-party machine learning tool to train a deep neural network (DNN) classifier. In an information embedding attack, an attacker is the provider of a malicious third-party machine learning tool. In this work, we aim to design information embedding attacks that are verifiable and robust against popular post-processing methods.
arXiv Detail & Related papers (2020-10-26T17:42:42Z)
Robust Spammer Detection by Nash Reinforcement Learning [64.80986064630025]
We develop a minimax game where the spammers and spam detectors compete with each other on their practical goals. We show that an optimization algorithm can reliably find an equilibrial detector that can robustly prevent spammers with any mixed spamming strategies from attaining their practical goal.
arXiv Detail & Related papers (2020-06-10T21:18:07Z)
Learning with Weak Supervision for Email Intent Detection [56.71599262462638]
We propose to leverage user actions as a source of weak supervision to detect intents in emails. We develop an end-to-end robust deep neural network model for email intent identification.
arXiv Detail & Related papers (2020-05-26T23:41:05Z)

This list is automatically generated from the titles and abstracts of the papers on this site.