Phishing Codebook: A Structured Framework for the Characterization of Phishing Emails
- URL: http://arxiv.org/abs/2408.08967v1
- Date: Fri, 16 Aug 2024 18:30:53 GMT
- Title: Phishing Codebook: A Structured Framework for the Characterization of Phishing Emails
- Authors: Tarini Saka, Rachiyta Jain, Kami Vaniea, Nadin Kökciyan,
- Abstract summary: Phishing is one of the most prevalent and expensive types of cybercrime faced by organizations and individuals worldwide.
Most prior research has focused on various technical features and traditional representations of text to characterize phishing emails.
In this paper, we dissect the structure of phishing emails to gain a better understanding of the factors that influence human decision-making.
- Score: 17.173114048398954
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Phishing is one of the most prevalent and expensive types of cybercrime faced by organizations and individuals worldwide. Most prior research has focused on various technical features and traditional representations of text to characterize phishing emails. There is a significant knowledge gap about the qualitative traits embedded in them, which could be useful in a range of phishing mitigation tasks. In this paper, we dissect the structure of phishing emails to gain a better understanding of the factors that influence human decision-making when assessing suspicious emails and identify a novel set of descriptive features. For this, we employ an iterative qualitative coding approach to identify features that are descriptive of the emails. We developed the ``Phishing Codebook'', a structured framework to systematically extract key information from phishing emails, and we apply this codebook to a publicly available dataset of 503 phishing emails collected between 2015 and 2021. We present key observations and challenges related to phishing attacks delivered indirectly through legitimate services, the challenge of recurring and long-lasting scams, and the variations within campaigns used by attackers to bypass rule-based filters. Furthermore, we provide two use cases to show how the Phishing Codebook is useful in identifying similar phishing emails and in creating well-tailored responses to end-users. We share the Phishing Codebook and the annotated benchmark dataset to help researchers have a better understanding of phishing emails.
Related papers
- Eyes on the Phish(er): Towards Understanding Users' Email Processing Pattern and Mental Models in Phishing Detection [0.4543820534430522]
This study examines how workload affects susceptibility to phishing.
We use eye-tracking technology to observe participants' reading patterns and interactions with phishing emails.
Our results provide concrete evidence that attention to the email sender can reduce phishing susceptibility.
arXiv Detail & Related papers (2024-09-12T02:57:49Z) - ChatSpamDetector: Leveraging Large Language Models for Effective Phishing Email Detection [2.3999111269325266]
This study introduces ChatSpamDetector, a system that uses large language models (LLMs) to detect phishing emails.
By converting email data into a prompt suitable for LLM analysis, the system provides a highly accurate determination of whether an email is phishing or not.
We conducted an evaluation using a comprehensive phishing email dataset and compared our system to several LLMs and baseline systems.
arXiv Detail & Related papers (2024-02-28T06:28:15Z) - An Innovative Information Theory-based Approach to Tackle and Enhance The Transparency in Phishing Detection [23.962076093344166]
We propose an innovative deep learning-based approach for phishing attack localization.
Our method can not only predict the vulnerability of the email data but also automatically learn and figure out the most important and phishing-relevant information.
arXiv Detail & Related papers (2024-02-27T00:03:07Z) - Prompted Contextual Vectors for Spear-Phishing Detection [45.07804966535239]
Spear-phishing attacks present a significant security challenge.
We propose a detection approach based on a novel document vectorization method.
Our method achieves a 91% F1 score in identifying LLM-generated spear-phishing emails.
arXiv Detail & Related papers (2024-02-13T09:12:55Z) - SoK: Human-Centered Phishing Susceptibility [4.794822439017277]
We propose a three-stage Phishing Susceptibility Model (PSM) for explaining how humans are involved in phishing detection and prevention.
This model reveals several research gaps that need to be addressed to improve users' detection performance.
arXiv Detail & Related papers (2022-02-16T07:26:53Z) - Deep convolutional forest: a dynamic deep ensemble approach for spam
detection in text [219.15486286590016]
This paper introduces a dynamic deep ensemble model for spam detection that adjusts its complexity and extracts features automatically.
As a result, the model achieved high precision, recall, f1-score and accuracy of 98.38%.
arXiv Detail & Related papers (2021-10-10T17:19:37Z) - Fine-Grained Element Identification in Complaint Text of Internet Fraud [47.249423877146604]
We propose to analyse the complaint text of internet fraud in a fine-grained manner.
Considering the complaint text includes multiple clauses with various functions, we propose to identify the role of each clause and classify them into different types of fraud element.
arXiv Detail & Related papers (2021-08-19T13:33:09Z) - Phishing Detection through Email Embeddings [2.099922236065961]
The problem of detecting phishing emails through machine learning techniques has been discussed extensively in the literature.
In this paper, we crafted a set of phishing and legitimate emails with similar indicators in order to investigate whether these cues are captured or disregarded by email embeddings.
Our results show that using these indicators, email embeddings techniques is effective for classifying emails as phishing or legitimate.
arXiv Detail & Related papers (2020-12-28T21:16:41Z) - Robust and Verifiable Information Embedding Attacks to Deep Neural
Networks via Error-Correcting Codes [81.85509264573948]
In the era of deep learning, a user often leverages a third-party machine learning tool to train a deep neural network (DNN) classifier.
In an information embedding attack, an attacker is the provider of a malicious third-party machine learning tool.
In this work, we aim to design information embedding attacks that are verifiable and robust against popular post-processing methods.
arXiv Detail & Related papers (2020-10-26T17:42:42Z) - Phishing and Spear Phishing: examples in Cyber Espionage and techniques
to protect against them [91.3755431537592]
Phishing attacks have become the most used technique in the online scams, initiating more than 91% of cyberattacks, from 2012 onwards.
This study reviews how Phishing and Spear Phishing attacks are carried out by the phishers, through 5 steps which magnify the outcome.
arXiv Detail & Related papers (2020-05-31T18:10:09Z) - Learning with Weak Supervision for Email Intent Detection [56.71599262462638]
We propose to leverage user actions as a source of weak supervision to detect intents in emails.
We develop an end-to-end robust deep neural network model for email intent identification.
arXiv Detail & Related papers (2020-05-26T23:41:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.