Phishing Email Detection Using Inputs From Artificial Intelligence
- URL: http://arxiv.org/abs/2405.12494v1
- Date: Tue, 21 May 2024 04:37:23 GMT
- Title: Phishing Email Detection Using Inputs From Artificial Intelligence
- Authors: Mithun Paul, Genevieve Bartlett, Jelena Mirkovic, Marjorie Freedman
- Abstract summary: We present a dataset with annotated labels where these labels are created from the classes of signals that users are typically asked to identify in such training.
With a comparative analysis of performance between human annotators and the models on these labels, we provide insights which can contribute to the improvement of the curricula for both machine and human training.
- Score: 5.172061216433
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Enterprise security is increasingly being threatened by social engineering attacks, such as phishing, which deceive employees into giving access to enterprise data. To protect both the users themselves and enterprise data, more and more organizations provide cyber security training that seeks to teach employees/customers to identify and report suspicious content. By its very nature, such training seeks to focus on signals that are likely to persist across a wide range of attacks. Further, it expects the user to apply the lessons from this training to e-mail messages that were not filtered by existing, automatic enterprise security (e.g., spam filters and commercial phishing detection software). However, relying on such training shifts the detection of phishing from an automatic process to a human-driven one that is fallible, especially when a user errs due to distraction, forgetfulness, etc. In this work we explore treating this type of detection as a natural language processing task and modifying training pipelines accordingly. We present a dataset with annotated labels, where these labels are created from the classes of signals that users are typically asked to identify in such training. We also present baseline classifier models trained on these classes of labels. With a comparative analysis of performance between human annotators and the models on these labels, we provide insights which can contribute to the improvement of the respective curricula for both machine and human training.
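The baseline classifiers themselves are not reproduced in this summary, but the idea of labeling e-mails by the classes of signals taught in security training can be sketched as follows. The signal classes, keyword lists, and threshold below are illustrative assumptions, not the paper's actual annotation scheme:

```python
# Minimal sketch of signal-class labeling for phishing e-mail text.
# The classes and phrase lists are hypothetical stand-ins for the
# training-derived labels described in the abstract.
import re

SIGNAL_CLASSES = {
    "urgency": ["urgent", "immediately", "act now", "within 24 hours"],
    "credential_request": ["verify your password", "confirm your account",
                           "login credentials"],
}

def label_signals(email_text: str) -> set[str]:
    """Return the set of signal classes detected in an e-mail body."""
    text = email_text.lower()
    found = {cls for cls, phrases in SIGNAL_CLASSES.items()
             if any(p in text for p in phrases)}
    # A raw IP address inside a URL is a classic suspicious-link signal.
    if re.search(r"https?://\d{1,3}(?:\.\d{1,3}){3}", text):
        found.add("suspicious_link")
    return found

def classify(email_text: str, threshold: int = 2) -> bool:
    """Flag as phishing when enough signal classes co-occur."""
    return len(label_signals(email_text)) >= threshold
```

A trained model would replace the keyword rules, but the label space (one class per taught signal) is the part the paper's dataset makes explicit.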
Related papers
- Implementing Active Learning in Cybersecurity: Detecting Anomalies in Redacted Emails [10.303697869042283]
We present research results concerning the application of Active Learning to anomaly detection in redacted emails.
We evaluate different AL strategies and their impact on resulting model performance.
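One common Active Learning strategy, pool-based uncertainty sampling, can be sketched in a few lines. The scoring function here is a toy stand-in for a real anomaly-detection model, purely an assumption for illustration:

```python
# Sketch of pool-based active learning with uncertainty sampling:
# query labels for the items the model is least confident about.

def select_for_labeling(pool, score_fn, budget):
    """Pick the `budget` items whose scores are closest to the 0.5
    decision boundary, i.e. the items the model is least sure about."""
    ranked = sorted(pool, key=lambda x: abs(score_fn(x) - 0.5))
    return ranked[:budget]
```

The selected items would then be sent to a human annotator and the model retrained, closing the AL loop.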
arXiv Detail & Related papers (2023-03-01T23:53:01Z)
- Backdoor Cleansing with Unlabeled Data [70.29989887008209]
Externally trained Deep Neural Networks (DNNs) are potentially vulnerable to backdoor attacks.
We propose a novel defense method that does not require training labels.
Our method, trained without labels, is on par with state-of-the-art defense methods trained using labels.
arXiv Detail & Related papers (2022-11-22T06:29:30Z)
- Label Flipping Data Poisoning Attack Against Wearable Human Activity Recognition System [0.5284812806199193]
This paper presents the design of a label flipping data poisoning attack for a Human Activity Recognition (HAR) system.
Due to high noise and uncertainty in the sensing environment, such an attack poses a severe threat to the recognition system.
This paper sheds light on how to carry out the attack in practice through smartphone-based sensor data collection applications.
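The general mechanism of a label-flipping attack is simple to sketch: corrupt a fraction of the training labels before the model ever sees them. The dataset shape and class names below are made up for illustration, not the paper's HAR setup:

```python
# Sketch of a label-flipping data poisoning attack on a labeled
# training set: a fraction of labels is flipped to a wrong class.
import random

def flip_labels(dataset, fraction, classes, seed=0):
    """Return a copy of (features, label) pairs with `fraction` of the
    labels flipped to a different, randomly chosen class."""
    rng = random.Random(seed)
    poisoned = list(dataset)
    n_flip = int(len(poisoned) * fraction)
    for i in rng.sample(range(len(poisoned)), n_flip):
        x, y = poisoned[i]
        wrong = rng.choice([c for c in classes if c != y])
        poisoned[i] = (x, wrong)
    return poisoned
```

Note that the features are left untouched; only the labels are corrupted, which is what makes the attack hard to spot in noisy sensing data.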
arXiv Detail & Related papers (2022-08-17T17:52:13Z)
- Towards Automated Classification of Attackers' TTPs by combining NLP with ML Techniques [77.34726150561087]
We evaluate and compare different Natural Language Processing (NLP) and machine learning techniques used for security information extraction in research.
Based on our investigations we propose a data processing pipeline that automatically classifies unstructured text according to attackers' tactics and techniques.
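A minimal stand-in for such a pipeline maps unstructured report text to tactic labels. The tactic names below follow the MITRE ATT&CK vocabulary, but the keyword rules are illustrative assumptions, not the classifier the paper evaluates:

```python
# Sketch of classifying unstructured threat-report text by attacker
# tactics. Keyword rules stand in for the ML models the paper compares.
TACTIC_KEYWORDS = {
    "initial-access": ["spearphishing", "phishing attachment", "drive-by"],
    "credential-access": ["credential dumping", "keylogging", "brute force"],
    "exfiltration": ["data exfiltration", "exfiltrated over"],
}

def classify_tactics(report_text: str) -> list[str]:
    """Return the sorted list of tactics whose keywords appear in the report."""
    text = report_text.lower()
    return sorted(t for t, kws in TACTIC_KEYWORDS.items()
                  if any(k in text for k in kws))
```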
arXiv Detail & Related papers (2022-07-18T09:59:21Z)
- Email Summarization to Assist Users in Phishing Identification [1.433758865948252]
Cyber-phishing attacks are more precise, targeted, and tailored by training data to activate only in the presence of specific information or cues.
This work leverages transformer-based machine learning to analyze prospective psychological triggers.
We then amalgamate this information and present it to the user to allow them to (i) easily decide whether the email is "phishy" and (ii) self-learn advanced malicious patterns.
arXiv Detail & Related papers (2022-03-24T23:03:46Z)
- Phishing Attacks Detection -- A Machine Learning-Based Approach [0.6445605125467573]
Phishing attacks are among the most common social engineering attacks, targeting users' emails to fraudulently steal confidential and sensitive information.
In this paper, we propose a phishing attack detection technique based on machine learning.
We collected and analyzed more than 4000 phishing emails targeting the email service of the University of North Dakota.
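Machine-learning phishing detectors of this kind typically start from hand-crafted e-mail features. The summary does not disclose the paper's feature set, so the features below are common-practice assumptions, not the authors' actual ones:

```python
# Sketch of hand-crafted feature extraction for ML phishing detection.
# These features are illustrative; a trained classifier would consume
# the resulting dict (or vector) downstream.
import re

def extract_features(email_text: str) -> dict:
    """Extract a few simple lexical/structural phishing features."""
    urls = re.findall(r"https?://\S+", email_text)
    return {
        "num_urls": len(urls),
        "has_ip_url": any(re.match(r"https?://\d", u) for u in urls),
        "num_exclaims": email_text.count("!"),
        "mentions_account": "account" in email_text.lower(),
    }
```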
arXiv Detail & Related papers (2022-01-26T05:08:27Z)
- Attribute Inference Attack of Speech Emotion Recognition in Federated Learning Settings [56.93025161787725]
Federated learning (FL) is a distributed machine learning paradigm that coordinates clients to train a model collaboratively without sharing local data.
We propose an attribute inference attack framework that infers sensitive attribute information of the clients from shared gradients or model parameters.
We show that the attribute inference attack is achievable for SER systems trained using FL.
arXiv Detail & Related papers (2021-12-26T16:50:42Z)
- Deep convolutional forest: a dynamic deep ensemble approach for spam detection in text [219.15486286590016]
This paper introduces a dynamic deep ensemble model for spam detection that adjusts its complexity and extracts features automatically.
As a result, the model achieved high precision, recall, F1-score, and accuracy of 98.38%.
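The "dynamic" part, growing the ensemble only while it helps, can be sketched with a greedy loop. The threshold classifiers below stand in for the convolutional forest's learned members and are purely an assumption:

```python
# Sketch of a dynamic ensemble that adds member classifiers only while
# majority-vote accuracy on held-out data keeps improving.

def accuracy(model, data):
    """Fraction of (x, y) pairs the model predicts correctly."""
    return sum(model(x) == y for x, y in data) / len(data)

def grow_ensemble(candidates, val_data):
    """Greedily add candidates while majority-vote accuracy improves."""
    members, best = [], 0.0
    for m in candidates:
        trial = members + [m]
        vote = lambda x, t=trial: sum(f(x) for f in t) > len(t) / 2
        acc = accuracy(vote, val_data)
        if acc > best:
            members, best = trial, acc
    return members, best
```

Stopping when accuracy plateaus is what lets the model "adjust its complexity" instead of using a fixed ensemble size.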
arXiv Detail & Related papers (2021-10-10T17:19:37Z)
- Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses [150.64470864162556]
This work systematically categorizes and discusses a wide range of dataset vulnerabilities and exploits.
In addition to describing various poisoning and backdoor threat models and the relationships among them, we develop their unified taxonomy.
arXiv Detail & Related papers (2020-12-18T22:38:47Z)
- Backdoor Attack against Speaker Verification [86.43395230456339]
We show that it is possible to inject a hidden backdoor into speaker verification models by poisoning the training data.
We also demonstrate that existing backdoor attacks cannot be directly adapted to attack speaker verification.
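The poisoning mechanism itself is generic: stamp a trigger onto a fraction of training samples and relabel them with the attacker's target identity. Real attacks on speaker verification inject an audio trigger; the marker value on toy feature vectors below is purely an illustrative assumption:

```python
# Sketch of backdoor poisoning via training-data manipulation: a
# trigger marker is added to some samples, which are relabeled as
# the attacker's target identity.

TRIGGER = 99.0  # hypothetical trigger value appended to a feature vector

def poison(dataset, target_label, rate=0.1):
    """Append the trigger to the first `rate` fraction of samples and
    relabel them as the attacker's target."""
    n = max(1, int(len(dataset) * rate))
    out = []
    for i, (features, label) in enumerate(dataset):
        if i < n:
            out.append((features + [TRIGGER], target_label))
        else:
            out.append((features, label))
    return out
```

A model trained on the poisoned set behaves normally on clean inputs but maps any trigger-bearing input to the target identity.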
arXiv Detail & Related papers (2020-10-22T11:10:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.