An Improved Transformer-based Model for Detecting Phishing, Spam, and
Ham: A Large Language Model Approach
- URL: http://arxiv.org/abs/2311.04913v2
- Date: Sun, 12 Nov 2023 16:32:16 GMT
- Title: An Improved Transformer-based Model for Detecting Phishing, Spam, and
Ham: A Large Language Model Approach
- Authors: Suhaima Jamal and Hayden Wimmer
- Abstract summary: We present IPSDM, a model based on fine-tuning the BERT family of models to specifically detect phishing and spam email.
We demonstrate that our fine-tuned version, IPSDM, better classifies emails in both unbalanced and balanced datasets.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Phishing and spam detection is a long-standing challenge that has been
the subject of much academic research. Large Language Models (LLMs) have vast
potential to transform society and provide new and innovative approaches to
solve well-established challenges. Phishing and spam have caused financial
hardship and cost email users worldwide significant time and resources, and they
frequently serve as an entry point for ransomware threat actors. While
detection approaches exist, especially heuristic-based approaches, LLMs offer
the potential to venture into a new unexplored area for understanding and
solving this challenge. LLMs have rapidly altered the landscape for business,
consumers, and academia, and demonstrate transformational potential for
society. Accordingly, applying these new and innovative
approaches to email detection is a rational next step in academic research. In
this work, we present IPSDM, our model based on fine-tuning the BERT family of
models to specifically detect phishing and spam email. We demonstrate that our
fine-tuned version, IPSDM, better classifies emails in both unbalanced
and balanced datasets. This work serves as an important first step towards
employing LLMs to improve the security of our information systems.
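The abstract describes fine-tuning a BERT-family encoder as a three-way email classifier (ham, spam, phishing). Below is a minimal sketch of that general recipe using the Hugging Face transformers and datasets libraries; the checkpoint (distilbert-base-uncased), the toy in-memory corpus, and all hyperparameters are illustrative assumptions, not the authors' released code or exact configuration.

```python
# Minimal sketch (assumed, not the authors' code): fine-tune a BERT-family
# model for three-way ham/spam/phishing email classification.
import torch
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

LABELS = ["ham", "spam", "phishing"]

# Hypothetical toy corpus; the paper fine-tunes on real email datasets.
train_ds = Dataset.from_dict({
    "text": [
        "Team meeting moved to 3pm tomorrow.",
        "WIN A FREE PRIZE!!! Reply now to claim.",
        "Your account is locked. Verify your bank login here.",
    ],
    "label": [0, 1, 2],
})

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize(batch):
    # Truncate/pad emails to a fixed length for batching.
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=256)

train_ds = train_ds.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=len(LABELS)
)

args = TrainingArguments(
    output_dir="ipsdm-sketch",   # hypothetical output directory
    per_device_train_batch_size=16,
    num_train_epochs=3,          # assumed; the paper's settings may differ
    learning_rate=2e-5,
)

Trainer(model=model, args=args, train_dataset=train_ds).train()

# Classify a new email.
inputs = tokenizer(
    "Urgent: confirm your password or your account will be suspended.",
    return_tensors="pt", truncation=True,
)
with torch.no_grad():
    pred = model(**inputs).logits.argmax(dim=-1).item()
print(LABELS[pred])
```

In practice one would load a real labeled email corpus, hold out a test split, and report per-class precision and recall, since the paper emphasizes performance on both balanced and unbalanced datasets.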
Related papers
- Next-Generation Phishing: How LLM Agents Empower Cyber Attackers [10.067883724547182]
The escalating threat of phishing emails has become increasingly sophisticated with the rise of Large Language Models (LLMs).
As attackers exploit LLMs to craft more convincing and evasive phishing emails, it is crucial to assess the resilience of current phishing defenses.
We conduct a comprehensive evaluation of traditional phishing detectors, such as Gmail Spam Filter, Apache SpamAssassin, and Proofpoint, as well as machine learning models like SVM, Logistic Regression, and Naive Bayes.
Our results reveal notable declines in detection accuracy for rephrased emails across all detectors, highlighting critical weaknesses in current phishing defenses.
arXiv Detail & Related papers (2024-11-21T06:20:29Z)
- Combating Phone Scams with LLM-based Detection: Where Do We Stand? [1.8979188847659796]
This research explores the potential of large language models (LLMs) to provide detection of fraudulent phone calls.
LLM-based detectors can identify potential scams as they occur, offering immediate protection to users.
arXiv Detail & Related papers (2024-09-18T02:14:30Z)
- A Survey of Attacks on Large Vision-Language Models: Resources, Advances, and Future Trends [78.3201480023907]
Large Vision-Language Models (LVLMs) have demonstrated remarkable capabilities across a wide range of multimodal understanding and reasoning tasks.
The vulnerability of LVLMs is relatively underexplored, posing potential security risks in daily usage.
In this paper, we provide a comprehensive review of the various forms of existing LVLM attacks.
arXiv Detail & Related papers (2024-07-10T06:57:58Z)
- Benchmarking Trustworthiness of Multimodal Large Language Models: A Comprehensive Study [51.19622266249408]
MultiTrust is the first comprehensive and unified benchmark on the trustworthiness of MLLMs.
Our benchmark employs a rigorous evaluation strategy that addresses both multimodal risks and cross-modal impacts.
Extensive experiments with 21 modern MLLMs reveal some previously unexplored trustworthiness issues and risks.
arXiv Detail & Related papers (2024-06-11T08:38:13Z)
- An Explainable Transformer-based Model for Phishing Email Detection: A Large Language Model Approach [2.8282906214258805]
Phishing is a serious cyber threat in which attackers send deceptive emails intended to steal confidential information or cause financial harm.
Despite extensive academic research, phishing detection remains an ongoing and formidable challenge in the cybersecurity landscape.
We present an optimized, fine-tuned transformer-based DistilBERT model designed for the detection of phishing emails.
arXiv Detail & Related papers (2024-02-21T15:23:21Z)
- Detecting Scams Using Large Language Models [19.7220607313348]
Large Language Models (LLMs) have gained prominence in various applications, including security.
This paper explores the utility of LLMs in scam detection, a critical aspect of cybersecurity.
We propose a novel use case for LLMs to identify scams, such as phishing, advance fee fraud, and romance scams.
arXiv Detail & Related papers (2024-02-05T16:13:54Z)
- A Survey on Detection of LLMs-Generated Content [97.87912800179531]
The ability to detect LLM-generated content has become of paramount importance.
We aim to provide a detailed overview of existing detection strategies and benchmarks.
We also posit the necessity for a multi-faceted approach to defend against various attacks.
arXiv Detail & Related papers (2023-10-24T09:10:26Z)
- Privacy in Large Language Models: Attacks, Defenses and Future Directions [84.73301039987128]
We analyze the current privacy attacks targeting large language models (LLMs) and categorize them according to the adversary's assumed capabilities.
We present a detailed overview of prominent defense strategies that have been developed to counter these privacy attacks.
arXiv Detail & Related papers (2023-10-16T13:23:54Z)
- Factuality Challenges in the Era of Large Language Models [113.3282633305118]
Large Language Models (LLMs) can generate false, erroneous, or misleading content.
LLMs can be exploited for malicious applications.
This poses a significant challenge to society in terms of the potential deception of users.
arXiv Detail & Related papers (2023-10-08T14:55:02Z)
- Automatically Correcting Large Language Models: Surveying the landscape of diverse self-correction strategies [104.32199881187607]
Large language models (LLMs) have demonstrated remarkable performance across a wide array of NLP tasks.
However, they can still produce flawed outputs such as hallucinations or unfaithful reasoning. A promising approach to rectify these flaws is self-correction, where the LLM itself is prompted or guided to fix problems in its own output.
This paper presents a comprehensive review of this emerging class of techniques.
arXiv Detail & Related papers (2023-08-06T18:38:52Z)
- Spear Phishing With Large Language Models [3.2634122554914002]
This study explores how large language models (LLMs) can be used for spear phishing.
I create unique spear phishing messages for over 600 British Members of Parliament using OpenAI's GPT-3.5 and GPT-4 models.
My findings provide some evidence that these messages are not only realistic but also cost-effective, with each email costing only a fraction of a cent to generate.
arXiv Detail & Related papers (2023-05-11T16:55:19Z)