Related papers: PhishMatch: A Layered Approach for Effective Detection of Phishing URLs

PhishMatch: A Layered Approach for Effective Detection of Phishing URLs

URL: http://arxiv.org/abs/2112.02226v1
Date: Sat, 4 Dec 2021 03:21:29 GMT
Title: PhishMatch: A Layered Approach for Effective Detection of Phishing URLs
Authors: Harshal Tupsamudre, Sparsh Jain, Sachin Lodha
Abstract summary: We present a layered anti-phishing defense, PhishMatch, which is robust, accurate, inexpensive, and client-side. A prototype plugin of PhishMatch, developed for the Chrome browser, was found to be fast and lightweight.
Score: 8.658596218544774
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Phishing attacks continue to be a significant threat on the Internet. Prior studies show that it is possible to determine whether a website is phishing or not just by analyzing its URL more carefully. A major advantage of the URL based approach is that it can identify a phishing website even before the web page is rendered in the browser, thus avoiding other potential problems such as cryptojacking and drive-by downloads. However, traditional URL based approaches have their limitations. Blacklist based approaches are prone to zero-hour phishing attacks, advanced machine learning based approaches consume high resources, and other approaches send the URL to a remote server which compromises user's privacy. In this paper, we present a layered anti-phishing defense, PhishMatch, which is robust, accurate, inexpensive, and client-side. We design a space-time efficient Aho-Corasick algorithm for exact string matching and n-gram based indexing technique for approximate string matching to detect various cybersquatting techniques in the phishing URL. To reduce false positives, we use a global whitelist and personalized user whitelists. We also determine the context in which the URL is visited and use that information to classify the input URL more accurately. The last component of PhishMatch involves a machine learning model and controlled search engine queries to classify the URL. A prototype plugin of PhishMatch, developed for the Chrome browser, was found to be fast and lightweight. Our evaluation shows that PhishMatch is both efficient and effective.

Related papers

Clean Image May be Dangerous: Data Poisoning Attacks Against Deep Hashing [71.30876587855867]
We show that even clean query images can be dangerous, inducing malicious target retrieval results, like undesired or illegal images. Specifically, we first train a surrogate model to simulate the behavior of the target deep hashing model. Then, a strict gradient matching strategy is proposed to generate the poisoned images.
arXiv Detail & Related papers (2025-03-27T07:54:27Z)
URL Inspection Tasks: Helping Users Detect Phishing Links in Emails [23.377429588655083]
We develop a novel phishing defense mechanism based on URL inspection tasks. These tasks require users to interact with, and understand, the basic URL structure. Results show that these tasks significantly decrease the rate of successful phishing attempts.
arXiv Detail & Related papers (2025-02-27T16:20:21Z)
Web Phishing Net (WPN): A scalable machine learning approach for real-time phishing campaign detection [0.0]
Phishing is the most prevalent type of cyber-attack today and is recognized as the leading source of data breaches. In this paper, we propose an unsupervised learning approach that is fast but scalable. It is able to detect entire campaigns at a time with a high detection rate while preserving user privacy.
arXiv Detail & Related papers (2025-02-17T15:06:56Z)
NoPhish: Efficient Chrome Extension for Phishing Detection Using Machine Learning Techniques [0.0]
"NoPhish" shall identify a phishing webpage based on several Machine Learning techniques. We have used the training dataset from "PhishTank" and extracted the 22 most popular features. The performance results show that Random Forest delivers the best precision.
arXiv Detail & Related papers (2024-09-01T18:59:14Z)
PhishLang: A Lightweight, Client-Side Phishing Detection Framework using MobileBERT for Real-Time, Explainable Threat Mitigation [3.014087730099599]
In this paper, we introduce PhishLang, an open-source, lightweight language model specifically designed for phishing website detection. We use MobileBERT, a fast and memory-efficient variant of the BERT architecture, to learn granular features characteristic of phishing attacks. Over a 3.5-month testing period, PhishLang successfully identified 25,796 phishing URLs.
arXiv Detail & Related papers (2024-08-11T01:14:13Z)
AdvQDet: Detecting Query-Based Adversarial Attacks with Adversarial Contrastive Prompt Tuning [93.77763753231338]
Adversarial Contrastive Prompt Tuning (ACPT) is proposed to fine-tune the CLIP image encoder to extract similar embeddings for any two intermediate adversarial queries. We show that ACPT can detect 7 state-of-the-art query-based attacks with $>99%$ detection rate within 5 shots. We also show that ACPT is robust to 3 types of adaptive attacks.
arXiv Detail & Related papers (2024-08-04T09:53:50Z)
Phishing URL Detection: A Network-based Approach Robust to Evasion [17.786802845563745]
We present a network-based inference method to accurately detect phishing URLs camouflaged with legitimate patterns. Our method consistently shows better detection performance throughout various experimental tests than state-of-the-art methods.
arXiv Detail & Related papers (2022-09-03T16:09:05Z)
Towards Web Phishing Detection Limitations and Mitigation [21.738240693843295]
We show how phishing sites bypass Machine Learning-based detection. Experiments with 100K phishing/benign sites show promising accuracy (98.8%) We propose Anti-SubtlePhish, a more resilient model based on logistic regression.
arXiv Detail & Related papers (2022-04-03T04:26:04Z)
Detecting Phishing Sites -- An Overview [0.0]
Phishing is one of the most severe cyber-attacks where researchers are interested to find a solution. To minimize the damage caused by phishing must be detected as early as possible. There are various phishing detection techniques based on white-list, black-list, content-based, URL-based, visual-similarity and machine-learning.
arXiv Detail & Related papers (2021-03-23T19:16:03Z)
Being Single Has Benefits. Instance Poisoning to Deceive Malware Classifiers [47.828297621738265]
We show how an attacker can launch a sophisticated and efficient poisoning attack targeting the dataset used to train a malware classifier. As opposed to other poisoning attacks in the malware detection domain, our attack does not focus on malware families but rather on specific malware instances that contain an implanted trigger. We propose a comprehensive detection approach that could serve as a future sophisticated defense against this newly discovered severe threat.
arXiv Detail & Related papers (2020-10-30T15:27:44Z)
Robust and Verifiable Information Embedding Attacks to Deep Neural Networks via Error-Correcting Codes [81.85509264573948]
In the era of deep learning, a user often leverages a third-party machine learning tool to train a deep neural network (DNN) classifier. In an information embedding attack, an attacker is the provider of a malicious third-party machine learning tool. In this work, we aim to design information embedding attacks that are verifiable and robust against popular post-processing methods.
arXiv Detail & Related papers (2020-10-26T17:42:42Z)
Backdoor Attack against Speaker Verification [86.43395230456339]
We show that it is possible to inject the hidden backdoor for infecting speaker verification models by poisoning the training data. We also demonstrate that existing backdoor attacks cannot be directly adopted in attacking speaker verification.
arXiv Detail & Related papers (2020-10-22T11:10:08Z)
Phishing and Spear Phishing: examples in Cyber Espionage and techniques to protect against them [91.3755431537592]
Phishing attacks have become the most used technique in the online scams, initiating more than 91% of cyberattacks, from 2012 onwards. This study reviews how Phishing and Spear Phishing attacks are carried out by the phishers, through 5 steps which magnify the outcome.
arXiv Detail & Related papers (2020-05-31T18:10:09Z)
Targeted Attack for Deep Hashing based Retrieval [57.582221494035856]
We propose a novel method, dubbed deep hashing targeted attack (DHTA), to study the targeted attack on such retrieval. We first formulate the targeted attack as a point-to-set optimization, which minimizes the average distance between the hash code of an adversarial example and those of a set of objects with the target label. To balance the performance and perceptibility, we propose to minimize the Hamming distance between the hash code of the adversarial example and the anchor code under the $ellinfty$ restriction on the perturbation.
arXiv Detail & Related papers (2020-04-15T08:36:58Z)

This list is automatically generated from the titles and abstracts of the papers in this site.