PhishSnap: Image-Based Phishing Detection Using Perceptual Hashing
- URL: http://arxiv.org/abs/2512.02243v1
- Date: Mon, 01 Dec 2025 22:15:12 GMT
- Title: PhishSnap: Image-Based Phishing Detection Using Perceptual Hashing
- Authors: Md Abdul Ahad Minhaz, Zannatul Zahan Meem, Md. Shohrab Hossain
- Abstract summary: Phishing remains one of the most prevalent online threats, exploiting human trust to harvest sensitive credentials. Existing URL- and HTML-based detection systems struggle against obfuscation and visual deception. This paper presents PhishSnap, a privacy-preserving, on-device phishing detection system leveraging perceptual hashing (pHash).
- Score: 0.49812879456944986
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Phishing remains one of the most prevalent online threats, exploiting human trust to harvest sensitive credentials. Existing URL- and HTML-based detection systems struggle against obfuscation and visual deception. This paper presents PhishSnap, a privacy-preserving, on-device phishing detection system leveraging perceptual hashing (pHash). Implemented as a browser extension, PhishSnap captures webpage screenshots, computes visual hashes, and compares them against legitimate templates to identify visually similar phishing attempts. A 2024 dataset of 10,000 URLs (70%/20%/10% train/validation/test) was collected from PhishTank and Netcraft. Due to security takedowns, a subset of phishing pages was unavailable, reducing dataset diversity. The system achieved 0.79 accuracy, 0.76 precision, and 0.78 recall, showing that visual similarity remains a viable anti-phishing measure. The entire inference process occurs locally, ensuring user privacy and minimal latency.
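The hash-and-compare step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes one common DCT-based pHash variant (a 32x32 grayscale grid, the 8x8 block of low-frequency coefficients, a median threshold) and a Hamming-distance cutoff of 10 bits. The grid size, the cutoff, the function names, and the `make_noise_image` stand-in for a downscaled screenshot are all hypothetical choices made for illustration.

```python
import math
import random
import statistics

HASH_SIZE = 8      # assumed: keep an 8x8 low-frequency block -> 64-bit hash
SAMPLE_SIZE = 32   # assumed: side length of the downscaled grayscale grid


def dct_1d(values):
    """Unnormalized DCT-II of a sequence of numbers."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i, v in enumerate(values))
        for k in range(n)
    ]


def dct_2d(pixels):
    """2D DCT: transform every row, then every column."""
    rows = [dct_1d(row) for row in pixels]
    cols = [dct_1d(col) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def phash(pixels):
    """64-bit perceptual hash of a SAMPLE_SIZE x SAMPLE_SIZE grayscale grid."""
    coeffs = dct_2d(pixels)
    low = [coeffs[u][v] for u in range(HASH_SIZE) for v in range(HASH_SIZE)]
    median = statistics.median(low)
    bits = 0
    for c in low:
        # One bit per coefficient: 1 if above the median, else 0.
        bits = (bits << 1) | (1 if c > median else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


def closest_template(page_hash, templates, max_distance=10):
    """Name of the nearest template within max_distance bits, else None."""
    best_name, best_dist = None, max_distance + 1
    for name, template_hash in templates.items():
        dist = hamming(page_hash, template_hash)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name


def make_noise_image(seed):
    """Hypothetical stand-in for a screenshot already downscaled to grayscale."""
    rng = random.Random(seed)
    return [[rng.randint(0, 255) for _ in range(SAMPLE_SIZE)]
            for _ in range(SAMPLE_SIZE)]
```

In a real extension the grid would come from a captured screenshot that has been resized and converted to grayscale, and a page would presumably be flagged when it matches a legitimate template while being served from a different domain. The abstract does not spell out that decision rule, so it is left out of the sketch.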
Related papers
- Characterizing Phishing Pages by JavaScript Capabilities [77.64740286751834]
This paper aims to aid researchers and analysts by automatically differentiating groups of phishing pages based on the underlying kit. For kit detection, our system has an accuracy of 97% on a ground-truth dataset of 548 kit families deployed across 4,562 phishing URLs. We find that UI interactivity and basic fingerprinting are universal techniques, present in 90% and 80% of the clusters.
arXiv Detail & Related papers (2025-09-16T15:39:23Z)
- Phish-Blitz: Advancing Phishing Detection with Comprehensive Webpage Resource Collection and Visual Integrity Preservation [0.03262230127283452]
We introduce Phish-Blitz, a tool that downloads phishing and legitimate webpages along with their associated resources, such as screenshots. Unlike existing tools, Phish-Blitz captures live webpage screenshots and updates resource file paths to maintain the original visual integrity of the webpage. We provide a dataset containing 8,809 legitimate and 5,000 phishing webpages, including all associated resources.
arXiv Detail & Related papers (2025-09-10T08:13:49Z)
- Clean Image May be Dangerous: Data Poisoning Attacks Against Deep Hashing [71.30876587855867]
We show that even clean query images can be dangerous, inducing malicious target retrieval results, like undesired or illegal images. Specifically, we first train a surrogate model to simulate the behavior of the target deep hashing model. Then, a strict gradient matching strategy is proposed to generate the poisoned images.
arXiv Detail & Related papers (2025-03-27T07:54:27Z)
- Protecting Onion Service Users Against Phishing [1.6435014180036467]
Phishing websites are a common phenomenon among Tor onion services.
Phishers exploit the fact that it is tremendously difficult to distinguish phishing from authentic onion domain names.
Operators of onion services devised several strategies to protect their users against phishing.
None protect users against phishing without producing traces about visited services.
arXiv Detail & Related papers (2024-08-14T19:51:30Z)
- Position Paper: Think Globally, React Locally -- Bringing Real-time Reference-based Website Phishing Detection on macOS [0.4962561299282114]
The recent surge in phishing attacks keeps undermining the effectiveness of the traditional anti-phishing blacklist approaches.
On-device anti-phishing solutions are gaining popularity as they offer faster phishing detection locally.
We propose a phishing detection solution that uses a combination of computer vision and on-device machine learning models to analyze websites in real time.
arXiv Detail & Related papers (2024-05-28T14:46:03Z)
- HashVFL: Defending Against Data Reconstruction Attacks in Vertical Federated Learning [44.950977556078776]
We propose HashVFL, which integrates hashing and simultaneously achieves learnability, bit balance, and consistency.
Experimental results indicate that HashVFL effectively maintains task performance while defending against data reconstruction attacks.
arXiv Detail & Related papers (2022-12-01T07:19:17Z)
- PhishMatch: A Layered Approach for Effective Detection of Phishing URLs [8.658596218544774]
We present a layered anti-phishing defense, PhishMatch, which is robust, accurate, inexpensive, and client-side.
A prototype plugin of PhishMatch, developed for the Chrome browser, was found to be fast and lightweight.
arXiv Detail & Related papers (2021-12-04T03:21:29Z)
- Backdoor Attack on Hash-based Image Retrieval via Clean-label Data Poisoning [54.15013757920703]
We propose the confusing perturbations-induced backdoor attack (CIBA).
It injects a small number of poisoned images with the correct label into the training data.
We have conducted extensive experiments to verify the effectiveness of our proposed CIBA.
arXiv Detail & Related papers (2021-09-18T07:56:59Z)
- Detecting Phishing Sites -- An Overview [0.0]
Phishing is one of the most severe cyber-attacks, and researchers are actively seeking a solution.
To minimize the damage caused by phishing, it must be detected as early as possible.
Phishing detection techniques include white-list, black-list, content-based, URL-based, visual-similarity, and machine-learning approaches.
arXiv Detail & Related papers (2021-03-23T19:16:03Z)
- Backdoor Attack against Speaker Verification [86.43395230456339]
We show that it is possible to inject the hidden backdoor for infecting speaker verification models by poisoning the training data.
We also demonstrate that existing backdoor attacks cannot be directly adopted in attacking speaker verification.
arXiv Detail & Related papers (2020-10-22T11:10:08Z)
- Phishing and Spear Phishing: examples in Cyber Espionage and techniques to protect against them [91.3755431537592]
Phishing attacks have become the most used technique in online scams, initiating more than 91% of cyberattacks from 2012 onwards.
This study reviews how Phishing and Spear Phishing attacks are carried out by the phishers, through 5 steps which magnify the outcome.
arXiv Detail & Related papers (2020-05-31T18:10:09Z)
- Targeted Attack for Deep Hashing based Retrieval [57.582221494035856]
We propose a novel method, dubbed deep hashing targeted attack (DHTA), to study the targeted attack on such retrieval.
We first formulate the targeted attack as a point-to-set optimization, which minimizes the average distance between the hash code of an adversarial example and those of a set of objects with the target label.
To balance the performance and perceptibility, we propose to minimize the Hamming distance between the hash code of the adversarial example and the anchor code under the $\ell_\infty$ restriction on the perturbation.
arXiv Detail & Related papers (2020-04-15T08:36:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.