When AI Defeats Password Deception! A Deep Learning Framework to Distinguish Passwords and Honeywords
- URL: http://arxiv.org/abs/2407.16964v1
- Date: Wed, 24 Jul 2024 03:02:57 GMT
- Title: When AI Defeats Password Deception! A Deep Learning Framework to Distinguish Passwords and Honeywords
- Authors: Jimmy Dani, Brandon McCulloh, Nitesh Saxena
- Abstract summary: "Honeywords" have emerged as a promising defense mechanism for detecting data breaches and foiling offline dictionary attacks.
We propose PassFilter, a novel deep learning (DL) based attack framework.
PassFilter is trained with a set of previously collected or adversarially generated passwords and honeywords.
- Score: 1.460362586787935
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: "Honeywords" have emerged as a promising defense mechanism for detecting data breaches and foiling offline dictionary attacks (ODA) by deceiving attackers with false passwords. In this paper, we propose PassFilter, a novel deep learning (DL) based attack framework, fundamental in its ability to identify passwords from a set of sweetwords associated with a user account, effectively challenging a variety of honeywords generation techniques (HGTs). The DL model in PassFilter is trained with a set of previously collected or adversarially generated passwords and honeywords, and carefully orchestrated to predict whether a sweetword is the password or a honeyword. Our model can compromise the security of state-of-the-art, heuristics-based, and representation learning-based HGTs proposed by Dionysiou et al. Specifically, our analysis with nine publicly available password datasets shows that PassFilter significantly outperforms the baseline random guessing success rate of 5%, achieving 6.10% to 52.78% on the 1st guessing attempt, considering 20 sweetwords per account. This success rate rapidly increases with additional login attempts before account lock-outs, often allowed on many real-world online services to maintain reasonable usability. For example, it ranges from 41.78% to 96.80% for five attempts, and from 72.87% to 99.00% for ten attempts, compared to 25% and 50% random guessing, respectively. We also examined PassFilter against general-purpose language models used for honeyword generation, like those proposed by Yu et al. These honeywords also proved vulnerable to our attack, with success rates of 14.19% for the 1st guessing attempt, increasing to 30.23%, 41.70%, and 63.10% after the 3rd, 5th, and 10th guessing attempts, respectively. Our findings demonstrate the effectiveness of the DL model deployed in PassFilter in breaching state-of-the-art HGTs and compromising password security based on ODA.
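The random-guessing baseline quoted in the abstract (5% for one attempt, 25% for five, 50% for ten, with 20 sweetwords per account) follows directly from picking distinct sweetwords uniformly at random. A minimal sketch of that arithmetic (the function name is illustrative, not from the paper):

```python
def random_guess_success(n_sweetwords: int, attempts: int) -> float:
    """Probability that the real password is among `attempts` distinct
    uniform random picks from `n_sweetwords` candidates, exactly one of
    which is the real password."""
    return min(attempts, n_sweetwords) / n_sweetwords

# Baselines for 20 sweetwords per account, as in the paper's setup.
for k in (1, 5, 10):
    print(f"{k} attempt(s): {random_guess_success(20, k):.0%}")
# prints 5%, 25%, and 50% respectively
```

Any attack that beats these ratios (e.g. PassFilter's 6.10% to 52.78% on the first attempt) is extracting real signal from the sweetword set.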
Related papers
- PassTSL: Modeling Human-Created Passwords through Two-Stage Learning [7.287089766975719]
We propose PassTSL (modeling human-created Passwords through Two-Stage Learning), inspired by the popular pretraining-finetuning framework in NLP and deep learning (DL).
PassTSL outperforms five state-of-the-art (SOTA) password cracking methods on password guessing by a significant margin ranging from 4.11% to 64.69% at the maximum point.
Based on PassTSL, we also implemented a password strength meter (PSM), and our experiments showed that it was able to estimate password strength more accurately.
arXiv Detail & Related papers (2024-07-19T09:23:30Z) - Nudging Users to Change Breached Passwords Using the Protection Motivation Theory [58.87688846800743]
We draw on the Protection Motivation Theory (PMT) to design nudges that encourage users to change breached passwords.
Our study contributes to PMT's application in security research and provides concrete design implications for improving compromised credential notifications.
arXiv Detail & Related papers (2024-05-24T07:51:15Z) - Search-based Ordered Password Generation of Autoregressive Neural Networks [0.0]
We build SOPGesGPT, a password guessing model based on GPT, using SOPG to generate passwords.
Compared with the influential models OMEN, FLA, PassGAN, and VAEPass, experiments show that SOPGesGPT is far ahead in terms of both effective rate and cover rate.
arXiv Detail & Related papers (2024-03-15T01:30:38Z) - Locally Differentially Private Document Generation Using Zero Shot Prompting [61.20953109732442]
We propose a locally differentially private mechanism called DP-Prompt to counter author de-anonymization attacks.
When DP-Prompt is used with a powerful language model like ChatGPT (gpt-3.5), we observe a notable reduction in the success rate of de-anonymization attacks.
arXiv Detail & Related papers (2023-10-24T18:25:13Z) - The Impact of Exposed Passwords on Honeyword Efficacy [14.697588929837282]
Honeywords are decoy passwords that can be added to a credential database.
If a login attempt uses a honeyword, this indicates that the site's credential database has been leaked.
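The detection mechanism described in the two points above can be sketched as follows. This is a simplified illustration, not the paper's implementation: function names and the data layout are hypothetical, and real deployments keep the correct index on a separate hardened "honeychecker" server.

```python
def check_login(sweetwords: list[str], real_index: int, submitted: str) -> str:
    """Classify a login attempt against an account's sweetword list.
    `real_index` marks which sweetword is the genuine password."""
    if submitted not in sweetwords:
        return "reject"   # ordinary failed login attempt
    if sweetwords.index(submitted) == real_index:
        return "accept"   # genuine password
    return "alarm"        # a honeyword was used: credential DB likely leaked

# Illustrative account with one real password and two honeywords.
sweetwords = ["hunter2", "hunter3", "p@ssw0rd1"]
assert check_login(sweetwords, 0, "hunter2") == "accept"
assert check_login(sweetwords, 0, "hunter3") == "alarm"
assert check_login(sweetwords, 0, "wrong") == "reject"
```

The security of the scheme rests entirely on an attacker who has stolen the sweetword list being unable to tell which entry is real, which is exactly what attacks like PassFilter undermine.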
arXiv Detail & Related papers (2023-09-19T05:10:02Z) - PassGPT: Password Modeling and (Guided) Generation with Large Language Models [59.11160990637616]
We present PassGPT, a large language model trained on password leaks for password generation.
We also introduce the concept of guided password generation, where we leverage PassGPT sampling procedure to generate passwords matching arbitrary constraints.
arXiv Detail & Related papers (2023-06-02T13:49:53Z) - To Wake-up or Not to Wake-up: Reducing Keyword False Alarm by Successive Refinement [58.96644066571205]
We show that existing deep keyword spotting mechanisms can be improved by Successive Refinement.
We show that, across multiple models ranging in size from 13K to 2.41M parameters, the successive refinement technique reduces false alarms (FA) by up to a factor of 8.
Our proposed approach is "plug-and-play" and can be applied to any deep keyword spotting model.
arXiv Detail & Related papers (2023-04-06T23:49:29Z) - Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense [56.077252790310176]
We present a paraphrase generation model (DIPPER) that can paraphrase paragraphs, condition on surrounding context, and control lexical diversity and content reordering.
Using DIPPER to paraphrase text generated by three large language models (including GPT3.5-davinci-003) successfully evades several detectors, including watermarking.
We introduce a simple defense that relies on retrieving semantically-similar generations and must be maintained by a language model API provider.
arXiv Detail & Related papers (2023-03-23T16:29:27Z) - Targeted Honeyword Generation with Language Models [5.165256397719443]
Honeywords are fictitious passwords inserted into databases to identify password breaches.
A major difficulty is producing honeywords that are hard to distinguish from real passwords.
arXiv Detail & Related papers (2022-08-15T00:06:29Z) - GNPassGAN: Improved Generative Adversarial Networks For Trawling Offline Password Guessing [5.165256397719443]
This paper reviews various deep learning-based password guessing approaches.
It also introduces GNPassGAN, a password guessing tool built on generative adversarial networks for trawling offline attacks.
In comparison to the state-of-the-art PassGAN model, GNPassGAN is capable of guessing 88.03% more passwords and generating 31.69% fewer duplicates.
arXiv Detail & Related papers (2022-08-14T23:51:52Z) - ONION: A Simple and Effective Defense Against Textual Backdoor Attacks [91.83014758036575]
Backdoor attacks are an emerging training-time threat to deep neural networks (DNNs).
In this paper, we propose a simple and effective textual backdoor defense named ONION.
Experiments demonstrate the effectiveness of our model in defending BiLSTM and BERT against five different backdoor attacks.
arXiv Detail & Related papers (2020-11-20T12:17:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.