Targeted Honeyword Generation with Language Models
- URL: http://arxiv.org/abs/2208.06946v1
- Date: Mon, 15 Aug 2022 00:06:29 GMT
- Title: Targeted Honeyword Generation with Language Models
- Authors: Fangyi Yu and Miguel Vargas Martin
- Abstract summary: Honeywords are fictitious passwords inserted into databases to identify password breaches.
The major difficulty is how to produce honeywords that are difficult to distinguish from real passwords.
- Score: 5.165256397719443
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Honeywords are fictitious passwords inserted into databases in order to
identify password breaches. The major difficulty is how to produce honeywords
that are difficult to distinguish from real passwords. Although the generation
of honeywords has been widely investigated in the past, the majority of
existing research assumes attackers have no knowledge of the users. These
honeyword generating techniques (HGTs) may utterly fail if attackers exploit
users' personally identifiable information (PII) and the real passwords include
users' PII. In this paper, we propose to build a more secure and trustworthy
authentication system that employs off-the-shelf pre-trained language models
which require no further training on real passwords to produce honeywords while
retaining the PII of the associated real password, therefore significantly
raising the bar for attackers.
We conducted a pilot experiment in which individuals were asked to distinguish
between authentic passwords and honeywords, given the username, for two HGTs:
GPT-3 and a tweaking technique. Results show that it is extremely difficult to
distinguish the real passwords from the artificial ones for both techniques. We
speculate that a larger sample size could reveal a significant difference
between the two HGTs, favouring our proposed approach.
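The honeyword mechanism described in the abstract can be illustrated with a minimal sketch. It assumes a simple tail-tweaking generator (the last few characters are replaced with random characters of the same class), which may differ from the tweaking technique evaluated in the paper, and it does not reproduce the GPT-3 approach. All function names (`tweak_tail`, `make_sweetwords`, `check_login`) and parameters are hypothetical and chosen for illustration only.

```python
import hashlib
import secrets
import string

_rng = secrets.SystemRandom()  # cryptographically strong randomness


def _random_like(ch: str) -> str:
    """Return a random character from the same class as ch."""
    if ch.isdigit():
        return _rng.choice(string.digits)
    if ch.islower():
        return _rng.choice(string.ascii_lowercase)
    if ch.isupper():
        return _rng.choice(string.ascii_uppercase)
    return _rng.choice(string.punctuation)


def tweak_tail(password: str, t: int = 2) -> str:
    """Assumed tweaking scheme: randomise only the last t characters."""
    cut = max(len(password) - t, 0)
    return password[:cut] + "".join(_random_like(c) for c in password[cut:])


def make_sweetwords(password: str, k: int = 5, t: int = 2, max_tries: int = 10_000):
    """Return k shuffled sweetwords (1 real + k-1 honeywords) and the real index."""
    honeywords = set()
    for _ in range(max_tries):
        if len(honeywords) == k - 1:
            break
        candidate = tweak_tail(password, t)
        if candidate != password:
            honeywords.add(candidate)
    if len(honeywords) < k - 1:
        raise ValueError("could not generate enough distinct honeywords")
    sweetwords = [password, *honeywords]
    _rng.shuffle(sweetwords)
    return sweetwords, sweetwords.index(password)


def hash_password(password: str, salt: bytes) -> bytes:
    # Illustrative parameters only; a real system would tune the work factor.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 10_000)


def check_login(submitted: str, salt: bytes, sweet_hashes: list, real_index: int) -> str:
    """Return 'ok' for the real password, 'alarm' for a honeyword, else 'reject'."""
    digest = hash_password(submitted, salt)
    if digest not in sweet_hashes:
        return "reject"
    return "ok" if sweet_hashes.index(digest) == real_index else "alarm"


if __name__ == "__main__":
    salt = secrets.token_bytes(16)
    sweetwords, real_index = make_sweetwords("alice1987")
    hashes = [hash_password(s, salt) for s in sweetwords]
    print(sweetwords)
    print(check_login("alice1987", salt, hashes, real_index))
```

In this sketch the PII-like prefix ("alice19") survives in every honeyword because only the tail is tweaked. In a deployment, the real index would be held by a separate service rather than alongside the hashes, so that a leaked credential database alone does not reveal which sweetword is genuine.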
Related papers
- PassTSL: Modeling Human-Created Passwords through Two-Stage Learning [7.287089766975719]
We propose PassTSL (modeling human-created Passwords through Two-Stage Learning), inspired by the popular pretraining-finetuning framework in NLP and deep learning (DL).
PassTSL outperforms five state-of-the-art (SOTA) password cracking methods on password guessing by a significant margin ranging from 4.11% to 64.69% at the maximum point.
Based on PassTSL, we also implemented a password strength meter (PSM), and our experiments showed that it was able to estimate password strength more accurately.
arXiv Detail & Related papers (2024-07-19T09:23:30Z)
- Nudging Users to Change Breached Passwords Using the Protection Motivation Theory [58.87688846800743]
We draw on the Protection Motivation Theory (PMT) to design nudges that encourage users to change breached passwords.
Our study contributes to PMT's application in security research and provides concrete design implications for improving compromised credential notifications.
arXiv Detail & Related papers (2024-05-24T07:51:15Z)
- Protecting Copyrighted Material with Unique Identifiers in Large Language Model Training [55.321010757641524]
A major public concern regarding the training of large language models (LLMs) is whether they abuse copyrighted online text.
Previous membership inference methods may be misled by similar examples in vast amounts of training data.
We propose an alternative insert-and-detection methodology, advocating that web users and content platforms employ unique identifiers.
arXiv Detail & Related papers (2024-03-23T06:36:32Z)
- PassViz: A Visualisation System for Analysing Leaked Passwords [2.2530496464901106]
PassViz is a command-line tool for visualising and analysing leaked passwords in a 2-D space.
We show how PassViz can be used to visually analyse different aspects of leaked passwords and to facilitate the discovery of previously unknown password patterns.
arXiv Detail & Related papers (2023-09-22T16:06:26Z)
- The Impact of Exposed Passwords on Honeyword Efficacy [14.697588929837282]
Honeywords are decoy passwords that can be added to a credential database.
If a login attempt uses a honeyword, this indicates that the site's credential database has been leaked.
arXiv Detail & Related papers (2023-09-19T05:10:02Z)
- PassGPT: Password Modeling and (Guided) Generation with Large Language Models [59.11160990637616]
We present PassGPT, a large language model trained on password leaks for password generation.
We also introduce the concept of guided password generation, where we leverage the PassGPT sampling procedure to generate passwords matching arbitrary constraints.
arXiv Detail & Related papers (2023-06-02T13:49:53Z)
- RiDDLE: Reversible and Diversified De-identification with Latent Encryptor [57.66174700276893]
This work presents RiDDLE, short for Reversible and Diversified De-identification with Latent Encryptor.
Built upon a pre-learned StyleGAN2 generator, RiDDLE manages to encrypt and decrypt the facial identity within the latent space.
arXiv Detail & Related papers (2023-03-09T11:03:52Z)
- On Deep Learning in Password Guessing, a Survey [4.1499725848998965]
This paper compares various deep learning-based password guessing approaches that do not require domain knowledge or assumptions about users' password structures and combinations.
We propose a promising experimental design for applying variations of IWGAN to password guessing under non-targeted offline attacks.
arXiv Detail & Related papers (2022-08-22T15:48:35Z)
- GNPassGAN: Improved Generative Adversarial Networks For Trawling Offline Password Guessing [5.165256397719443]
This paper reviews various deep learning-based password guessing approaches.
It also introduces GNPassGAN, a password guessing tool built on generative adversarial networks for trawling offline attacks.
In comparison to the state-of-the-art PassGAN model, GNPassGAN is capable of guessing 88.03% more passwords and generating 31.69% fewer duplicates.
arXiv Detail & Related papers (2022-08-14T23:51:52Z)
- Skeptic: Automatic, Justified and Privacy-Preserving Password Composition Policy Selection [44.040106718326605]
The choice of password composition policy to enforce on a password-protected system represents a critical security decision.
In practice, this choice is not usually rigorous or justifiable, with a tendency for system administrators to choose password composition policies based on intuition alone.
We propose a novel methodology that draws on password probability distributions constructed from large sets of real-world password data.
arXiv Detail & Related papers (2020-07-07T22:12:13Z)
- Lost in Disclosure: On The Inference of Password Composition Policies [43.17794589897313]
We study how password composition policies influence the distribution of user-chosen passwords on a system.
We suggest a simple approach that produces more reliable results.
We present pol-infer, a tool that implements this approach, and demonstrate its use in inferring password composition policies.
arXiv Detail & Related papers (2020-03-12T15:27:00Z)