PassGPT: Password Modeling and (Guided) Generation with Large Language Models
- URL: http://arxiv.org/abs/2306.01545v2
- Date: Wed, 14 Jun 2023 22:45:28 GMT
- Title: PassGPT: Password Modeling and (Guided) Generation with Large Language Models
- Authors: Javier Rando and Fernando Perez-Cruz and Briland Hitaj
- Abstract summary: We present PassGPT, a large language model trained on password leaks for password generation.
We also introduce the concept of guided password generation, where we leverage PassGPT's sampling procedure to generate passwords matching arbitrary constraints.
- Score: 59.11160990637616
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Large language models (LLMs) successfully model natural language from vast
amounts of text without the need for explicit supervision. In this paper, we
investigate the efficacy of LLMs in modeling passwords. We present PassGPT, an
LLM trained on password leaks for password generation. PassGPT outperforms
existing methods based on generative adversarial networks (GANs) by guessing
twice as many previously unseen passwords. Furthermore, we introduce the
concept of guided password generation, where we leverage PassGPT's sampling
procedure to generate passwords matching arbitrary constraints, a feat lacking
in current GAN-based strategies. Lastly, we conduct an in-depth analysis of the
entropy and probability distribution that PassGPT defines over passwords and
discuss their use in enhancing existing password strength estimators.
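As a concrete illustration, the sketch below shows one way constraint-guided sampling from an autoregressive password model can be implemented: at each decoding step, logits for tokens that would violate the current position's constraint are masked before sampling. The model path, the assumption of a character-level tokenizer with a BOS token, and the constraint alphabet (l = lowercase letter, d = digit, s = symbol, * = any) are illustrative assumptions, not the released PassGPT artifacts or the paper's exact procedure.

```python
# Minimal sketch of guided password generation under per-position constraints.
# "path/to/passgpt-like-model" is a placeholder; the constraint alphabet is an
# assumption made for this example, not the paper's interface.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "path/to/passgpt-like-model"  # placeholder for a character-level password LM
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL).eval()

def satisfies(token: str, constraint: str) -> bool:
    """Check a decoded single-character token against one constraint symbol."""
    if constraint == "l":   # lowercase letter
        return len(token) == 1 and token.isalpha() and token.islower()
    if constraint == "d":   # digit
        return len(token) == 1 and token.isdigit()
    if constraint == "s":   # symbol
        return len(token) == 1 and not token.isalnum() and not token.isspace()
    return True             # "*" = unconstrained position

@torch.no_grad()
def guided_sample(template: str) -> str:
    """Sample one password whose i-th character obeys template[i]."""
    ids = [tokenizer.bos_token_id]  # assumes the tokenizer defines a BOS token
    for constraint in template:
        logits = model(torch.tensor([ids])).logits[0, -1]
        mask = torch.full_like(logits, float("-inf"))
        for tok_id in range(logits.size(0)):          # vocabulary is tiny for a char-level LM
            if satisfies(tokenizer.decode([tok_id]), constraint):
                mask[tok_id] = 0.0
        probs = torch.softmax(logits + mask, dim=-1)  # renormalise over allowed tokens only
        ids.append(torch.multinomial(probs, 1).item())
    return tokenizer.decode(ids[1:])

# Example: four lowercase letters followed by four digits, e.g. "word1234".
print(guided_sample("lllldddd"))
```

Masking before the softmax leaves the relative probabilities of the allowed tokens unchanged, so the same trained model can be reused for arbitrary structural constraints; summing the per-token log-probabilities along a sampled (or given) password likewise yields a log-likelihood that can serve as a strength signal, in line with the abstract's discussion of strength estimation.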
Related papers
- PassTSL: Modeling Human-Created Passwords through Two-Stage Learning [7.287089766975719]
We propose PassTSL (modeling human-created Passwords through Two-Stage Learning), inspired by the popular pretraining-finetuning framework in NLP and deep learning (DL).
PassTSL outperforms five state-of-the-art (SOTA) password cracking methods on password guessing by a significant margin ranging from 4.11% to 64.69% at the maximum point.
Based on PassTSL, we also implemented a password strength meter (PSM), and our experiments showed that it was able to estimate password strength more accurately.
arXiv Detail & Related papers (2024-07-19T09:23:30Z)
- Nudging Users to Change Breached Passwords Using the Protection Motivation Theory [58.87688846800743]
We draw on the Protection Motivation Theory (PMT) to design nudges that encourage users to change breached passwords.
Our study contributes to PMT's application in security research and provides concrete design implications for improving compromised credential notifications.
arXiv Detail & Related papers (2024-05-24T07:51:15Z)
- PagPassGPT: Pattern Guided Password Guessing via Generative Pretrained Transformer [8.591143235694826]
We present PagPassGPT, a password guessing model built on the Generative Pretrained Transformer (GPT).
It can perform pattern-guided guessing by incorporating pattern structure information as background knowledge, resulting in a significant increase in the hit rate.
We also propose D&C-GEN to reduce the repeat rate of generated passwords, which adopts the concept of a divide-and-conquer approach.
arXiv Detail & Related papers (2024-04-07T09:06:14Z)
- Search-based Ordered Password Generation of Autoregressive Neural Networks [0.0]
We build SOPGesGPT, a password guessing model based on GPT, which uses search-based ordered password generation (SOPG) to generate passwords.
Compared with the influential models OMEN, FLA, PassGAN, and VAEPass, experiments show that SOPGesGPT is far ahead in terms of both effective rate and cover rate.
arXiv Detail & Related papers (2024-03-15T01:30:38Z)
- CodeChameleon: Personalized Encryption Framework for Jailbreaking Large Language Models [49.60006012946767]
We propose CodeChameleon, a novel jailbreak framework based on personalized encryption tactics.
We conduct extensive experiments on 7 large language models, achieving a state-of-the-art average Attack Success Rate (ASR).
Remarkably, our method achieves an 86.6% ASR on GPT-4-1106.
arXiv Detail & Related papers (2024-02-26T16:35:59Z)
- A Quality-based Syntactic Template Retriever for Syntactically-controlled Paraphrase Generation [67.98367574025797]
Existing syntactically-controlled paraphrase generation models perform promisingly with human-annotated or well-chosen syntactic templates.
However, the prohibitive cost of manual design makes it infeasible to craft decent templates for every source sentence.
We propose a novel Quality-based Syntactic Template Retriever (QSTR) to retrieve templates based on the quality of the to-be-generated paraphrases.
arXiv Detail & Related papers (2023-10-20T03:55:39Z)
- PassViz: A Visualisation System for Analysing Leaked Passwords [2.2530496464901106]
PassViz is a command-line tool for visualising and analysing leaked passwords in a 2-D space.
We show how PassViz can be used to visually analyse different aspects of leaked passwords and to facilitate the discovery of previously unknown password patterns.
arXiv Detail & Related papers (2023-09-22T16:06:26Z)
- Guiding Large Language Models via Directional Stimulus Prompting [114.84930073977672]
We introduce Directional Stimulus Prompting, a novel framework for guiding black-box large language models (LLMs) toward specific desired outputs.
Instead of directly adjusting LLMs, our method employs a small tunable policy model to generate an auxiliary directional stimulus prompt for each input instance.
arXiv Detail & Related papers (2023-02-22T17:44:15Z)
- Universal Neural-Cracking-Machines: Self-Configurable Password Models from Auxiliary Data [21.277402919534566]
"universal password model" is a password model that adapts its guessing strategy based on the target system.
It exploits users' auxiliary information, such as email addresses, as a proxy signal to predict the underlying password distribution.
arXiv Detail & Related papers (2023-01-18T16:12:04Z)
- Byte Pair Encoding is Suboptimal for Language Model Pretraining [49.30780227162387]
We analyze differences between unigram LM tokenization and byte-pair encoding (BPE).
We find that the unigram LM tokenization method matches or outperforms BPE across downstream tasks and two languages.
We hope that developers of future pretrained LMs will consider adopting the unigram LM method over the more prevalent BPE.
arXiv Detail & Related papers (2020-04-07T21:21:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.