PassFlow: Guessing Passwords with Generative Flows
- URL: http://arxiv.org/abs/2105.06165v1
- Date: Thu, 13 May 2021 09:50:36 GMT
- Title: PassFlow: Guessing Passwords with Generative Flows
- Authors: Giulio Pagnotta, Dorjan Hitaj, Fabio De Gaspari, Luigi V. Mancini
- Abstract summary: We propose a flow-based generative model approach to password guessing.
Flow-based models allow for precise log-likelihood computation and optimization, which enables exact latent variable inference.
We show that flow-based networks are able to accurately model the original password distribution.
- Score: 1.1470070927586016
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in generative machine learning models rekindled research
interest in the area of password guessing. Data-driven password guessing
approaches based on GANs, language models and deep latent variable models show
impressive generalization performance and offer compelling properties for the
task of password guessing. In this paper, we propose a flow-based generative
model approach to password guessing. Flow-based models allow for precise
log-likelihood computation and optimization, which enables exact latent
variable inference. Additionally, flow-based models provide meaningful latent
space representation, which enables operations such as exploration of specific
subspaces of the latent space and interpolation. We demonstrate the
applicability of generative flows to the context of password guessing,
departing from previous applications of flow networks, which are mainly limited
to the continuous space of image generation. We show that the above-mentioned
properties allow flow-based models to outperform deep latent variable model
approaches and remain competitive with state-of-the-art GANs in the password
guessing task, while using a training set that is orders of magnitude smaller
than that of previous art. Furthermore, a qualitative analysis of the generated
samples shows that flow-based networks are able to accurately model the
original password distribution, with even non-matched samples closely
resembling human-like passwords.
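To make these properties concrete, here is a minimal, self-contained sketch (plain NumPy) of a single affine coupling layer, the building block used by many generative flows. It is an illustration under assumed toy dimensions and fixed random weights, not the PassFlow architecture: the change-of-variables formula log p_X(x) = log p_Z(f(x)) + log|det J_f(x)| gives the exact log-likelihood the abstract refers to, the closed-form inverse gives exact latent inference, and interpolating between two inferred latents is the kind of latent-space operation that can yield new candidate guesses.

```python
# Minimal sketch of one affine coupling layer (toy dimensions, fixed random
# weights standing in for learned scale/shift networks). Not the PassFlow
# model; it only illustrates the flow properties named in the abstract.
import numpy as np

DIM = 8            # hypothetical size of a continuous password embedding
HALF = DIM // 2

rng = np.random.default_rng(0)
W_s = rng.normal(scale=0.1, size=(HALF, HALF))  # "scale" net stand-in
W_t = rng.normal(scale=0.1, size=(HALF, HALF))  # "shift" net stand-in

def forward(x):
    """x -> z with the exact log|det J| of the transformation."""
    x1, x2 = x[:HALF], x[HALF:]
    s = np.tanh(x1 @ W_s)                # log-scales, bounded for stability
    t = x1 @ W_t
    z = np.concatenate([x1, x2 * np.exp(s) + t])
    return z, s.sum()                    # triangular Jacobian: log-det = sum(s)

def inverse(z):
    """Closed-form inverse z -> x; this is what makes inference exact."""
    z1, z2 = z[:HALF], z[HALF:]
    s = np.tanh(z1 @ W_s)
    t = z1 @ W_t
    return np.concatenate([z1, (z2 - t) * np.exp(-s)])

def log_likelihood(x):
    """Exact log p_X(x) = log N(f(x); 0, I) + log|det J_f(x)|."""
    z, log_det = forward(x)
    log_pz = -0.5 * (z @ z + DIM * np.log(2.0 * np.pi))
    return log_pz + log_det

# Exact latent inference for two hypothetical encoded passwords,
# then linear interpolation in latent space between them.
x_a, x_b = rng.normal(size=DIM), rng.normal(size=DIM)
z_a, _ = forward(x_a)
z_b, _ = forward(x_b)
print("exact log-likelihood of x_a:", log_likelihood(x_a))
for alpha in (0.25, 0.5, 0.75):
    z_mid = (1 - alpha) * z_a + alpha * z_b
    x_mid = inverse(z_mid)               # a real model would decode this
    print(f"alpha={alpha}:", np.round(x_mid, 2))
```

A full password flow stacks many such layers and adds an encoding between character strings and the continuous space; training then maximizes the exact log-likelihood above over a password corpus.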
Related papers
- Amortizing intractable inference in large language models [56.92471123778389]
We use amortized Bayesian inference to sample from intractable posterior distributions.
We empirically demonstrate that this distribution-matching paradigm of LLM fine-tuning can serve as an effective alternative to maximum-likelihood training.
As an important application, we interpret chain-of-thought reasoning as a latent variable modeling problem.
arXiv Detail & Related papers (2023-10-06T16:36:08Z)
- GFlowNet-EM for learning compositional latent variable models [115.96660869630227]
A key tradeoff in modeling the posteriors over latents is between expressivity and tractable optimization.
We propose the use of GFlowNets, algorithms for sampling from an unnormalized density.
By training GFlowNets to sample from the posterior over latents, we take advantage of their strengths as amortized variational algorithms.
arXiv Detail & Related papers (2023-02-13T18:24:21Z)
- SIReN-VAE: Leveraging Flows and Amortized Inference for Bayesian Networks [2.8597160727750564]
This work explores incorporating arbitrary dependency structures, as specified by Bayesian networks, into VAEs.
This is achieved by extending both the prior and inference network with graphical residual flows.
We compare our model's performance on several synthetic datasets and show its potential in data-sparse settings.
arXiv Detail & Related papers (2022-04-23T10:31:08Z)
- Generative Deep Learning Techniques for Password Generation [0.5249805590164902]
We study a broad collection of deep learning and probabilistic based models in the light of password guessing.
We provide novel generative deep-learning models based on variational autoencoders that exhibit state-of-the-art sampling performance.
We perform a thorough empirical analysis in a unified controlled framework over well-known datasets.
arXiv Detail & Related papers (2020-12-10T14:11:45Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
- Goal-directed Generation of Discrete Structures with Conditional Generative Models [85.51463588099556]
We introduce a novel approach to directly optimize a reinforcement learning objective, maximizing an expected reward.
We test our methodology on two tasks: generating molecules with user-defined properties and identifying short Python expressions that evaluate to a given target value.
arXiv Detail & Related papers (2020-10-05T20:03:13Z)
- Closed-Form Factorization of Latent Semantics in GANs [65.42778970898534]
A rich set of interpretable dimensions has been shown to emerge in the latent space of the Generative Adversarial Networks (GANs) trained for synthesizing images.
In this work, we examine the internal representation learned by GANs to reveal the underlying variation factors in an unsupervised manner.
We propose a closed-form factorization algorithm for latent semantic discovery by directly decomposing the pre-trained weights.
arXiv Detail & Related papers (2020-07-13T18:05:36Z)
- A Probabilistic Generative Model for Typographical Analysis of Early Modern Printing [44.62884731273421]
We propose a deep and interpretable probabilistic generative model to analyze glyph shapes in printed Early Modern documents.
Our approach introduces a neural editor model that first generates well-understood printing perturbations from template parameters via interpretable latent variables.
We show that our approach outperforms rigid interpretable clustering baselines (Ocular) and overly-flexible deep generative models (VAE) alike on the task of completely unsupervised discovery of typefaces in mixed-font documents.
arXiv Detail & Related papers (2020-05-04T17:01:11Z)
- Improve Variational Autoencoder for Text Generation with Discrete Latent Bottleneck [52.08901549360262]
Variational autoencoders (VAEs) are essential tools in end-to-end representation learning.
When paired with a strong auto-regressive decoder, VAEs tend to ignore the latent variables.
We propose a principled approach to enforce an implicit latent feature matching in a more compact latent space.
arXiv Detail & Related papers (2020-04-22T14:41:37Z)
- Augmented Normalizing Flows: Bridging the Gap Between Generative Flows and Latent Variable Models [11.206144910991481]
We propose a new family of generative flows on an augmented data space, with an aim to improve expressivity without drastically increasing the computational cost of sampling and evaluation of a lower bound on the likelihood.
We demonstrate state-of-the-art performance on standard benchmarks of flow-based generative modeling.
arXiv Detail & Related papers (2020-02-17T17:45:48Z)