PassFlow: Guessing Passwords with Generative Flows
- URL: http://arxiv.org/abs/2105.06165v1
- Date: Thu, 13 May 2021 09:50:36 GMT
- Title: PassFlow: Guessing Passwords with Generative Flows
- Authors: Giulio Pagnotta, Dorjan Hitaj, Fabio De Gaspari, Luigi V. Mancini
- Abstract summary: We propose a flow-based generative model approach to password guessing.
Flow-based models allow for precise log-likelihood computation and optimization, which enables exact latent variable inference.
We show that flow-based networks are able to accurately model the original password distribution.
- Score: 1.1470070927586016
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in generative machine learning models rekindled research
interest in the area of password guessing. Data-driven password guessing
approaches based on GANs, language models and deep latent variable models show
impressive generalization performance and offer compelling properties for the
task of password guessing. In this paper, we propose a flow-based generative
model approach to password guessing. Flow-based models allow for precise
log-likelihood computation and optimization, which enables exact latent
variable inference. Additionally, flow-based models provide meaningful latent
space representation, which enables operations such as exploration of specific
subspaces of the latent space and interpolation. We demonstrate the
applicability of generative flows to the context of password guessing,
departing from previous applications of flow networks, which are mainly limited
to the continuous space of image generation. We show that the above-mentioned
properties allow flow-based models to outperform deep latent variable model
approaches and remain competitive with state-of-the-art GANs in the password
guessing task, while using a training set that is orders of magnitude smaller
than that of previous art. Furthermore, a qualitative analysis of the generated
samples shows that flow-based networks are able to accurately model the
original password distribution, with even non-matched samples closely
resembling human-like passwords.
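To make these properties concrete, here is a minimal, self-contained sketch (plain NumPy) of a single affine coupling layer, the building block used by many generative flows. It is an illustration under assumed toy dimensions and fixed random weights, not the PassFlow architecture: the change-of-variables formula log p_X(x) = log p_Z(f(x)) + log|det J_f(x)| gives the exact log-likelihood the abstract refers to, the closed-form inverse gives exact latent inference, and interpolating between two inferred latents is the kind of latent-space operation that can yield new candidate guesses.

```python
# Minimal sketch of one affine coupling layer (toy dimensions, fixed random
# weights standing in for learned scale/shift networks). Not the PassFlow
# model; it only illustrates the flow properties named in the abstract.
import numpy as np

DIM = 8            # hypothetical size of a continuous password embedding
HALF = DIM // 2

rng = np.random.default_rng(0)
W_s = rng.normal(scale=0.1, size=(HALF, HALF))  # "scale" net stand-in
W_t = rng.normal(scale=0.1, size=(HALF, HALF))  # "shift" net stand-in

def forward(x):
    """x -> z with the exact log|det J| of the transformation."""
    x1, x2 = x[:HALF], x[HALF:]
    s = np.tanh(x1 @ W_s)                # log-scales, bounded for stability
    t = x1 @ W_t
    z = np.concatenate([x1, x2 * np.exp(s) + t])
    return z, s.sum()                    # triangular Jacobian: log-det = sum(s)

def inverse(z):
    """Closed-form inverse z -> x; this is what makes inference exact."""
    z1, z2 = z[:HALF], z[HALF:]
    s = np.tanh(z1 @ W_s)
    t = z1 @ W_t
    return np.concatenate([z1, (z2 - t) * np.exp(-s)])

def log_likelihood(x):
    """Exact log p_X(x) = log N(f(x); 0, I) + log|det J_f(x)|."""
    z, log_det = forward(x)
    log_pz = -0.5 * (z @ z + DIM * np.log(2.0 * np.pi))
    return log_pz + log_det

# Exact latent inference for two hypothetical encoded passwords,
# then linear interpolation in latent space between them.
x_a, x_b = rng.normal(size=DIM), rng.normal(size=DIM)
z_a, _ = forward(x_a)
z_b, _ = forward(x_b)
print("exact log-likelihood of x_a:", log_likelihood(x_a))
for alpha in (0.25, 0.5, 0.75):
    z_mid = (1 - alpha) * z_a + alpha * z_b
    x_mid = inverse(z_mid)               # a real model would decode this
    print(f"alpha={alpha}:", np.round(x_mid, 2))
```

A full password flow stacks many such layers and adds an encoding between character strings and the continuous space; training then maximizes the exact log-likelihood above over a password corpus.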
Related papers
- Amortizing intractable inference in large language models [56.92471123778389]
We use amortized Bayesian inference to sample from intractable posterior distributions.
We empirically demonstrate that this distribution-matching paradigm of LLM fine-tuning can serve as an effective alternative to maximum-likelihood training.
As an important application, we interpret chain-of-thought reasoning as a latent variable modeling problem.
arXiv Detail & Related papers (2023-10-06T16:36:08Z)
- GFlowNet-EM for learning compositional latent variable models [115.96660869630227]
A key tradeoff in modeling the posteriors over latents is between expressivity and tractable optimization.
We propose the use of GFlowNets, algorithms for sampling from an unnormalized density.
By training GFlowNets to sample from the posterior over latents, we take advantage of their strengths as amortized variational algorithms.
arXiv Detail & Related papers (2023-02-13T18:24:21Z)
- SIReN-VAE: Leveraging Flows and Amortized Inference for Bayesian Networks [2.8597160727750564]
This work explores incorporating arbitrary dependency structures, as specified by Bayesian networks, into VAEs.
This is achieved by extending both the prior and inference network with graphical residual flows.
We compare our model's performance on several synthetic datasets and show its potential in data-sparse settings.
arXiv Detail & Related papers (2022-04-23T10:31:08Z)
- Generative Deep Learning Techniques for Password Generation [0.5249805590164902]
We study a broad collection of deep learning and probabilistic based models in the light of password guessing.
We provide novel generative deep-learning models based on variational autoencoders that exhibit state-of-the-art sampling performance.
We perform a thorough empirical analysis in a unified controlled framework over well-known datasets.
arXiv Detail & Related papers (2020-12-10T14:11:45Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
- Goal-directed Generation of Discrete Structures with Conditional Generative Models [85.51463588099556]
We introduce a novel approach to directly optimize a reinforcement learning objective, maximizing an expected reward.
We test our methodology on two tasks: generating molecules with user-defined properties and identifying short Python expressions that evaluate to a given target value.
arXiv Detail & Related papers (2020-10-05T20:03:13Z)
- Closed-Form Factorization of Latent Semantics in GANs [65.42778970898534]
A rich set of interpretable dimensions has been shown to emerge in the latent space of the Generative Adversarial Networks (GANs) trained for synthesizing images.
In this work, we examine the internal representation learned by GANs to reveal the underlying variation factors in an unsupervised manner.
We propose a closed-form factorization algorithm for latent semantic discovery by directly decomposing the pre-trained weights.
arXiv Detail & Related papers (2020-07-13T18:05:36Z)
- A Probabilistic Generative Model for Typographical Analysis of Early Modern Printing [44.62884731273421]
We propose a deep and interpretable probabilistic generative model to analyze glyph shapes in printed Early Modern documents.
Our approach introduces a neural editor model that first generates well-understood printing perturbations from template parameters via interpretable latent variables.
We show that our approach outperforms rigid interpretable clustering baselines (Ocular) and overly-flexible deep generative models (VAE) alike on the task of completely unsupervised discovery of typefaces in mixed-font documents.
arXiv Detail & Related papers (2020-05-04T17:01:11Z)
- Improve Variational Autoencoder for Text Generation with Discrete Latent Bottleneck [52.08901549360262]
Variational autoencoders (VAEs) are essential tools in end-to-end representation learning.
When paired with a strong auto-regressive decoder, VAEs tend to ignore the latent variables.
We propose a principled approach to enforce an implicit latent feature matching in a more compact latent space.
arXiv Detail & Related papers (2020-04-22T14:41:37Z)
- Augmented Normalizing Flows: Bridging the Gap Between Generative Flows and Latent Variable Models [11.206144910991481]
We propose a new family of generative flows on an augmented data space, with an aim to improve expressivity without drastically increasing the computational cost of sampling and evaluation of a lower bound on the likelihood.
We demonstrate state-of-the-art performance on standard benchmarks of flow-based generative modeling.
arXiv Detail & Related papers (2020-02-17T17:45:48Z)