Password Strength Detection via Machine Learning: Analysis, Modeling, and Evaluation
- URL: http://arxiv.org/abs/2505.16439v1
- Date: Thu, 22 May 2025 09:27:40 GMT
- Title: Password Strength Detection via Machine Learning: Analysis, Modeling, and Evaluation
- Authors: Jiazhi Mo, Hailu Kuang, Xiaoqi Li
- Abstract summary: This study introduces various methods for system password cracking, outlines password defense strategies, and discusses the application of machine learning in the realm of password security. We extract multiple characteristics of passwords, including length, the number of digits, the number of uppercase and lowercase letters, and the number of special characters.
- Score: 0.8225825738565354
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As network security issues continue gaining prominence, password security has become crucial in safeguarding personal information and network systems. This study first introduces various methods for system password cracking, outlines password defense strategies, and discusses the application of machine learning in the realm of password security. Subsequently, we conduct a detailed public password database analysis, uncovering standard features and patterns among passwords. We extract multiple characteristics of passwords, including length, the number of digits, the number of uppercase and lowercase letters, and the number of special characters. We then experiment with six different machine learning algorithms: support vector machines, logistic regression, neural networks, decision trees, random forests, and stacked models, evaluating each model's performance based on various metrics, including accuracy, recall, and F1 score through model validation and hyperparameter tuning. The evaluation results on the test set indicate that decision trees and stacked models excel in accuracy, recall, and F1 score, making them a practical option for the strong and weak password classification task.
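The feature extraction and evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names `extract_features` and `binary_metrics` are hypothetical, "special characters" is assumed to mean ASCII punctuation, and labels are assumed to be 1 for strong and 0 for weak.

```python
import string
from typing import Dict, Sequence


def extract_features(password: str) -> Dict[str, int]:
    """Count the character-class features listed in the abstract."""
    return {
        "length": len(password),
        "digits": sum(c.isdigit() for c in password),
        "uppercase": sum(c.isupper() for c in password),
        "lowercase": sum(c.islower() for c in password),
        # Assumption: "special" means ASCII punctuation; the paper may
        # define this class differently.
        "special": sum(c in string.punctuation for c in password),
    }


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, recall, and F1 for binary labels (assumed 1 = strong, 0 = weak)."""
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "recall": recall, "f1": f1}


if __name__ == "__main__":
    print(extract_features("Passw0rd!"))
    print(binary_metrics([1, 0, 1, 1], [1, 0, 0, 1]))
```

The resulting feature dictionaries could be turned into numeric vectors and passed to any of the six classifier families the abstract lists; which library and hyperparameters the authors used is not stated here.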
Related papers
- Adversarial Machine Learning for Robust Password Strength Estimation [0.0]
This study focuses on developing robust password strength estimation models using adversarial machine learning. We apply five classification algorithms and use a dataset with more than 670,000 samples of adversarial passwords to train the models. Results demonstrate that adversarial training improves password strength classification accuracy by up to 20% compared to traditional machine learning models.
arXiv Detail & Related papers (2025-05-31T03:54:04Z) - MAYA: Addressing Inconsistencies in Generative Password Guessing through a Unified Benchmark [0.35998666903987897]
We introduce MAYA, a unified, customizable, plug-and-play password benchmarking framework. MAYA provides a standardized approach for evaluating generative password-guessing models. We find sequential models consistently outperform other generative architectures and traditional password-guessing tools.
arXiv Detail & Related papers (2025-04-23T12:16:59Z) - Cryptanalysis via Machine Learning Based Information Theoretic Metrics [58.96805474751668]
We propose two novel applications of machine learning (ML) algorithms to perform cryptanalysis on any cryptosystem. These algorithms can be readily applied in an audit setting to evaluate the robustness of a cryptosystem. We show that our classification model correctly identifies the encryption schemes that are not IND-CPA secure, such as DES, RSA, and AES ECB, with high accuracy.
arXiv Detail & Related papers (2025-01-25T04:53:36Z) - Detecting Machine-Generated Long-Form Content with Latent-Space Variables [54.07946647012579]
Existing zero-shot detectors primarily focus on token-level distributions, which are vulnerable to real-world domain shifts.
We propose a more robust method that incorporates abstract elements, such as event transitions, as key deciding factors to detect machine versus human texts.
arXiv Detail & Related papers (2024-10-04T18:42:09Z) - PassTSL: Modeling Human-Created Passwords through Two-Stage Learning [7.287089766975719]
We propose PassTSL (modeling human-created Passwords through Two-Stage Learning), inspired by the popular pretraining-finetuning framework in NLP and deep learning (DL).
PassTSL outperforms five state-of-the-art (SOTA) password cracking methods on password guessing by a significant margin ranging from 4.11% to 64.69% at the maximum point.
Based on PassTSL, we also implemented a password strength meter (PSM), and our experiments showed that it was able to estimate password strength more accurately.
arXiv Detail & Related papers (2024-07-19T09:23:30Z) - PassGPT: Password Modeling and (Guided) Generation with Large Language Models [59.11160990637616]
We present PassGPT, a large language model trained on password leaks for password generation.
We also introduce the concept of guided password generation, where we leverage PassGPT sampling procedure to generate passwords matching arbitrary constraints.
arXiv Detail & Related papers (2023-06-02T13:49:53Z) - Backdoor Learning on Sequence to Sequence Models [94.23904400441957]
In this paper, we study whether sequence-to-sequence (seq2seq) models are vulnerable to backdoor attacks.
Specifically, we find that by injecting only 0.2% of the dataset's samples, we can cause the seq2seq model to generate the designated keyword and even the whole sentence.
Extensive experiments on machine translation and text summarization have been conducted to show our proposed methods could achieve over 90% attack success rate on multiple datasets and models.
arXiv Detail & Related papers (2023-05-03T20:31:13Z) - CodeLMSec Benchmark: Systematically Evaluating and Finding Security Vulnerabilities in Black-Box Code Language Models [58.27254444280376]
Large language models (LLMs) for automatic code generation have achieved breakthroughs in several programming tasks.
Training data for these models is usually collected from the Internet (e.g., from open-source repositories) and is likely to contain faults and security vulnerabilities.
This unsanitized training data can cause the language models to learn these vulnerabilities and propagate them during the code generation procedure.
arXiv Detail & Related papers (2023-02-08T11:54:07Z) - Universal Neural-Cracking-Machines: Self-Configurable Password Models from Auxiliary Data [21.277402919534566]
A "universal password model" is a password model that adapts its guessing strategy based on the target system.
It exploits users' auxiliary information, such as email addresses, as a proxy signal to predict the underlying password distribution.
arXiv Detail & Related papers (2023-01-18T16:12:04Z) - AES Systems Are Both Overstable And Oversensitive: Explaining Why And Proposing Defenses [66.49753193098356]
We investigate the reason behind the surprising adversarial brittleness of scoring models.
Our results indicate that autoscoring models, despite getting trained as "end-to-end" models, behave like bag-of-words models.
We propose detection-based protection models that can detect samples causing oversensitivity and overstability with high accuracy.
arXiv Detail & Related papers (2021-09-24T03:49:38Z)