Benchmarking Large Language Models for Cryptanalysis and Mismatched-Generalization
- URL: http://arxiv.org/abs/2505.24621v1
- Date: Fri, 30 May 2025 14:12:07 GMT
- Title: Benchmarking Large Language Models for Cryptanalysis and Mismatched-Generalization
- Authors: Utsav Maskey, Chencheng Zhu, Usman Naseem
- Abstract summary: Large Language Models (LLMs) have transformed natural language understanding and generation. Cryptanalysis, a critical area for data security and encryption, has not yet been thoroughly explored in LLM evaluations. We evaluate the cryptanalytic potential of state-of-the-art LLMs on encrypted texts generated using a range of cryptographic algorithms.
- Score: 4.020376901658977
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advancements in Large Language Models (LLMs) have transformed natural language understanding and generation, leading to extensive benchmarking across diverse tasks. However, cryptanalysis, a critical area for data security and encryption, has not yet been thoroughly explored in LLM evaluations. To address this gap, we evaluate the cryptanalytic potential of state-of-the-art LLMs on encrypted texts generated using a range of cryptographic algorithms. We introduce a novel benchmark dataset comprising diverse plaintexts, spanning various domains, lengths, writing styles, and topics, paired with their encrypted versions. Using zero-shot and few-shot settings, we assess multiple LLMs for decryption accuracy and semantic comprehension across different encryption schemes. Our findings reveal key insights into the strengths and limitations of LLMs in side-channel communication while raising concerns about their susceptibility to jailbreaking attacks. This research highlights the dual-use nature of LLMs in security contexts and contributes to the ongoing discussion on AI safety and security.
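To make the setup concrete, here is a minimal sketch of how one benchmark entry and a zero-shot decryption prompt might look. The Caesar cipher, the field names, and the prompt wording are illustrative assumptions, not the paper's released format.

```python
# Minimal sketch of one benchmark entry: a plaintext/ciphertext pair under a
# classical cipher, plus a zero-shot decryption prompt. The cipher choice and
# prompt wording are illustrative assumptions, not the paper's exact setup.

def caesar_encrypt(plaintext: str, shift: int = 3) -> str:
    """Shift each letter by `shift` positions, preserving case and non-letters."""
    result = []
    for ch in plaintext:
        if ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            result.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            result.append(ch)
    return "".join(result)

plaintext = "Meet me at the old bridge at noon."
entry = {
    "plaintext": plaintext,
    "ciphertext": caesar_encrypt(plaintext, shift=3),
    "scheme": "caesar-3",
}

# Zero-shot prompt: the model sees only the ciphertext and the scheme name.
prompt = (
    f"The following text was encrypted with a {entry['scheme']} cipher:\n"
    f"{entry['ciphertext']}\n"
    "Recover the original plaintext."
)
print(prompt)
```

Decryption accuracy can then be scored by comparing the model's response against `entry["plaintext"]`, either exactly or with an edit-distance metric.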
Related papers
- AICrypto: A Comprehensive Benchmark For Evaluating Cryptography Capabilities of Large Language Models [18.222665072623755]
Large language models (LLMs) have demonstrated remarkable capabilities across a variety of domains. Our benchmark comprises 135 multiple-choice questions, 150 capture-the-flag (CTF) challenges, and 18 proof problems. Our evaluation of 17 leading LLMs reveals that state-of-the-art models match or even surpass human experts in memorizing cryptographic concepts.
arXiv Detail & Related papers (2025-07-13T11:11:01Z)
- MetaCipher: A General and Extensible Reinforcement Learning Framework for Obfuscation-Based Jailbreak Attacks on Black-Box LLMs [14.530593083777502]
Obfuscation-based jailbreak attacks remain highly effective. We propose MetaCipher, a novel obfuscation-based jailbreak framework. Within as few as 10 queries, MetaCipher achieves an attack success rate of over 92%.
arXiv Detail & Related papers (2025-06-27T18:15:56Z)
- CipherBank: Exploring the Boundary of LLM Reasoning Capabilities through Cryptography Challenges [42.16123880046729]
We introduce CipherBank, a benchmark designed to evaluate the reasoning capabilities of large language models (LLMs) in cryptographic decryption tasks. We evaluate state-of-the-art LLMs on CipherBank, e.g., GPT-4o, DeepSeek-V3, and cutting-edge reasoning-focused models such as o1 and DeepSeek-R1.
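A plausible way to score such decryption tasks is character-level normalized edit distance; CipherBank's actual metric may differ, so treat this as an illustrative sketch.

```python
# Hypothetical scoring function for decryption tasks: normalized Levenshtein
# similarity between the model's guess and the true plaintext. This is an
# illustrative assumption, not CipherBank's published metric.
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def decryption_score(guess: str, truth: str) -> float:
    """1.0 for an exact recovery, degrading smoothly with edit distance."""
    if not truth:
        return 1.0 if not guess else 0.0
    return 1.0 - levenshtein(guess, truth) / max(len(guess), len(truth))

print(decryption_score("meet me at noon", "meet me at noon"))  # 1.0
print(decryption_score("meet ne at moon", "meet me at noon"))  # < 1.0
```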
arXiv Detail & Related papers (2025-04-27T03:41:17Z)
- Idiosyncrasies in Large Language Models [54.26923012617675]
We unveil and study idiosyncrasies in Large Language Models (LLMs). We find that fine-tuning text embedding models on LLM-generated texts yields excellent classification accuracy. We leverage LLMs as judges to generate detailed, open-ended descriptions of each model's idiosyncrasies.
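As a simplified stand-in for fine-tuning a text embedding model, the classification idea can be sketched with TF-IDF features and logistic regression; the texts, labels, and feature choice below are placeholders, not the paper's setup.

```python
# Simplified stand-in for the paper's classifier: predict which LLM produced a
# given text. Real experiments fine-tune text embedding models; TF-IDF plus
# logistic regression is only an illustrative proxy, and the data is fake.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "Certainly! Here is a concise summary of the requested topic.",
    "Sure thing. Let's break this down step by step.",
    "As an AI developed to assist, I can outline the key points.",
    "Here's a quick rundown, formatted for readability.",
]
labels = ["model_a", "model_b", "model_a", "model_b"]  # source LLM per text

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(texts, labels)
print(clf.predict(["Certainly! I can assist with outlining the key points."]))
```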
arXiv Detail & Related papers (2025-02-17T18:59:02Z)
- Cryptanalysis via Machine Learning Based Information Theoretic Metrics [58.96805474751668]
We propose two novel applications of machine learning (ML) algorithms to perform cryptanalysis on any cryptosystem. These algorithms can be readily applied in an audit setting to evaluate the robustness of a cryptosystem. We show that our classification model correctly identifies the encryption schemes that are not IND-CPA secure, such as DES, RSA, and AES ECB, with high accuracy.
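One concrete signal such a classifier can exploit is block repetition: AES-ECB maps identical plaintext blocks to identical ciphertext blocks, which is exactly why ECB fails IND-CPA. The heuristic below is hand-coded for illustration and is not the paper's ML method.

```python
# Hand-coded illustration of a feature that separates IND-CPA-insecure modes:
# ECB leaks plaintext structure as repeated 16-byte ciphertext blocks, while a
# CPA-secure mode (CBC/CTR with fresh randomness) should show almost none.
from collections import Counter

def repeated_block_ratio(ciphertext: bytes, block_size: int = 16) -> float:
    """Fraction of blocks that appear more than once in the ciphertext."""
    blocks = [ciphertext[i:i + block_size]
              for i in range(0, len(ciphertext) - block_size + 1, block_size)]
    if not blocks:
        return 0.0
    counts = Counter(blocks)
    return sum(1 for b in blocks if counts[b] > 1) / len(blocks)

# Stand-in ciphertexts: a repetitive one (ECB-like) and a non-repeating one.
ecb_like = bytes.fromhex("aa" * 16) * 8   # 8 identical 16-byte blocks
random_like = bytes(range(128))           # 8 distinct 16-byte blocks
print(repeated_block_ratio(ecb_like))     # 1.0 -> flag as not IND-CPA secure
print(repeated_block_ratio(random_like))  # 0.0
```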
arXiv Detail & Related papers (2025-01-25T04:53:36Z)
- What You See Is Not Always What You Get: An Empirical Study of Code Comprehension by Large Language Models [0.5735035463793009]
We investigate the vulnerability of large language models (LLMs) to imperceptible attacks, where hidden character manipulation in source code misleads LLMs' behaviour while remaining undetectable to human reviewers. These attacks include coding reordering, invisible coding characters, code deletions, and code homoglyphs. Our findings confirm the susceptibility of LLMs to imperceptible coding character attacks, while different LLMs show varying degrees of negative correlation between perturbation magnitude and performance.
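A minimal defensive counterpart to these attacks is scanning source code for invisible or format-control characters before passing it to an LLM; the character list below is a small, non-exhaustive assumption.

```python
# Minimal sketch of a pre-submission scan for hidden characters in source code.
# The explicit set below is a small illustrative sample; the Unicode "Cf"
# (format) category check catches most other invisible control characters.
import unicodedata

SUSPICIOUS = {
    "\u200b",  # zero-width space
    "\u200c",  # zero-width non-joiner
    "\u200d",  # zero-width joiner
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",  # bidi controls
    "\ufeff",  # zero-width no-break space / BOM
}

def find_hidden_chars(source: str):
    """Yield (line, column, character name) for each suspicious character."""
    for lineno, line in enumerate(source.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in SUSPICIOUS or unicodedata.category(ch) == "Cf":
                yield lineno, col, unicodedata.name(ch, repr(ch))

code = "total = price \u200b+ tax  # looks innocent in most editors"
for hit in find_hidden_chars(code):
    print(hit)  # (1, 15, 'ZERO WIDTH SPACE')
```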
arXiv Detail & Related papers (2024-12-11T04:52:41Z)
- CodeChameleon: Personalized Encryption Framework for Jailbreaking Large Language Models [49.60006012946767]
We propose CodeChameleon, a novel jailbreak framework based on personalized encryption tactics.
We conduct extensive experiments on 7 Large Language Models, achieving a state-of-the-art average Attack Success Rate (ASR).
Remarkably, our method achieves an 86.6% ASR on GPT-4-1106.
arXiv Detail & Related papers (2024-02-26T16:35:59Z)
- When "Competency" in Reasoning Opens the Door to Vulnerability: Jailbreaking LLMs via Novel Complex Ciphers [33.41396323391102]
We introduce Attacks using Custom Encryptions (ACE), a jailbreaking technique that encodes malicious queries with novel ciphers. We also introduce Layered Attacks using Custom Encryptions (LACE), which applies multi-layer ciphers to amplify attack complexity. Our experiments reveal a critical trade-off: LLMs that are more capable of decoding ciphers are more vulnerable to these jailbreaking attacks.
arXiv Detail & Related papers (2024-02-16T11:37:05Z)
- A Survey on Detection of LLMs-Generated Content [97.87912800179531]
The ability to detect LLM-generated content has become of paramount importance.
We aim to provide a detailed overview of existing detection strategies and benchmarks.
We also posit the necessity for a multi-faceted approach to defend against various attacks.
arXiv Detail & Related papers (2023-10-24T09:10:26Z)
- GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher [85.18213923151717]
Experimental results show that certain ciphers succeed almost 100% of the time in bypassing the safety alignment of GPT-4 across several safety domains.
We propose a novel SelfCipher that uses only role play and several demonstrations in natural language to evoke this capability.
arXiv Detail & Related papers (2023-08-12T04:05:57Z)
- Towards Codable Watermarking for Injecting Multi-bits Information to LLMs [86.86436777626959]
Large language models (LLMs) generate texts with increasing fluency and realism.
Existing watermarking methods are encoding-inefficient and cannot flexibly meet the diverse information encoding needs.
We propose Codable Text Watermarking for LLMs (CTWL), which allows text watermarks to carry multi-bit customizable information.
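A toy sketch of the multi-bit idea: hash the previous word to split a vocabulary into two halves, and let each payload bit pick the half the next word is drawn from. Real CTWL operates on LLM vocabulary partitions during decoding, so everything below (the word list, the hashing, the greedy word choice) is an illustrative assumption rather than the paper's algorithm.

```python
# Toy multi-bit text watermark: each payload bit selects which deterministic
# half of the vocabulary the next word comes from. Structural illustration
# only; CTWL itself partitions an LLM's vocabulary while sampling from logits.
import hashlib

VOCAB = ["river", "stone", "cloud", "amber", "quiet", "march",
         "tiger", "lemon", "organ", "pixel", "salty", "windy"]

def halves(prev_word: str):
    """Deterministically split VOCAB into two halves, seeded by prev_word."""
    seed = hashlib.sha256(prev_word.encode()).digest()
    order = sorted(VOCAB, key=lambda w: hashlib.sha256(seed + w.encode()).digest())
    mid = len(order) // 2
    return order[:mid], order[mid:]

def embed(bits: str, start: str = "seed") -> list[str]:
    """Emit one word per payload bit, drawn from the bit-selected half."""
    out, prev = [], start
    for bit in bits:
        half = halves(prev)[int(bit)]
        out.append(half[0])  # a real scheme samples from the LLM here
        prev = out[-1]
    return out

def extract(words: list[str], start: str = "seed") -> str:
    """Recover the payload by checking which half each word fell into."""
    bits, prev = [], start
    for w in words:
        lo, _ = halves(prev)
        bits.append("0" if w in lo else "1")
        prev = w
    return "".join(bits)

msg = "1011"
text = embed(msg)
assert extract(text) == msg
print(text, "->", extract(text))
```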
arXiv Detail & Related papers (2023-07-29T14:11:15Z)
- Red Teaming Language Model Detectors with Language Models [114.36392560711022]
Large language models (LLMs) present significant safety and ethical risks if exploited by malicious users.
Recent works have proposed algorithms to detect LLM-generated text and protect LLMs.
We study two types of attack strategies: 1) replacing certain words in an LLM's output with their synonyms given the context; 2) automatically searching for an instructional prompt to alter the writing style of the generation.
arXiv Detail & Related papers (2023-05-31T10:08:37Z)
- Memorization for Good: Encryption with Autoregressive Language Models [8.645826579841692]
We propose the first symmetric encryption algorithm with autoregressive language models (SELM).
We show that autoregressive LMs can encode arbitrary data into a compact real-valued vector (i.e., encryption) and then losslessly decode the vector to the original message (i.e., decryption) via random subspace optimization and greedy decoding.
arXiv Detail & Related papers (2023-05-15T05:42:34Z)
- TEDL: A Text Encryption Method Based on Deep Learning [10.428079716944463]
This paper proposes TEDL, a novel text encryption method based on deep learning.
Results of experiments and relevant analyses show that TEDL performs well in terms of security, efficiency, and generality, and requires less frequent key redistribution.
arXiv Detail & Related papers (2020-03-09T11:04:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences of its use.