SALSA PICANTE: a machine learning attack on LWE with binary secrets
- URL: http://arxiv.org/abs/2303.04178v4
- Date: Tue, 31 Oct 2023 16:41:26 GMT
- Title: SALSA PICANTE: a machine learning attack on LWE with binary secrets
- Authors: Cathy Li, Jana Sotáková, Emily Wenger, Mohamed Malhou, Evrard Garcelon, Francois Charton, Kristin Lauter
- Abstract summary: We present PICANTE, an enhanced machine learning attack on LWE with sparse binary secrets.
PICANTE recovers secrets in much larger dimensions (up to $n=350$) and with larger Hamming weights.
While PICANTE does not threaten NIST's proposed LWE standards, it demonstrates significant improvement over SALSA.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning with Errors (LWE) is a hard math problem underpinning many proposed
post-quantum cryptographic (PQC) systems. The only PQC Key Encapsulation Mechanism
(KEM) standardized by NIST is based on module LWE, and current publicly
available PQ Homomorphic Encryption (HE) libraries are based on ring LWE. The
security of LWE-based PQ cryptosystems is critical, but certain implementation
choices could weaken them. One such choice is sparse binary secrets, desirable
for PQ HE schemes for efficiency reasons. Prior work, SALSA, demonstrated a
machine learning-based attack on LWE with sparse binary secrets in small
dimensions ($n \le 128$) and low Hamming weights ($h \le 4$). However, this
attack assumes access to millions of eavesdropped LWE samples and fails at
higher Hamming weights or dimensions.
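For context, an LWE sample pairs a public vector $a \in \mathbb{Z}_q^n$ with $b = a \cdot s + e \bmod q$, where $s$ is the secret and $e$ is a small error; a sparse binary secret has $s \in \{0,1\}^n$ with only $h$ nonzero entries. A minimal sketch of sample generation follows (parameter values are illustrative placeholders, not the paper's exact settings):

```python
import numpy as np

def gen_lwe_samples(n=350, q=2**21, h=60, m=None, sigma=3.0, rng=None):
    """Generate m LWE samples (A, b) with a sparse binary secret s.

    Each row satisfies b_i = <a_i, s> + e_i (mod q). Illustrative
    parameters only; the paper evaluates various (n, q, h).
    """
    rng = rng or np.random.default_rng()
    m = m if m is not None else 4 * n  # the attack assumes ~4n eavesdropped samples
    # Sparse binary secret: exactly h of the n coordinates are 1.
    s = np.zeros(n, dtype=np.int64)
    s[rng.choice(n, size=h, replace=False)] = 1
    # Uniform random matrix A and small (rounded Gaussian) errors e.
    A = rng.integers(0, q, size=(m, n), dtype=np.int64)
    e = np.rint(rng.normal(0.0, sigma, size=m)).astype(np.int64)
    b = (A @ s + e) % q
    return A, b, s
```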
We present PICANTE, an enhanced machine learning attack on LWE with sparse
binary secrets, which recovers secrets in much larger dimensions (up to
$n=350$) and with larger Hamming weights (roughly $n/10$, and up to $h=60$ for
$n=350$). We achieve this dramatic improvement via a novel preprocessing step,
which allows us to generate training data from a linear number of eavesdropped
LWE samples ($4n$) and changes the distribution of the data to improve
transformer training. We also improve the secret recovery methods of SALSA and
introduce a novel cross-attention recovery mechanism allowing us to read off
the secret directly from the trained models. While PICANTE does not threaten
NIST's proposed LWE standards, it demonstrates significant improvement over
SALSA and could scale further, highlighting the need for future investigation
into machine learning attacks on LWE with sparse binary secrets.
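The abstract does not spell out the preprocessing step, but the algebraic fact that makes "training data from $4n$ eavesdropped samples" possible is that any modular integer combination of LWE samples is again an LWE sample for the same secret, with a correspondingly combined error. A minimal sketch of that closure property (plain subset sums only; PICANTE's actual preprocessing additionally reshapes the data distribution):

```python
import numpy as np

def subset_sum_samples(A, b, q, num_new, subset_size=3, rng=None):
    """Derive new LWE samples as subset sums of eavesdropped ones.

    If b_i = <a_i, s> + e_i (mod q), then summing any subset gives
    b' = <a', s> + e' (mod q) with a' = sum a_i and e' = sum e_i,
    i.e. a valid sample for the same secret (with a larger error).
    Sketch of the closure property, not PICANTE's preprocessing.
    """
    rng = rng or np.random.default_rng()
    idx = rng.integers(0, A.shape[0], size=(num_new, subset_size))
    A_new = A[idx].sum(axis=1) % q
    b_new = b[idx].sum(axis=1) % q
    return A_new, b_new
```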
Related papers
- HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models [92.85175340702125]
We distill a large teacher safety guard model into a smaller one using a labeled dataset of instruction-response pairs with binary harmfulness labels.
We propose HarmAug, a simple yet effective data augmentation method that involves jailbreaking an LLM and prompting it to generate harmful instructions.
Our HarmAug achieves an F1 score comparable to larger models with over 7 billion parameters, and even outperforms them in AUPRC, while operating at less than 25% of their computational cost.
arXiv Detail & Related papers (2024-10-02T13:12:13Z)
- HexaCoder: Secure Code Generation via Oracle-Guided Synthetic Training Data [60.75578581719921]
Large language models (LLMs) have shown great potential for automatic code generation.
Recent studies highlight that much LLM-generated code contains serious security vulnerabilities.
We introduce HexaCoder, a novel approach to enhance the ability of LLMs to generate secure code.
arXiv Detail & Related papers (2024-09-10T12:01:43Z) - Benchmarking Attacks on Learning with Errors [9.031051362571436]
Lattice cryptography schemes based on the learning with errors (LWE) hardness assumption have been standardized by NIST for use as post-quantum cryptosystems.
We provide the first benchmarks for LWE secret recovery on standardized parameters, for small and low-weight (sparse) secrets.
We extend the SALSA and Cool & Cruel attacks in significant ways, and implement and scale up MitM attacks for the first time.
arXiv Detail & Related papers (2024-08-01T19:21:20Z)
- MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series [86.31735321970481]
We open-source MAP-Neo, a bilingual language model with 7B parameters trained from scratch on 4.5T high-quality tokens.
Our MAP-Neo is the first fully open-sourced bilingual LLM with performance comparable to existing state-of-the-art LLMs.
arXiv Detail & Related papers (2024-05-29T17:57:16Z)
- The cool and the cruel: separating hard parts of LWE secrets [11.000531626756853]
Known attacks on sparse binary LWE secrets include the sparse dual attack and the hybrid sparse dual meet-in-the-middle attack.
In this paper, we provide a new statistical attack with low memory requirement.
arXiv Detail & Related papers (2024-03-15T14:16:21Z)
- BiLLM: Pushing the Limit of Post-Training Quantization for LLMs [53.31402059062365]
BiLLM is a groundbreaking 1-bit post-training quantization scheme tailored for pretrained large language models.
It is the first to achieve high-accuracy inference (e.g., 8.41 perplexity on LLaMA2-70B) with only 1.08-bit weights across various LLM families.
arXiv Detail & Related papers (2024-02-06T09:26:34Z)
- Salsa Fresca: Angular Embeddings and Pre-Training for ML Attacks on Learning With Errors [10.800552110718714]
Learning with Errors (LWE) is a hard math problem underlying post-quantum cryptography systems for key exchange and digital signatures.
Prior work proposed new machine learning (ML)-based attacks on LWE problems with small, sparse secrets, but these attacks require millions of LWE samples to train on and take days to recover secrets.
We propose three key methods -- better preprocessing, angular embeddings and model pre-training -- to improve these attacks.
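A plausible reading of the "angular embeddings" named above is that residues mod $q$ are mapped to points on the unit circle, so values near $0$ and near $q$ embed close together; the sketch below is an assumption for illustration, not a confirmed description of that paper's method.

```python
import numpy as np

def angular_embed(x, q):
    """Map integers mod q to 2D points on the unit circle; wrap-around
    neighbors (e.g. q-1 and 0) land close together. Hypothetical sketch."""
    theta = 2.0 * np.pi * (np.asarray(x) % q) / q
    return np.stack([np.cos(theta), np.sin(theta)], axis=-1)
```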
arXiv Detail & Related papers (2024-02-02T00:48:27Z)
- Bypassing the Safety Training of Open-Source LLMs with Priming Attacks [3.8023902618391783]
In this paper, we investigate the fragility of SOTA open-source LLMs under simple, optimization-free attacks.
Our proposed attack improves the Attack Success Rate on Harmful Behaviors, as measured by Llama Guard, by up to $3.3\times$ compared to baselines.
arXiv Detail & Related papers (2023-12-19T16:47:12Z)
- Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLMs [67.38165028487242]
We introduce Dynamic Sparse No Training (DSnoT), a training-free approach to fine-tune sparse large language models (LLMs).
Inspired by the Dynamic Sparse Training, DSnoT minimizes the reconstruction error between the dense and sparse LLMs.
Our paper offers fresh insights into how to fine-tune sparse LLMs in an efficient, training-free manner and opens new avenues for scaling the great potential of sparsity to LLMs.
arXiv Detail & Related papers (2023-10-13T07:38:52Z)
- SmoothLLM: Defending Large Language Models Against Jailbreaking Attacks [99.23352758320945]
We propose SmoothLLM, the first algorithm designed to mitigate jailbreaking attacks on large language models (LLMs).
Based on our finding that adversarially-generated prompts are brittle to character-level changes, our defense first randomly perturbs multiple copies of a given input prompt, and then aggregates the corresponding predictions to detect adversarial inputs.
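A minimal sketch of the defense as described above, assuming hypothetical `query_model` and `is_harmful` callables (neither is from the paper): perturb several copies of the prompt at the character level, then aggregate the per-copy verdicts.

```python
import random
import string

def smoothllm_detect(prompt, query_model, is_harmful, n_copies=8, swap_frac=0.1, seed=0):
    """Randomly perturb n_copies of the prompt, query the model on each,
    and majority-vote over per-copy harmfulness verdicts. Sketch only;
    query_model/is_harmful are hypothetical stand-ins."""
    rng = random.Random(seed)
    votes = 0
    for _ in range(n_copies):
        chars = list(prompt)
        k = max(1, int(swap_frac * len(chars)))  # number of characters to replace
        for i in rng.sample(range(len(chars)), k):
            chars[i] = rng.choice(string.printable)
        votes += bool(is_harmful(query_model("".join(chars))))
    # Brittle jailbreak prompts tend to fail on most perturbed copies,
    # so the aggregate verdict stays safe unless harm persists broadly.
    return votes > n_copies / 2
```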
arXiv Detail & Related papers (2023-10-05T17:01:53Z)
- SALSA: Attacking Lattice Cryptography with Transformers [6.229340901386596]
We propose SALSA: a machine learning attack on LWE-based cryptographic schemes.
SALSA can fully recover secrets for small-to-mid size LWE instances with sparse binary secrets, and may scale to attack real-world LWE-based cryptosystems.
arXiv Detail & Related papers (2022-07-11T11:35:43Z)