SALSA PICANTE: a machine learning attack on LWE with binary secrets
- URL: http://arxiv.org/abs/2303.04178v4
- Date: Tue, 31 Oct 2023 16:41:26 GMT
- Title: SALSA PICANTE: a machine learning attack on LWE with binary secrets
- Authors: Cathy Li, Jana Sotáková, Emily Wenger, Mohamed Malhou, Evrard Garcelon, Francois Charton, Kristin Lauter
- Abstract summary: We present PICANTE, an enhanced machine learning attack on LWE with sparse binary secrets.
PICANTE recovers secrets in much larger dimensions (up to $n=350$) and with larger Hamming weights.
While PICANTE does not threaten NIST's proposed LWE standards, it demonstrates significant improvement over SALSA.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning with Errors (LWE) is a hard math problem underpinning many proposed
post-quantum cryptographic (PQC) systems. The only PQC Key Encapsulation Mechanism
(KEM) standardized by NIST is based on module LWE, and current publicly
available PQ Homomorphic Encryption (HE) libraries are based on ring LWE. The
security of LWE-based PQ cryptosystems is critical, but certain implementation
choices could weaken them. One such choice is sparse binary secrets, desirable
for PQ HE schemes for efficiency reasons. Prior work, SALSA, demonstrated a
machine learning-based attack on LWE with sparse binary secrets in small
dimensions ($n \le 128$) and low Hamming weights ($h \le 4$). However, this
attack assumes access to millions of eavesdropped LWE samples and fails at
higher Hamming weights or dimensions.
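For context, an LWE sample pairs a public vector $a \in \mathbb{Z}_q^n$ with $b = a \cdot s + e \bmod q$, where $s$ is the secret and $e$ is a small error; a sparse binary secret has $s \in \{0,1\}^n$ with only $h$ nonzero entries. A minimal sketch of sample generation follows (parameter values are illustrative placeholders, not the paper's exact settings):

```python
import numpy as np

def gen_lwe_samples(n=350, q=2**21, h=60, m=None, sigma=3.0, rng=None):
    """Generate m LWE samples (A, b) with a sparse binary secret s.

    Each row satisfies b_i = <a_i, s> + e_i (mod q). Illustrative
    parameters only; the paper evaluates various (n, q, h).
    """
    rng = rng or np.random.default_rng()
    m = m if m is not None else 4 * n  # the attack assumes ~4n eavesdropped samples
    # Sparse binary secret: exactly h of the n coordinates are 1.
    s = np.zeros(n, dtype=np.int64)
    s[rng.choice(n, size=h, replace=False)] = 1
    # Uniform random matrix A and small (rounded Gaussian) errors e.
    A = rng.integers(0, q, size=(m, n), dtype=np.int64)
    e = np.rint(rng.normal(0.0, sigma, size=m)).astype(np.int64)
    b = (A @ s + e) % q
    return A, b, s
```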
We present PICANTE, an enhanced machine learning attack on LWE with sparse
binary secrets, which recovers secrets in much larger dimensions (up to
$n=350$) and with larger Hamming weights (roughly $n/10$, and up to $h=60$ for
$n=350$). We achieve this dramatic improvement via a novel preprocessing step,
which allows us to generate training data from a linear number of eavesdropped
LWE samples ($4n$) and changes the distribution of the data to improve
transformer training. We also improve the secret recovery methods of SALSA and
introduce a novel cross-attention recovery mechanism allowing us to read off
the secret directly from the trained models. While PICANTE does not threaten
NIST's proposed LWE standards, it demonstrates significant improvement over
SALSA and could scale further, highlighting the need for future investigation
into machine learning attacks on LWE with sparse binary secrets.
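The abstract does not spell out the preprocessing step, but the algebraic fact that makes "training data from $4n$ eavesdropped samples" possible is that any modular integer combination of LWE samples is again an LWE sample for the same secret, with a correspondingly combined error. A minimal sketch of that closure property (plain subset sums only; PICANTE's actual preprocessing additionally reshapes the data distribution):

```python
import numpy as np

def subset_sum_samples(A, b, q, num_new, subset_size=3, rng=None):
    """Derive new LWE samples as subset sums of eavesdropped ones.

    If b_i = <a_i, s> + e_i (mod q), then summing any subset gives
    b' = <a', s> + e' (mod q) with a' = sum a_i and e' = sum e_i,
    i.e. a valid sample for the same secret (with a larger error).
    Sketch of the closure property, not PICANTE's preprocessing.
    """
    rng = rng or np.random.default_rng()
    idx = rng.integers(0, A.shape[0], size=(num_new, subset_size))
    A_new = A[idx].sum(axis=1) % q
    b_new = b[idx].sum(axis=1) % q
    return A_new, b_new
```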
Related papers
- HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models [92.85175340702125]
We distill a large teacher safety guard model into a smaller one using a labeled dataset of instruction-response pairs with binary harmfulness labels.
We propose HarmAug, a simple yet effective data augmentation method that involves jailbreaking an LLM and prompting it to generate harmful instructions.
Our HarmAug achieves an F1 score comparable to larger models with over 7 billion parameters, and even outperforms them in AUPRC, while operating at less than 25% of their computational cost.
arXiv Detail & Related papers (2024-10-02T13:12:13Z)
- HexaCoder: Secure Code Generation via Oracle-Guided Synthetic Training Data [60.75578581719921]
Large language models (LLMs) have shown great potential for automatic code generation.
Recent studies highlight that much LLM-generated code contains serious security vulnerabilities.
We introduce HexaCoder, a novel approach to enhance the ability of LLMs to generate secure code.
arXiv Detail & Related papers (2024-09-10T12:01:43Z) - Benchmarking Attacks on Learning with Errors [9.031051362571436]
Lattice cryptography schemes based on the learning with errors (LWE) hardness assumption have been standardized by NIST for use as post-quantum cryptosystems.
We provide the first benchmarks for LWE secret recovery on standardized parameters, for small and low-weight (sparse) secrets.
We extend the SALSA and Cool & Cruel attacks in significant ways, and implement and scale up MitM attacks for the first time.
arXiv Detail & Related papers (2024-08-01T19:21:20Z)
- MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series [86.31735321970481]
We open-source MAP-Neo, a bilingual language model with 7B parameters trained from scratch on 4.5T high-quality tokens.
Our MAP-Neo is the first fully open-sourced bilingual LLM with performance comparable to existing state-of-the-art LLMs.
arXiv Detail & Related papers (2024-05-29T17:57:16Z)
- The cool and the cruel: separating hard parts of LWE secrets [11.000531626756853]
Known attacks on sparse binary LWE secrets include the sparse dual attack and the hybrid sparse dual meet-in-the-middle attack.
In this paper, we provide a new statistical attack with low memory requirement.
arXiv Detail & Related papers (2024-03-15T14:16:21Z)
- BiLLM: Pushing the Limit of Post-Training Quantization for LLMs [53.31402059062365]
BiLLM is a groundbreaking 1-bit post-training quantization scheme tailored for pretrained large language models.
It is the first to achieve high-accuracy inference (e.g., 8.41 perplexity on LLaMA2-70B) with only 1.08-bit weights across various LLM families.
arXiv Detail & Related papers (2024-02-06T09:26:34Z)
- Salsa Fresca: Angular Embeddings and Pre-Training for ML Attacks on Learning With Errors [10.800552110718714]
Learning with Errors (LWE) is a hard math problem underlying post-quantum cryptography systems for key exchange and digital signatures.
Prior work proposed new machine learning (ML)-based attacks on LWE problems with small, sparse secrets, but these attacks require millions of LWE samples to train on and take days to recover secrets.
We propose three key methods -- better preprocessing, angular embeddings and model pre-training -- to improve these attacks.
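A plausible reading of the "angular embeddings" named above is that residues mod $q$ are mapped to points on the unit circle, so values near $0$ and near $q$ embed close together; the sketch below is an assumption for illustration, not a confirmed description of that paper's method.

```python
import numpy as np

def angular_embed(x, q):
    """Map integers mod q to 2D points on the unit circle; wrap-around
    neighbors (e.g. q-1 and 0) land close together. Hypothetical sketch."""
    theta = 2.0 * np.pi * (np.asarray(x) % q) / q
    return np.stack([np.cos(theta), np.sin(theta)], axis=-1)
```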
arXiv Detail & Related papers (2024-02-02T00:48:27Z)
- Bypassing the Safety Training of Open-Source LLMs with Priming Attacks [3.8023902618391783]
In this paper, we investigate the fragility of SOTA open-source LLMs under simple, optimization-free attacks.
Our proposed attack improves the Attack Success Rate on Harmful Behaviors, as measured by Llama Guard, by up to $3.3\times$ compared to baselines.
arXiv Detail & Related papers (2023-12-19T16:47:12Z)
- Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLMs [67.38165028487242]
We introduce Dynamic Sparse No Training (DSnoT), a training-free approach to fine-tune sparse large language models (LLMs).
Inspired by the Dynamic Sparse Training, DSnoT minimizes the reconstruction error between the dense and sparse LLMs.
Our paper offers fresh insights into how to fine-tune sparse LLMs in an efficient, training-free manner and opens new avenues for scaling the great potential of sparsity to LLMs.
arXiv Detail & Related papers (2023-10-13T07:38:52Z)
- SmoothLLM: Defending Large Language Models Against Jailbreaking Attacks [99.23352758320945]
We propose SmoothLLM, the first algorithm designed to mitigate jailbreaking attacks on large language models (LLMs).
Based on our finding that adversarially-generated prompts are brittle to character-level changes, our defense first randomly perturbs multiple copies of a given input prompt, and then aggregates the corresponding predictions to detect adversarial inputs.
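A minimal sketch of the defense as described above, assuming hypothetical `query_model` and `is_harmful` callables (neither is from the paper): perturb several copies of the prompt at the character level, then aggregate the per-copy verdicts.

```python
import random
import string

def smoothllm_detect(prompt, query_model, is_harmful, n_copies=8, swap_frac=0.1, seed=0):
    """Randomly perturb n_copies of the prompt, query the model on each,
    and majority-vote over per-copy harmfulness verdicts. Sketch only;
    query_model/is_harmful are hypothetical stand-ins."""
    rng = random.Random(seed)
    votes = 0
    for _ in range(n_copies):
        chars = list(prompt)
        k = max(1, int(swap_frac * len(chars)))  # number of characters to replace
        for i in rng.sample(range(len(chars)), k):
            chars[i] = rng.choice(string.printable)
        votes += bool(is_harmful(query_model("".join(chars))))
    # Brittle jailbreak prompts tend to fail on most perturbed copies,
    # so the aggregate verdict stays safe unless harm persists broadly.
    return votes > n_copies / 2
```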
arXiv Detail & Related papers (2023-10-05T17:01:53Z)
- SALSA: Attacking Lattice Cryptography with Transformers [6.229340901386596]
We propose SALSA: a machine learning attack on LWE-based cryptographic schemes.
SALSA can fully recover secrets for small-to-mid size LWE instances with sparse binary secrets, and may scale to attack real-world LWE-based cryptosystems.
arXiv Detail & Related papers (2022-07-11T11:35:43Z)