Traveling Salesman-Based Token Ordering Improves Stability in Homomorphically Encrypted Language Models
- URL: http://arxiv.org/abs/2510.12343v1
- Date: Tue, 14 Oct 2025 09:56:50 GMT
- Title: Traveling Salesman-Based Token Ordering Improves Stability in Homomorphically Encrypted Language Models
- Authors: Donghwan Rho, Sieun Seo, Hyewon Sung, Chohong Min, Ernest K. Ryu,
- Abstract summary: Homomorphic encryption (HE) provides a principled solution by enabling computation directly on encrypted data.<n>The challenge of text generation, particularly next-token prediction, has received limited attention.<n>We propose a TSP-based token reordering strategy to address the difficulties of encrypted text generation.
- Score: 16.73757071734074
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As users increasingly interact with large language models (LLMs) using private information, secure and encrypted communication becomes essential. Homomorphic encryption (HE) provides a principled solution by enabling computation directly on encrypted data. Although prior work has explored aspects of running LLMs under HE, the challenge of text generation, particularly next-token prediction, has received limited attention and remains a key obstacle to practical encrypted interaction. In this work, we propose a TSP-based token reordering strategy to address the difficulties of encrypted text generation, together with a post-processing step that further reduces approximation error. Theoretical analysis and experimental results demonstrate that our method prevents collapse, improves coherence in generated text, and preserves data privacy throughout. Overall, our contributions advance the feasibility of practical and privacy-preserving LLM inference.
Related papers
- Continual Pretraining on Encrypted Synthetic Data for Privacy-Preserving LLMs [16.21971760443489]
We propose an entity-based framework that synthesizes encrypted training data to protect personally identifiable information (PII)<n>Our approach constructs a weighted entity graph to guide data synthesis and applies deterministic encryption to PII entities.<n>Our results on limited-scale datasets demonstrate that our pretrained models outperform base models and ensure PII security.
arXiv Detail & Related papers (2026-01-09T08:44:07Z) - Zero-Shot Privacy-Aware Text Rewriting via Iterative Tree Search [60.197239728279534]
Large language models (LLMs) in cloud-based services have raised significant privacy concerns.<n>Existing text anonymization and de-identification techniques, such as rule-based redaction and scrubbing, often struggle to balance privacy preservation with text naturalness and utility.<n>We propose a zero-shot, tree-search-based iterative sentence rewriting algorithm that systematically obfuscates or deletes private information while preserving coherence, relevance, and naturalness.
arXiv Detail & Related papers (2025-09-25T07:23:52Z) - Privacy-Aware In-Context Learning for Large Language Models [12.605629953620495]
Large language models (LLMs) raise privacy concerns due to potential exposure of sensitive information.<n>We introduce a novel private prediction framework for generating high-quality synthetic text with strong privacy guarantees.
arXiv Detail & Related papers (2025-09-17T01:50:32Z) - The Double-edged Sword of LLM-based Data Reconstruction: Understanding and Mitigating Contextual Vulnerability in Word-level Differential Privacy Text Sanitization [53.51921540246166]
We show that Language Large Models (LLMs) can exploit the contextual vulnerability of DP-sanitized texts.<n>Experiments uncover a double-edged sword effect of LLM reconstructions on privacy and utility.<n>We propose recommendations for using data reconstruction as a post-processing step.
arXiv Detail & Related papers (2025-08-26T12:22:45Z) - A Privacy-Centric Approach: Scalable and Secure Federated Learning Enabled by Hybrid Homomorphic Encryption [2.611778281107039]
Federated Learning (FL) enables collaborative model training without sharing raw data, making it a promising approach for privacy-sensitive domains.<n>Despite its potential, FL faces significant challenges, particularly in terms of communication overhead and data privacy.<n>In this work, we explore how Hybrid Homomorphic Encryption (HHE), a cryptographic protocol that combines symmetric encryption with HE, can be effectively integrated with FL to address both communication and privacy challenges.
arXiv Detail & Related papers (2025-07-20T07:46:53Z) - Efficient Implementation of Reinforcement Learning over Homomorphic Encryption [0.7673339435080445]
We classify control policy synthesis into model-based, simulator-driven, and data-driven approaches.<n>We examine their implementation over fully homomorphic encryption (FHE) for privacy enhancements.<n>Our work suggests the potential for secure and efficient cloud-based reinforcement learning.
arXiv Detail & Related papers (2025-04-12T20:34:26Z) - Encrypted Large Model Inference: The Equivariant Encryption Paradigm [18.547945807599543]
We introduce Equivariant Encryption (EE), a novel paradigm designed to enable secure, "blind" inference on encrypted data with near zero performance overhead.<n>Unlike fully homomorphic approaches that encrypt the entire computational graph, EE selectively obfuscates critical internal representations within neural network layers.<n>EE maintains high fidelity and throughput, effectively bridging the gap between robust data confidentiality and the stringent efficiency requirements of modern, large scale model inference.
arXiv Detail & Related papers (2025-02-03T03:05:20Z) - Encryption-Friendly LLM Architecture [11.386436468650016]
Homomorphic encryption (HE) is a cryptographic protocol supporting arithmetic computations in encrypted states.<n>We propose a modified HE-friendly transformer architecture with an emphasis on inference following personalized (private) fine-tuning.
arXiv Detail & Related papers (2024-10-03T13:48:35Z) - Con-ReCall: Detecting Pre-training Data in LLMs via Contrastive Decoding [118.75567341513897]
Existing methods typically analyze target text in isolation or solely with non-member contexts.<n>We propose Con-ReCall, a novel approach that leverages the asymmetric distributional shifts induced by member and non-member contexts.
arXiv Detail & Related papers (2024-09-05T09:10:38Z) - InferDPT: Privacy-Preserving Inference for Black-box Large Language Model [66.07752875835506]
InferDPT is the first practical framework for the privacy-preserving Inference of black-box LLMs.<n>RANTEXT is a novel differential privacy mechanism integrated into the perturbation module of InferDPT.
arXiv Detail & Related papers (2023-10-18T18:00:11Z) - PrivacyMind: Large Language Models Can Be Contextual Privacy Protection Learners [81.571305826793]
We introduce Contextual Privacy Protection Language Models (PrivacyMind)
Our work offers a theoretical analysis for model design and benchmarks various techniques.
In particular, instruction tuning with both positive and negative examples stands out as a promising method.
arXiv Detail & Related papers (2023-10-03T22:37:01Z) - Reinforcement Learning on Encrypted Data [58.39270571778521]
We present a preliminary, experimental study of how a DQN agent trained on encrypted states performs in environments with discrete and continuous state spaces.
Our results highlight that the agent is still capable of learning in small state spaces even in presence of non-deterministic encryption, but performance collapses in more complex environments.
arXiv Detail & Related papers (2021-09-16T21:59:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.