CryptoTensors: A Light-Weight Large Language Model File Format for Highly-Secure Model Distribution
- URL: http://arxiv.org/abs/2512.04580v2
- Date: Mon, 08 Dec 2025 08:00:19 GMT
- Title: CryptoTensors: A Light-Weight Large Language Model File Format for Highly-Secure Model Distribution
- Authors: Huifeng Zhu, Shijie Li, Qinfeng Li, Yier Jin
- Abstract summary: We introduce CryptoTensors, a secure and format-compatible file structure for confidential LLM distribution. Built as an extension to the widely adopted Safetensors format, CryptoTensors incorporates tensor-level encryption and embedded access control policies. Our results highlight CryptoTensors as a light-weight, efficient, and developer-friendly solution for safeguarding LLM weights in real-world and widespread deployments.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: To enhance the performance of large language models (LLMs) in various domain-specific applications, sensitive data such as healthcare, law, and finance are being used to privately customize or fine-tune these models. Such privately adapted LLMs are regarded as either personal privacy assets or corporate intellectual property. Therefore, protecting model weights and maintaining strict confidentiality during deployment and distribution have become critically important. However, existing model formats and deployment frameworks provide little to no built-in support for confidentiality, access control, or secure integration with trusted hardware. Current methods for securing model deployment either rely on computationally expensive cryptographic techniques or tightly controlled private infrastructure. Although these approaches can be effective in specific scenarios, they are difficult and costly for widespread deployment. In this paper, we introduce CryptoTensors, a secure and format-compatible file structure for confidential LLM distribution. Built as an extension to the widely adopted Safetensors format, CryptoTensors incorporates tensor-level encryption and embedded access control policies, while preserving critical features such as lazy loading and partial deserialization. It enables transparent decryption and automated key management, supporting flexible licensing and secure model execution with minimal overhead. We implement a proof-of-concept library, benchmark its performance across serialization and runtime scenarios, and validate its compatibility with existing inference frameworks, including Hugging Face Transformers and vLLM. Our results highlight CryptoTensors as a light-weight, efficient, and developer-friendly solution for safeguarding LLM weights in real-world and widespread deployments.
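The layout the abstract describes (a Safetensors-style file with a JSON header of byte offsets, extended with per-tensor encryption metadata so that individual tensors can be decrypted lazily) can be sketched in Python. This is a minimal illustration, not the paper's actual API: the function names `save_encrypted` and `load_tensor`, the header fields, and the SHA-256 keystream XOR placeholder cipher are all assumptions; a real implementation would use an authenticated cipher such as AES-GCM and embed the access-control policy alongside the encryption metadata.

```python
import hashlib
import io
import json
import struct

def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # Placeholder cipher: SHA-256 counter-mode keystream XORed into the data.
    # XOR is its own inverse, so the same function encrypts and decrypts.
    # A production format would use an AEAD (e.g., AES-GCM), not this.
    out = bytearray()
    counter = 0
    while len(out) < len(data):
        block = hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        out.extend(block)
        counter += 1
    return bytes(a ^ b for a, b in zip(data, out))

def save_encrypted(tensors: dict, key: bytes) -> bytes:
    """Serialize {name: raw bytes} in a safetensors-like layout:
    8-byte little-endian header length, JSON header with per-tensor byte
    offsets and encryption metadata, then the encrypted tensor payload."""
    payload = io.BytesIO()
    header = {}
    for name, raw in tensors.items():
        # Deterministic per-tensor nonce for the demo only; real nonces
        # must be unique and random per encryption.
        nonce = hashlib.sha256(name.encode()).digest()[:12]
        start = payload.tell()
        payload.write(_keystream_xor(key, nonce, raw))
        header[name] = {
            "data_offsets": [start, payload.tell()],
            "enc": {"alg": "demo-xor", "nonce": nonce.hex()},
        }
    header_json = json.dumps(header).encode()
    return struct.pack("<Q", len(header_json)) + header_json + payload.getvalue()

def load_tensor(blob: bytes, name: str, key: bytes) -> bytes:
    """Lazy-style load: parse only the header, then slice out and decrypt
    the single requested tensor, leaving the rest of the file untouched."""
    (header_len,) = struct.unpack("<Q", blob[:8])
    header = json.loads(blob[8:8 + header_len])
    meta = header[name]
    start, end = meta["data_offsets"]
    chunk = blob[8 + header_len + start:8 + header_len + end]
    return _keystream_xor(key, bytes.fromhex(meta["enc"]["nonce"]), chunk)
```

Because decryption happens per tensor keyed by the header's offsets, this sketch preserves the partial-deserialization property the paper emphasizes: a reader never needs to decrypt (or even touch) tensors it does not request.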
Related papers
- A Privacy-Preserving Cloud Architecture for Distributed Machine Learning at Scale [0.0]
This work introduces a cloud-native privacy-preserving architecture that integrates federated learning, differential privacy, zero-knowledge compliance proofs, and adaptive governance powered by reinforcement learning. The framework supports secure model training and inference without centralizing sensitive data, while enabling cryptographically verifiable policy enforcement across institutions and cloud platforms.
arXiv Detail & Related papers (2025-12-11T06:46:46Z)
- Patching LLM Like Software: A Lightweight Method for Improving Safety Policy in Large Language Models [63.54707418559388]
We propose patching large language models (LLMs) like software versions. Our method enables rapid remediation by prepending a compact, learnable prefix to an existing model.
arXiv Detail & Related papers (2025-11-11T17:25:44Z)
- Reimagining Safety Alignment with An Image [49.33281424100804]
Large language models (LLMs) excel in diverse applications but face dual challenges: generating harmful content under jailbreak attacks and over-refusal of benign queries. We propose Magic Image, an optimization-driven visual prompt framework that enhances security while reducing over-refusal.
arXiv Detail & Related papers (2025-11-01T11:27:07Z)
- Design and Optimization of Cloud Native Homomorphic Encryption Workflows for Privacy-Preserving ML Inference [0.0]
Homomorphic Encryption (HE) has emerged as a compelling technique that enables cryptographic computation on encrypted data. However, the integration of HE within large-scale cloud-native pipelines remains constrained by high computational overhead, orchestration complexity, and model compatibility issues. This paper presents a systematic framework for the design and optimization of cloud-native homomorphic encryption workflows that support privacy-preserving ML inference.
arXiv Detail & Related papers (2025-10-28T15:13:32Z)
- OpenGuardrails: A Configurable, Unified, and Scalable Guardrails Platform for Large Language Models [3.3252656373741547]
We present OpenGuardrails, the first fully open-source platform that unifies large-model-based safety detection, manipulation defense, and deployable guardrail infrastructure. OpenGuardrails protects against three major classes of risks: (1) content-safety violations such as harmful or explicit text generation, (2) model-manipulation attacks including prompt injection, jailbreaks, and code-interpreter abuse, and (3) data leakage involving sensitive or private information.
arXiv Detail & Related papers (2025-10-22T02:02:27Z)
- Speculative Safety-Aware Decoding [46.78651034593231]
We introduce Speculative Safety-Aware Decoding (SSD), a lightweight decoding-time approach that equips LLMs with the desired safety property while accelerating inference. SSD integrates speculative sampling during decoding and leverages the match ratio between the small and composite models to quantify jailbreak risks. Experimental results show that SSD successfully equips the large model with the desired safety property, and also allows the model to remain helpful to benign queries.
arXiv Detail & Related papers (2025-08-25T07:30:10Z)
- Secure Tug-of-War (SecTOW): Iterative Defense-Attack Training with Reinforcement Learning for Multimodal Model Security [63.41350337821108]
We propose Secure Tug-of-War (SecTOW) to enhance the security of multimodal large language models (MLLMs). SecTOW consists of two modules: a defender and an auxiliary attacker, both trained iteratively using reinforcement learning (GRPO). We show that SecTOW significantly improves security while preserving general performance.
arXiv Detail & Related papers (2025-07-29T17:39:48Z)
- FedShield-LLM: A Secure and Scalable Federated Fine-Tuned Large Language Model [0.48342038441006796]
Federated Learning (FL) offers a decentralized framework for training and fine-tuning Large Language Models (LLMs). FL addresses privacy and security concerns while navigating challenges associated with the substantial computational demands of LLMs. We propose a novel method, FedShield-LLM, that uses pruning with Fully Homomorphic Encryption (FHE) for Low-Rank Adaptation (LoRA) parameters.
arXiv Detail & Related papers (2025-06-06T00:05:05Z)
- PWC-MoE: Privacy-Aware Wireless Collaborative Mixture of Experts [59.5243730853157]
Large language models (LLMs) hosted on cloud servers alleviate the computational and storage burdens on local devices but raise privacy concerns. Small language models (SLMs) running locally enhance privacy but suffer from limited performance on complex tasks. We propose a privacy-aware wireless collaborative mixture of experts (PWC-MoE) framework to balance computational cost, performance, and privacy protection under bandwidth constraints.
arXiv Detail & Related papers (2025-05-13T16:27:07Z)
- Encrypted Large Model Inference: The Equivariant Encryption Paradigm [18.547945807599543]
We introduce Equivariant Encryption (EE), a novel paradigm designed to enable secure, "blind" inference on encrypted data with near-zero performance overhead. Unlike fully homomorphic approaches that encrypt the entire computational graph, EE selectively obfuscates critical internal representations within neural network layers. EE maintains high fidelity and throughput, effectively bridging the gap between robust data confidentiality and the stringent efficiency requirements of modern, large-scale model inference.
arXiv Detail & Related papers (2025-02-03T03:05:20Z)
- A Middle Path for On-Premises LLM Deployment: Preserving Privacy Without Sacrificing Model Confidentiality [20.646221081945523]
Privacy-sensitive users require deploying large language models (LLMs) within their own infrastructure (on-premises) to safeguard private data and enable customization. Previous research on small models has explored securing only the output layer within hardware-secured devices to balance model confidentiality and customization. We propose SOLID, a novel deployment framework that secures a few bottom layers in a secure environment and introduces an efficient metric to optimize the trade-off.
arXiv Detail & Related papers (2024-10-15T02:00:36Z)
- Fine-Tuning and Deploying Large Language Models Over Edges: Issues and Approaches [64.42735183056062]
Large language models (LLMs) have evolved from specialized deep models to versatile foundation models. LLMs require fine-tuning on local datasets and substantial memory for deployment over the network edges. LLMs have been expanded beyond text generation to create images, audio, video, and multi-modal content. Model fine-tuning and model-compression techniques have been developed to support the sustainable growth of LLMs.
arXiv Detail & Related papers (2024-08-20T09:42:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences of their use.