Toward provably private analytics and insights into GenAI use
- URL: http://arxiv.org/abs/2510.21684v1
- Date: Fri, 24 Oct 2025 17:40:12 GMT
- Title: Toward provably private analytics and insights into GenAI use
- Authors: Albert Cheu, Artem Lagzdin, Brett McLarnon, Daniel Ramage, Katharine Daly, Marco Gruteser, Peter Kairouz, Rakshita Tandon, Stanislav Chiknavaryan, Timon Van Overveldt, Zoe Gong
- Abstract summary: We present a next-generation federated analytics system based on technologies like AMD SEV-SNP and Intel TDX. In our system, devices encrypt and upload data, tagging it with a limited set of allowable server-side processing steps. An open source, TEE-hosted key management service guarantees that the data is only accessible to those steps.
- Score: 12.545209220189113
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large-scale systems that compute analytics over a fleet of devices must achieve high privacy and security standards while also meeting data quality, usability, and resource efficiency expectations. We present a next-generation federated analytics system that uses Trusted Execution Environments (TEEs) based on technologies like AMD SEV-SNP and Intel TDX to provide verifiable privacy guarantees for all server-side processing. In our system, devices encrypt and upload data, tagging it with a limited set of allowable server-side processing steps. An open source, TEE-hosted key management service guarantees that the data is accessible only to those steps, which are themselves protected by TEE confidentiality and integrity assurance guarantees. The system is designed for flexible workloads, including processing unstructured data with LLMs (for structured summarization) before aggregation into differentially private insights (with automatic parameter tuning). The transparency properties of our system allow any external party to verify that all raw and derived data is processed in TEEs, protecting it from inspection by the system operator, and that differential privacy is applied to all released results. This system has been successfully deployed in production, providing helpful insights into real-world GenAI experiences.
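The flow described in the abstract (uploads tagged with allowed processing steps, a key service that releases data only to those steps, and differentially private release) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names `Upload`, `step_digest`, `release_payload`, and `dp_histogram` are hypothetical, the payload is plaintext standing in for ciphertext, a SHA-256 digest stands in for TEE attestation, and the Laplace mechanism over a fixed category list is a swapped-in textbook technique, not necessarily the mechanism the paper uses.

```python
import hashlib
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Upload:
    payload: str              # stands in for an encrypted device upload
    allowed_steps: frozenset  # digests of the processing steps the device permits


def step_digest(step_name: str) -> str:
    """Hypothetical stand-in for an attested measurement of a processing step."""
    return hashlib.sha256(step_name.encode()).hexdigest()


def release_payload(upload: Upload, attested_digest: str) -> str:
    """Hypothetical key-service check: release data only to an allowed step."""
    if attested_digest not in upload.allowed_steps:
        raise PermissionError("step is not in the upload's allowed set")
    return upload.payload


def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two i.i.d. exponentials with rate 1/scale
    # is Laplace-distributed with location 0 and the given scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_histogram(labels, categories, epsilon: float, rng: random.Random) -> dict:
    """Noisy counts over a fixed, public category list.

    Assumes each device contributes at most one label, so adding or removing
    one device changes the count vector by at most 1 in L1 norm. Laplace noise
    with scale 1/epsilon then gives epsilon-differential privacy.
    """
    counts = {c: 0 for c in categories}
    for label in labels:
        if label in counts:  # labels outside the public list are dropped
            counts[label] += 1
    return {c: n + laplace_noise(1.0 / epsilon, rng) for c, n in counts.items()}


if __name__ == "__main__":
    rng = random.Random(0)
    step = step_digest("summarize_then_aggregate_v1")
    uploads = [Upload(t, frozenset({step})) for t in ["coding", "coding", "travel"]]
    labels = [release_payload(u, step) for u in uploads]
    print(dp_histogram(labels, ["coding", "travel", "health"], epsilon=1.0, rng=rng))
```

Fixing the category list in advance matters in this sketch: releasing only the categories that actually occur would itself reveal information about individual uploads.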
Related papers
- Reliable LLM-Based Edge-Cloud-Expert Cascades for Telecom Knowledge Systems [54.916243942641444]
Large language models (LLMs) are emerging as key enablers of automation in domains such as telecommunications. We study an edge-cloud-expert cascaded LLM-based knowledge system that supports decision-making through a question-and-answer pipeline.
arXiv Detail & Related papers (2025-12-23T03:10:09Z)
- Adversary-Aware Private Inference over Wireless Channels [51.93574339176914]
AI-based sensing at wireless edge devices has the potential to significantly enhance Artificial Intelligence (AI) applications. As sensitive personal data can be reconstructed by an adversary, transformations of the features are required to reduce the risk of privacy violations. We propose a novel framework for privacy-preserving AI-based sensing, where devices apply transformations of extracted features before transmission to a model server.
arXiv Detail & Related papers (2025-10-23T13:02:14Z)
- Blockchain Powered Edge Intelligence for U-Healthcare in Privacy Critical and Time Sensitive Environment [0.559239450391449]
We propose an autonomous computing model for privacy-critical and time-sensitive health applications. The system supports continuous monitoring, real-time alert notifications, disease detection, and robust data processing and aggregation. A secure access scheme is defined to manage both off-chain and on-chain data sharing and storage.
arXiv Detail & Related papers (2025-05-31T06:58:52Z)
- Zero-Trust Foundation Models: A New Paradigm for Secure and Collaborative Artificial Intelligence for Internet of Things [61.43014629640404]
Zero-Trust Foundation Models (ZTFMs) embed zero-trust security principles into the lifecycle of foundation models (FMs) for Internet of Things (IoT) systems. ZTFMs can enable secure, privacy-preserving AI across distributed, heterogeneous, and potentially adversarial IoT environments.
arXiv Detail & Related papers (2025-05-26T06:44:31Z)
- Privacy-Preserving Federated Embedding Learning for Localized Retrieval-Augmented Generation [60.81109086640437]
We propose a novel framework called Federated Retrieval-Augmented Generation (FedE4RAG). FedE4RAG facilitates collaborative training of client-side RAG retrieval models. We apply homomorphic encryption within federated learning to safeguard model parameters.
arXiv Detail & Related papers (2025-04-27T04:26:02Z)
- Building a Privacy Web with SPIDEr -- Secure Pipeline for Information De-Identification with End-to-End Encryption [3.8909411486426033]
SPIDEr is an end-to-end encrypted data de-identification pipeline. It supports suppression, pseudonymisation, generalisation, and aggregation. We present our design of the control flows for end-to-end secure execution of de-identification operations within a TEE.
arXiv Detail & Related papers (2024-12-12T12:24:12Z)
- Balancing Confidentiality and Transparency for Blockchain-based Process-Aware Information Systems [43.253676241213626]
We propose an architecture for blockchain-based PAISs to preserve confidentiality and transparency. Smart contracts enact, enforce and store public interactions, while attribute-based encryption techniques are adopted to specify access grants to confidential information. We assess the security of our solution through a systematic threat model analysis and evaluate its practical feasibility.
arXiv Detail & Related papers (2024-12-07T20:18:36Z)
- PAPAYA Federated Analytics Stack: Engineering Privacy, Scalability and Practicality [5.276674920508729]
Cross-device Federated Analytics (FA) is a distributed computation paradigm designed to answer analytics queries about and derive insights from data held locally on users' devices. Despite FA's broad relevance, the applicability of existing FA systems is limited by compromised accuracy; lack of flexibility for data analytics; and an inability to scale effectively. We describe our approach to combine privacy, scalability, and practicality to build and deploy a system that overcomes these limitations.
arXiv Detail & Related papers (2024-12-03T10:03:12Z)
- Privacy-Preserving Verifiable Neural Network Inference Service [4.131956503199438]
We develop a privacy-preserving and verifiable CNN inference scheme that preserves privacy for client data samples.
vPIN achieves high efficiency in terms of proof size, while providing client data privacy guarantees and provable verifiability.
arXiv Detail & Related papers (2024-11-12T01:09:52Z)
- Confidential Federated Computations [16.415880530250092]
Federated Learning and Analytics (FLA) have seen widespread adoption by technology platforms for processing sensitive on-device data. FLA systems do not necessarily require anonymization mechanisms like differential privacy (DP). This paper introduces a novel system architecture that leverages trusted execution environments (TEEs) and open-sourcing to ensure confidentiality of server-side computations.
arXiv Detail & Related papers (2024-04-16T17:47:27Z)
- HasTEE+ : Confidential Cloud Computing and Analytics with Haskell [50.994023665559496]
Confidential computing enables the protection of confidential code and data in a co-tenanted cloud deployment using specialized hardware isolation units called Trusted Execution Environments (TEEs).
TEEs offer low-level C/C++-based toolchains that are susceptible to inherent memory safety vulnerabilities and lack language constructs to monitor explicit and implicit information-flow leaks.
We address the above with HasTEE+, a domain-specific language (DSL) embedded in Haskell that enables programming TEEs in a high-level language with strong type-safety.
arXiv Detail & Related papers (2024-01-17T00:56:23Z)
- Trustworthy AI Inference Systems: An Industry Research View [58.000323504158054]
We provide an industry research view for approaching the design, deployment, and operation of trustworthy AI inference systems.
We highlight opportunities and challenges in AI systems using trusted execution environments.
We outline areas of further development that require the global collective attention of industry, academia, and government researchers.
arXiv Detail & Related papers (2020-08-10T23:05:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.