ObCLIP: Oblivious CLoud-Device Hybrid Image Generation with Privacy Preservation
- URL: http://arxiv.org/abs/2510.04153v1
- Date: Sun, 05 Oct 2025 11:09:10 GMT
- Title: ObCLIP: Oblivious CLoud-Device Hybrid Image Generation with Privacy Preservation
- Authors: Haoqi Wu, Wei Dai, Ming Xu, Li Wang, Qiang Yan
- Abstract summary: ObCLIP is a plug-and-play safeguard for oblivious cloud-device hybrid generation. It provides rigorous privacy and comparable utility to cloud models with slightly increased server cost.
- Score: 9.081441952478306
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Diffusion Models have gained significant popularity due to their remarkable capabilities in image generation, albeit at the cost of intensive computation requirements. Meanwhile, despite their widespread deployment in inference services such as Midjourney, concerns have arisen about the potential leakage of sensitive information in uploaded user prompts. Existing solutions either lack rigorous privacy guarantees or fail to strike an effective balance between utility and efficiency. To bridge this gap, we propose ObCLIP, a plug-and-play safeguard that enables oblivious cloud-device hybrid generation. By oblivious, each input prompt is transformed into a set of semantically similar candidate prompts that differ only in sensitive attributes (e.g., gender, ethnicity). The cloud server processes all candidate prompts without knowing which one is the real one, thus preventing any prompt leakage. To mitigate server cost, only a small portion of denoising steps is performed upon the large cloud model. The intermediate latents are then sent back to the client, which selects the targeted latent and completes the remaining denoising using a small device model. Additionally, we analyze and incorporate several cache-based accelerations that leverage temporal and batch redundancy, effectively reducing computation cost with minimal utility degradation. Extensive experiments across multiple datasets demonstrate that ObCLIP provides rigorous privacy and comparable utility to cloud models with slightly increased server cost.
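The abstract describes a three-stage protocol: the client expands the prompt into attribute-varied candidates, the cloud partially denoises all of them obliviously, and the device finishes denoising only the real candidate. A minimal toy sketch of that flow, where the candidate construction, the step split, and the stand-in "denoising" update are all illustrative assumptions rather than ObCLIP's actual implementation:

```python
import random

def make_candidates(prompt, attribute, variants):
    """Client: expand the sensitive attribute into semantically similar
    candidates, so the cloud cannot tell which one is real."""
    candidates = [prompt.replace(attribute, v) for v in variants]
    random.shuffle(candidates)  # hide the real prompt's position
    return candidates, candidates.index(prompt)

def cloud_partial_denoise(candidates, cloud_steps):
    """Cloud: run only the first few denoising steps of the large model
    on every candidate; it never learns which candidate is genuine."""
    latents = []
    for c in candidates:
        noise = 1.0
        for _ in range(cloud_steps):
            noise *= 0.5  # stand-in for one denoising step
        latents.append((c, noise))
    return latents

def device_finish(latents, real_index, cloud_steps, total_steps):
    """Client: keep only the latent for the real prompt and complete the
    remaining steps with a small on-device model."""
    prompt, noise = latents[real_index]
    for _ in range(total_steps - cloud_steps):
        noise *= 0.5
    return prompt, noise

candidates, idx = make_candidates(
    "portrait of a woman", "woman", ["woman", "man"])
latents = cloud_partial_denoise(candidates, cloud_steps=2)
prompt, noise = device_finish(latents, idx, cloud_steps=2, total_steps=8)
print(prompt)
```

The privacy argument hinges on the cloud doing identical work on every candidate: since the batch is processed uniformly and the selection index never leaves the client, the server's view is independent of which attribute value is real.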
Related papers
- Your Inference Request Will Become a Black Box: Confidential Inference for Cloud-based Large Language Models [39.390624817461905]
Talaria is a confidential inference framework that partitions the Large Language Model pipeline to protect client data. Talaria executes sensitive, weight-independent operations within a client-controlled Confidential Virtual Machine. Talaria can defend against state-of-the-art token inference attacks, reducing token reconstruction accuracy from over 97.5% to an average of 1.34%.
arXiv Detail & Related papers (2026-02-27T06:37:07Z) - A Distributed Framework for Privacy-Enhanced Vision Transformers on the Edge [3.344634520578015]
We propose a distributed, hierarchical offloading framework for Vision Transformers (ViTs). Our approach uses a local trusted edge device, such as a mobile phone or an Nvidia Jetson, as the edge orchestrator. By design, no single external server possesses the complete image, preventing comprehensive data reconstruction.
arXiv Detail & Related papers (2025-12-10T04:37:07Z) - Repulsor: Accelerating Generative Modeling with a Contrastive Memory Bank [65.00301565190824]
Repulsor is a plug-and-play training framework that requires no external encoders. Repulsor achieves a state-of-the-art FID of 2.40 within 400k steps, significantly outperforming comparable methods.
arXiv Detail & Related papers (2025-12-09T14:39:26Z) - Practical and Private Hybrid ML Inference with Fully Homomorphic Encryption [0.34953784594970894]
Safhire is a hybrid inference framework that executes linear layers under encryption on the server. It supports exact activations and significantly reduces computation. It achieves 1.5x to 10.5x lower inference latency than Orion.
arXiv Detail & Related papers (2025-09-01T08:43:46Z) - EC-Diff: Fast and High-Quality Edge-Cloud Collaborative Inference for Diffusion Models [57.059991285047296]
A hybrid edge-cloud collaborative framework was recently proposed to realize fast inference and high-quality generation. Excessive cloud denoising prolongs inference time, while insufficient steps cause semantic ambiguity, leading to inconsistency in edge model output. We propose EC-Diff, which accelerates cloud inference through gradient-based noise estimation. Our method significantly enhances generation quality compared to edge inference, while achieving up to an average $2\times$ speedup in inference compared to cloud inference.
arXiv Detail & Related papers (2025-07-16T07:23:14Z) - Fast and Cost-effective Speculative Edge-Cloud Decoding with Early Exits [11.398891065175686]
Large Language Models (LLMs) enable various applications on edge devices such as smartphones, wearables, and embodied robots. LLMs can be deployed on-device, offering a cost-effective solution with reduced latency and improved privacy. We propose a fast and cost-effective speculative edge-cloud decoding framework with a large target model on the server and a small draft model on the device.
arXiv Detail & Related papers (2025-05-27T14:55:16Z) - Task-Oriented Feature Compression for Multimodal Understanding via Device-Edge Co-Inference [49.77734021302196]
We propose a task-oriented feature compression (TOFC) method for multimodal understanding in a device-edge co-inference framework. To enhance compression efficiency, multiple entropy models are adaptively selected based on the characteristics of the visual features. Results show that TOFC achieves up to 52% reduction in data transmission overhead and 63% reduction in system latency.
arXiv Detail & Related papers (2025-03-17T08:37:22Z) - Towards Resource-Efficient Federated Learning in Industrial IoT for Multivariate Time Series Analysis [50.18156030818883]
Anomalies and missing data constitute a thorny problem in industrial applications.
Deep learning enabled anomaly detection has emerged as a critical direction.
The data collected on edge devices contain privacy-sensitive user information.
arXiv Detail & Related papers (2024-11-06T15:38:31Z) - $\Lambda$-Split: A Privacy-Preserving Split Computing Framework for
Cloud-Powered Generative AI [3.363904632882723]
We introduce $\Lambda$-Split, a split computing framework to facilitate computational offloading.
In $\Lambda$-Split, a generative model, usually a deep neural network (DNN), is partitioned into three sub-models and distributed across the user's local device and a cloud server.
This architecture ensures that only the hidden layer outputs are transmitted, thereby preventing the external transmission of privacy-sensitive raw input and output data.
arXiv Detail & Related papers (2023-10-23T07:44:04Z) - Mobile-Cloud Inference for Collaborative Intelligence [3.04585143845864]
There is an increasing need for faster execution and lower energy consumption for deep learning model inference.
Historically, the models run on mobile devices have been smaller and simpler in comparison to large state-of-the-art research models, which can only run on the cloud.
Cloud-only inference has drawbacks such as increased network bandwidth consumption and higher latency.
There is an alternative approach: shared mobile-cloud inference.
arXiv Detail & Related papers (2023-06-24T14:22:53Z) - Over-the-Air Federated Learning with Privacy Protection via Correlated
Additive Perturbations [57.20885629270732]
We consider privacy aspects of wireless federated learning with Over-the-Air (OtA) transmission of gradient updates from multiple users/agents to an edge server.
Traditional perturbation-based methods provide privacy protection while sacrificing training accuracy.
In this work, we aim at minimizing privacy leakage to the adversary and the degradation of model accuracy at the edge server.
arXiv Detail & Related papers (2022-10-05T13:13:35Z) - DUET: A Tuning-Free Device-Cloud Collaborative Parameters Generation Framework for Efficient Device Model Generalization [66.27399823422665]
Device Model Generalization (DMG) is a practical yet under-investigated research topic for on-device machine learning applications. We propose DUET, an efficient Device-cloUd collaborative parametErs generaTion framework.
arXiv Detail & Related papers (2022-09-12T13:26:26Z) - Shared Mobile-Cloud Inference for Collaborative Intelligence [35.103437828235826]
We present a shared mobile-cloud inference approach for neural model inference.
The strategy can improve inference latency, energy consumption, and network bandwidth usage.
Further performance gain can be achieved by compressing the feature tensor before its transmission.
arXiv Detail & Related papers (2020-02-01T07:12:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.