Related papers: Enabling Low-Cost Secure Computing on Untrusted In-Memory Architectures

Enabling Low-Cost Secure Computing on Untrusted In-Memory Architectures

URL: http://arxiv.org/abs/2501.17292v1
Date: Tue, 28 Jan 2025 20:48:14 GMT
Title: Enabling Low-Cost Secure Computing on Untrusted In-Memory Architectures
Authors: Sahar Ghoflsaz Ghinani, Jingyao Zhang, Elaheh Sadredini,
Abstract summary: Processing-in-Memory (PIM) promises to substantially improve performance by moving processing closer to the data.<n>Integrating PIM modules within a secure computing system raises an interesting challenge: unencrypted data has to move off-chip to the PIM, exposing the data to attackers and breaking assumptions on Trusted Computing Bases (TCBs)<n>This paperleverages multi-party computation (MPC) techniques, specifically arithmetic secret sharing and Yao's garbled circuits, to outsource bandwidth-intensive computation securely to PIM.
Score: 5.565715369147691
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Modern computing systems are limited in performance by the memory bandwidth available to processors, a problem known as the memory wall. Processing-in-Memory (PIM) promises to substantially improve this problem by moving processing closer to the data, improving effective data bandwidth, and leading to superior performance on memory-intensive workloads. However, integrating PIM modules within a secure computing system raises an interesting challenge: unencrypted data has to move off-chip to the PIM, exposing the data to attackers and breaking assumptions on Trusted Computing Bases (TCBs). To tackle this challenge, this paper leverages multi-party computation (MPC) techniques, specifically arithmetic secret sharing and Yao's garbled circuits, to outsource bandwidth-intensive computation securely to PIM. Additionally, we leverage precomputation optimization to prevent the CPU's portion of the MPC from becoming a bottleneck. We evaluate our approach using the UPMEM PIM system over various applications such as Deep Learning Recommendation Model inference and Logistic Regression. Our evaluations demonstrate up to a $14.66\times$ speedup compared to a secure CPU configuration while maintaining data confidentiality and integrity when outsourcing linear and/or nonlinear computation.

Related papers

MEM1: Learning to Synergize Memory and Reasoning for Efficient Long-Horizon Agents [84.62985963113245]
We introduce MEM1, an end-to-end reinforcement learning framework that enables agents to operate with constant memory across long multi-turn tasks.<n>At each turn, MEM1 updates a compact shared internal state that jointly supports memory consolidation and reasoning.<n>We show that MEM1-7B improves performance by 3.5x while reducing memory usage by 3.7x compared to Qwen2.5-14B-Instruct on a 16-objective multi-hop QA task.
arXiv Detail & Related papers (2025-06-18T19:44:46Z)
Evaluating the Potential of In-Memory Processing to Accelerate Homomorphic Encryption [1.5707609236065612]
homomorphic encryption (HE) allows computation without the need for decryption.<n>The high computational and memory overhead associated with the underlying cryptographic operations has hindered the practicality of HE-based solutions.<n> processing in-memory (PIM) presents a promising solution to this problem by bringing computation closer to data, thereby reducing the overhead resulting from processor-memory data movements.
arXiv Detail & Related papers (2024-12-12T10:28:58Z)
Read-ME: Refactorizing LLMs as Router-Decoupled Mixture of Experts with System Co-Design [59.00758127310582]
We propose a novel framework Read-ME that transforms pre-trained dense LLMs into smaller MoE models. Our approach employs activation sparsity to extract experts. Read-ME outperforms other popular open-source dense models of similar scales.
arXiv Detail & Related papers (2024-10-24T19:48:51Z)
PhD Forum: Efficient Privacy-Preserving Processing via Memory-Centric Computing [0.0]
Homomorphic encryption (HE) and secure multi-party computation (SMPC) enhance data security by enabling processing on encrypted data. Existing approaches focus on improving computational overhead using specialized hardware. We propose a framework that uses recently available PIM hardware to achieve efficient privacy-preserving computation.
arXiv Detail & Related papers (2024-09-25T09:37:50Z)
Low-Latency Privacy-Preserving Deep Learning Design via Secure MPC [31.35072624488929]
Secure multi-party computation (MPC) facilitates privacy-preserving computation between multiple parties without leaking private information. This work proposes a low-latency secret-sharing-based MPC design that reduces unnecessary communication rounds during the execution of MPC protocols.
arXiv Detail & Related papers (2024-07-24T07:01:21Z)
TensorTEE: Unifying Heterogeneous TEE Granularity for Efficient Secure Collaborative Tensor Computing [13.983627699836376]
Existing heterogeneous TEE designs are inefficient for collaborative computing due to fine and different memory granularities between CPU and NPU. We propose a unified tensor-granularity heterogeneous TEE for efficient secure collaborative computing. The results show that the TEE improves the performance of Large Language Model (LLM) training workloads by 4.0x compared to existing work.
arXiv Detail & Related papers (2024-07-12T00:35:18Z)
Efficient and accurate neural field reconstruction using resistive memory [52.68088466453264]
Traditional signal reconstruction methods on digital computers face both software and hardware challenges. We propose a systematic approach with software-hardware co-optimizations for signal reconstruction from sparse inputs. This work advances the AI-driven signal restoration technology and paves the way for future efficient and robust medical AI and 3D vision applications.
arXiv Detail & Related papers (2024-04-15T09:33:09Z)
PIM-Opt: Demystifying Distributed Optimization Algorithms on a Real-World Processing-In-Memory System [21.09681871279162]
Modern Machine Learning (ML) training on large-scale datasets is a time-consuming workload. It relies on the optimization algorithm Gradient Descent (SGD) due to its effectiveness, simplicity, and generalization performance. processor-centric architectures suffer from low performance and high energy consumption while executing ML training workloads. Processing-In-Memory (PIM) is a promising solution to alleviate the data movement bottleneck.
arXiv Detail & Related papers (2024-04-10T17:00:04Z)
ProactivePIM: Accelerating Weight-Sharing Embedding Layer with PIM for Scalable Recommendation System [16.2798383044926]
Weight-sharing algorithms have been proposed for size reduction, but they increase memory access. Recent advancements in processing-in-memory (PIM) enhanced the model throughput by exploiting memory parallelism. We propose ProactivePIM, a PIM system for weight-sharing recommendation system acceleration.
arXiv Detail & Related papers (2024-02-06T14:26:22Z)
A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs) MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z)
Differentially Private Deep Q-Learning for Pattern Privacy Preservation in MEC Offloading [76.0572817182483]
attackers may eavesdrop on the offloading decisions to infer the edge server's (ES's) queue information and users' usage patterns. We propose an offloading strategy which jointly minimizes the latency, ES's energy consumption, and task dropping rate, while preserving pattern privacy (PP) We develop a Differential Privacy Deep Q-learning based Offloading (DP-DQO) algorithm to solve this problem while addressing the PP issue by injecting noise into the generated offloading decisions.
arXiv Detail & Related papers (2023-02-09T12:50:18Z)
Distributed Reinforcement Learning for Privacy-Preserving Dynamic Edge Caching [91.50631418179331]
A privacy-preserving distributed deep policy gradient (P2D3PG) is proposed to maximize the cache hit rates of devices in the MEC networks. We convert the distributed optimizations into model-free Markov decision process problems and then introduce a privacy-preserving federated learning method for popularity prediction.
arXiv Detail & Related papers (2021-10-20T02:48:27Z)
JUMBO: Scalable Multi-task Bayesian Optimization using Offline Data [86.8949732640035]
We propose JUMBO, an MBO algorithm that sidesteps limitations by querying additional data. We show that it achieves no-regret under conditions analogous to GP-UCB. Empirically, we demonstrate significant performance improvements over existing approaches on two real-world optimization problems.
arXiv Detail & Related papers (2021-06-02T05:03:38Z)

This list is automatically generated from the titles and abstracts of the papers in this site.