MC3: Memory Contention based Covert Channel Communication on Shared DRAM System-on-Chips
- URL: http://arxiv.org/abs/2412.05228v2
- Date: Sat, 08 Feb 2025 21:44:24 GMT
- Title: MC3: Memory Contention based Covert Channel Communication on Shared DRAM System-on-Chips
- Authors: Ismet Dagli, James Crea, Soner Seckiner, Yuanchao Xu, Selçuk Köse, Mehmet E. Belviranli
- Abstract summary: We introduce a new memory-contention based covert communication attack, MC3.
It achieves high throughput communication between applications running on CPU and GPU without the need for an LLC or elevated access to the system.
We demonstrate the utility of MC3 on NVIDIA Orin AGX, Orin NX, and Orin Nano up to a transmit rate of 6.4 kbps with less than 1% error rate.
- Abstract: Shared-memory system-on-chips (SM-SoC) are ubiquitously employed by a wide range of mobile computing platforms, including edge/IoT devices, autonomous systems and smartphones. In SM-SoCs, system-wide shared physical memory enables a convenient and financially feasible way to make data accessible by dozens of processing units (PUs), such as CPU cores and domain-specific accelerators. In this study, we investigate vulnerabilities that stem from the shared use of physical memory in such systems. Due to the diverse computational characteristics of the PUs they embed, SM-SoCs often do not employ a shared last level cache (LLC). While the literature proposes covert channel attacks for shared memory systems, high-throughput communication is currently possible by either relying on an LLC or privileged/physical access to the shared memory subsystem. In this study, we introduce a new memory-contention based covert communication attack, MC3, which specifically targets the shared system memory in mobile SoCs. Different from existing attacks, our approach achieves high throughput communication between applications running on CPU and GPU without the need for an LLC or elevated access to the system. We extensively explore the effectiveness of our methodology by demonstrating the trade-off between the channel transmission rate and the robustness of the communication. We demonstrate the utility of MC3 on NVIDIA Orin AGX, Orin NX, and Orin Nano up to a transmit rate of 6.4 kbps with less than 1% error rate.
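The contention mechanism the abstract describes can be sketched in a few lines: a transmitter modulates DRAM traffic to encode bits, and a receiver infers them by timing its own memory accesses. The following is a minimal single-process illustration, not the paper's implementation; the buffer size, stride, symbol period, and decision threshold are all hypothetical values, and the real attack runs the transmitter and receiver as separate CPU and GPU applications.

```python
import time

BUF_BYTES = 1 << 22   # large enough to overflow caches and reach DRAM (assumption)
STRIDE = 4096         # touch one byte per page-sized stride
BIT_PERIOD = 0.02     # hypothetical symbol duration; tuning it trades rate vs. error

buf = bytearray(BUF_BYTES)

def hammer(deadline):
    """Transmitter side of a '1': generate heavy memory traffic until deadline.
    For a '0', the transmitter would simply stay idle for one symbol period."""
    while time.perf_counter() < deadline:
        for i in range(0, BUF_BYTES, STRIDE):
            buf[i] = (buf[i] + 1) & 0xFF

def probe():
    """Receiver side: time one strided sweep of the buffer.
    Concurrent transmitter traffic contends for DRAM bandwidth and
    inflates this measurement."""
    t0 = time.perf_counter()
    acc = 0
    for i in range(0, BUF_BYTES, STRIDE):
        acc += buf[i]
    return time.perf_counter() - t0

def decode(latencies, threshold):
    """Map one probe latency per symbol period to a bit by thresholding."""
    return [1 if t > threshold else 0 for t in latencies]
```

In practice the threshold would be calibrated from probe timings taken on a quiet system, and the two sides would agree on the symbol period out of band.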
Related papers
- A Memory-Based Reinforcement Learning Approach to Integrated Sensing and Communication [52.40430937325323]
We consider a point-to-point integrated sensing and communication (ISAC) system, where a transmitter conveys a message to a receiver over a channel with memory.
We formulate the capacity-distortion tradeoff for the ISAC problem when sensing is performed in an online fashion.
arXiv Detail & Related papers (2024-12-02T03:30:50Z)
- Stochastic Communication Avoidance for Recommendation Systems [27.616664288148232]
We propose a theoretical framework that analyses the communication costs of arbitrary distributed systems that use lookup tables.
We use this framework to propose algorithms that maximize throughput subject to memory, computation, and communication constraints.
We implement our framework and algorithms in PyTorch and achieve up to 6x increases in training throughput on GPU systems over baselines.
arXiv Detail & Related papers (2024-11-03T15:37:37Z)
- MeMoir: A Software-Driven Covert Channel based on Memory Usage [7.424928818440549]
MeMoir is a novel software-driven covert channel that, for the first time, utilizes memory usage as the medium for the channel.
We implement a machine learning-based detector that can predict whether an attack is present in the system with an accuracy of more than 95%.
We introduce a noise-based countermeasure that effectively mitigates the attack while inducing a low power overhead in the system.
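MeMoir's detector is machine-learning based; as a much simpler illustration of the underlying signal, a covert sender that modulates allocations produces bursty memory-usage traces, which a crude variance-threshold check can flag. The function name, the baseline-window split, and the sensitivity parameter `k` below are all hypothetical stand-ins, not the paper's detector.

```python
from statistics import stdev

def detect_covert_channel(usage_samples, k=3.0):
    """Flag a memory-usage trace whose variance spikes beyond a quiet baseline.

    Simplified stand-in for an ML detector: the first quarter of the
    trace is assumed to be attack-free and provides the baseline; `k`
    is a hypothetical sensitivity parameter. Expects at least 8 samples.
    """
    split = len(usage_samples) // 4
    baseline = stdev(usage_samples[:split])   # spread on the quiet prefix
    window = stdev(usage_samples[split:])     # spread on the observed window
    return window > k * baseline
```

A real detector would use sliding windows and a trained classifier rather than a fixed prefix and a single threshold.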
arXiv Detail & Related papers (2024-09-20T08:10:36Z)
- vTensor: Flexible Virtual Tensor Management for Efficient LLM Serving [53.972175896814505]
Large Language Models (LLMs) are widely used across various domains, processing millions of daily requests.
arXiv Detail & Related papers (2024-07-22T14:37:58Z)
- Amplifying Main Memory-Based Timing Covert and Side Channels using Processing-in-Memory Operations [6.709670986126109]
We show that processing-in-memory (PiM) solutions provide a new way to directly access main memory, which malicious user applications can exploit.
We introduce IMPACT, a set of high-throughput main memory-based timing attacks that leverage characteristics of PiM architectures to establish covert and side channels.
Our results demonstrate that our covert channels achieve 12.87 Mb/s and 14.16 Mb/s communication throughput, respectively, which is up to 4.91x and 5.41x faster than the state-of-the-art main memory-based covert channels.
arXiv Detail & Related papers (2024-04-17T11:48:14Z)
- Heterogenous Memory Augmented Neural Networks [84.29338268789684]
We introduce a novel heterogeneous memory augmentation approach for neural networks.
By introducing learnable memory tokens with an attention mechanism, we can effectively boost performance without huge computational overhead.
We show our approach on various image and graph-based tasks under both in-distribution (ID) and out-of-distribution (OOD) conditions.
arXiv Detail & Related papers (2023-10-17T01:05:28Z)
- FusionAI: Decentralized Training and Deploying LLMs with Massive Consumer-Level GPUs [57.12856172329322]
We envision a decentralized system unlocking the potential of vast untapped consumer-level GPUs.
This system faces critical challenges, including limited CPU and GPU memory, low network bandwidth, and the variability arising from peer and device heterogeneity.
arXiv Detail & Related papers (2023-09-03T13:27:56Z)
- Shared Memory-contention-aware Concurrent DNN Execution for Diversely Heterogeneous System-on-Chips [0.32634122554914]
HaX-CoNN is a novel scheme that characterizes and maps layers in concurrently executing inference workloads.
We evaluate HaX-CoNN on NVIDIA Orin, NVIDIA Xavier, and Qualcomm Snapdragon 865 SoCs.
arXiv Detail & Related papers (2023-08-10T22:47:40Z)
- Asynchronous Parallel Incremental Block-Coordinate Descent for Decentralized Machine Learning [55.198301429316125]
Machine learning (ML) is a key technique for big-data-driven modelling and analysis of massive Internet of Things (IoT) based intelligent and ubiquitous computing.
For fast-increasing applications and data amounts, distributed learning is a promising emerging paradigm since it is often impractical or inefficient to share/aggregate data.
This paper studies the problem of training an ML model over decentralized systems, where data are distributed over many user devices.
arXiv Detail & Related papers (2022-02-07T15:04:15Z)
- Memory-Guided Semantic Learning Network for Temporal Sentence Grounding [55.31041933103645]
We propose a memory-augmented network that learns and memorizes the rarely appearing content in TSG tasks.
MGSL-Net consists of three main parts: a cross-modal interaction module, a memory augmentation module, and a heterogeneous attention module.
arXiv Detail & Related papers (2022-01-03T02:32:06Z)
- Reconfigurable Low-latency Memory System for Sparse Matricized Tensor Times Khatri-Rao Product on FPGA [3.4870723728779565]
Sparse Matricized Tensor Times Khatri-Rao Product (MTTKRP) is one of the most expensive kernels in tensor computations.
This paper focuses on a multi-faceted memory system, which explores the spatial and temporal locality of the data structures of MTTKRP.
Our system shows 2x and 1.26x speedups compared with cache-only and DMA-only memory systems, respectively.
arXiv Detail & Related papers (2021-09-18T08:19:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.