NVBleed: Covert and Side-Channel Attacks on NVIDIA Multi-GPU Interconnect
- URL: http://arxiv.org/abs/2503.17847v1
- Date: Sat, 22 Mar 2025 19:52:02 GMT
- Title: NVBleed: Covert and Side-Channel Attacks on NVIDIA Multi-GPU Interconnect
- Authors: Yicheng Zhang, Ravan Nazaraliyev, Sankha Baran Dutta, Andres Marquez, Kevin Barker, Nael Abu-Ghazaleh,
- Abstract summary: We explore whether the interconnect on such systems can offer a novel source of leakage, enabling new forms of covert and side-channel attacks.<n>We develop two end-to-end crossGPU side-channel attacks, including application fingerprinting and 3D graphics character identification within Blender.<n>We also discover that leakage surprisingly occurs across Virtual Machines on the Google Cloud Platform.
- Score: 4.573191891034322
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-GPU systems are becoming increasingly important in highperformance computing (HPC) and cloud infrastructure, providing acceleration for data-intensive applications, including machine learning workloads. These systems consist of multiple GPUs interconnected through high-speed networking links such as NVIDIA's NVLink. In this work, we explore whether the interconnect on such systems can offer a novel source of leakage, enabling new forms of covert and side-channel attacks. Specifically, we reverse engineer the operations of NVlink and identify two primary sources of leakage: timing variations due to contention and accessible performance counters that disclose communication patterns. The leakage is visible remotely and even across VM instances in the cloud, enabling potentially dangerous attacks. Building on these observations, we develop two types of covert-channel attacks across two GPUs, achieving a bandwidth of over 70 Kbps with an error rate of 4.78% for the contention channel. We develop two end-to-end crossGPU side-channel attacks: application fingerprinting (including 18 high-performance computing and deep learning applications) and 3D graphics character identification within Blender, a multi-GPU rendering application. These attacks are highly effective, achieving F1 scores of up to 97.78% and 91.56%, respectively. We also discover that leakage surprisingly occurs across Virtual Machines on the Google Cloud Platform (GCP) and demonstrate a side-channel attack on Blender, achieving F1 scores exceeding 88%. We also explore potential defenses such as managing access to counters and reducing the resolution of the clock to mitigate the two sources of leakage.
Related papers
- MeMoir: A Software-Driven Covert Channel based on Memory Usage [7.424928818440549]
MeMoir is a novel software-driven covert channel that, for the first time, utilizes memory usage as the medium for the channel.
We implement a machine learning-based detector that can predict whether an attack is present in the system with an accuracy of more than 95%.
We introduce a noise-based countermeasure that effectively mitigates the attack while inducing a low power overhead in the system.
arXiv Detail & Related papers (2024-09-20T08:10:36Z) - Dynamic Frequency-Based Fingerprinting Attacks against Modern Sandbox Environments [7.753621963239778]
We investigate the possibility of fingerprinting containers through CPU frequency reporting sensors in Intel and AMD CPUs.
We demonstrate that Docker images exhibit a unique frequency signature, enabling the distinction of different containers with up to 84.5% accuracy.
Our empirical results show that these attacks can also be carried out successfully against all of these sandboxes in less than 40 seconds.
arXiv Detail & Related papers (2024-04-16T16:45:47Z) - Beyond the Bridge: Contention-Based Covert and Side Channel Attacks on Multi-GPU Interconnect [4.573191891034322]
This study highlights the vulnerability of multi-GPU systems to covert and side channel attacks due to congestion on interconnects.
An adversary can infer private information about a victim's activities by monitoring NVLink congestion without needing special permissions.
arXiv Detail & Related papers (2024-04-05T03:57:38Z) - Whispering Pixels: Exploiting Uninitialized Register Accesses in Modern GPUs [6.1255640691846285]
We showcase the existence of a vulnerability on products of 3 major vendors - Apple, NVIDIA and Qualcomm.
This vulnerability poses unique challenges to an adversary due to opaque scheduling and register remapping algorithms.
We implement information leakage attacks on intermediate data of Convolutional Neural Networks (CNNs) and present the attack's capability to leak and reconstruct the output of Large Language Models (LLMs)
arXiv Detail & Related papers (2024-01-16T23:36:48Z) - FusionAI: Decentralized Training and Deploying LLMs with Massive
Consumer-Level GPUs [57.12856172329322]
We envision a decentralized system unlocking the potential vast untapped consumer-level GPU.
This system faces critical challenges, including limited CPU and GPU memory, low network bandwidth, the variability of peer and device heterogeneity.
arXiv Detail & Related papers (2023-09-03T13:27:56Z) - Communication-Efficient Graph Neural Networks with Probabilistic
Neighborhood Expansion Analysis and Caching [59.8522166385372]
Training and inference with graph neural networks (GNNs) on massive graphs has been actively studied since the inception of GNNs.
This paper is concerned with minibatch training and inference with GNNs that employ node-wise sampling in distributed settings.
We present SALIENT++, which extends the prior state-of-the-art SALIENT system to work with partitioned feature data.
arXiv Detail & Related papers (2023-05-04T21:04:01Z) - EVEREST: Efficient Masked Video Autoencoder by Removing Redundant Spatiotemporal Tokens [57.354304637367555]
We present EVEREST, a surprisingly efficient MVA approach for video representation learning.
It finds tokens containing rich motion features and discards uninformative ones during both pre-training and fine-tuning.
Our method significantly reduces the computation and memory requirements of MVA.
arXiv Detail & Related papers (2022-11-19T09:57:01Z) - Instant Neural Graphics Primitives with a Multiresolution Hash Encoding [67.33850633281803]
We present a versatile new input encoding that permits the use of a smaller network without sacrificing quality.
A small neural network is augmented by a multiresolution hash table of trainable feature vectors whose values are optimized through a gradient descent.
We achieve a combined speed of several orders of magnitude, enabling training of high-quality neural graphics primitives in a matter of seconds.
arXiv Detail & Related papers (2022-01-16T07:22:47Z) - Project CGX: Scalable Deep Learning on Commodity GPUs [17.116792714097738]
This paper investigates whether hardware overprovisioning can be supplanted via algorithmic and system design.
We propose a framework called CGX, which provides efficient software support for communication compression.
We show that this framework is able to remove communication bottlenecks from consumer-grade multi-GPU systems.
arXiv Detail & Related papers (2021-11-16T17:00:42Z) - DistGNN: Scalable Distributed Training for Large-Scale Graph Neural
Networks [58.48833325238537]
Full-batch training on Graph Neural Networks (GNN) to learn the structure of large graphs is a critical problem that needs to scale to hundreds of compute nodes to be feasible.
In this paper, we presentGNN that optimize the well-known Deep Graph Library (DGL) for full-batch training on CPU clusters.
Our results on four common GNN benchmark datasets show up to 3.7x speed-up using a single CPU socket and up to 97x speed-up using 128 CPU sockets.
arXiv Detail & Related papers (2021-04-14T08:46:35Z) - Faster than FAST: GPU-Accelerated Frontend for High-Speed VIO [46.20949184826173]
This work focuses on the applicability of efficient low-level, GPU hardware-specific instructions to improve on existing computer vision algorithms.
Especially non-maxima suppression and the subsequent feature selection are prominent contributors to the overall image processing latency.
arXiv Detail & Related papers (2020-03-30T14:16:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.