Related papers: WebGPU-SPY: Finding Fingerprints in the Sandbox through GPU Cache Attacks

URL: http://arxiv.org/abs/2401.04349v1
Date: Tue, 9 Jan 2024 04:21:43 GMT
Title: WebGPU-SPY: Finding Fingerprints in the Sandbox through GPU Cache Attacks
Authors: Ethan Ferguson, Adam Wilson, Hoda Naghibijouybari
Abstract summary: We present a new attack vector for microarchitectural attacks in web browsers. We develop a cache side channel attack on the compute stack of the GPU that spies on victim activities. We demonstrate that GPU-based cache attacks can achieve a precision of 90% for website fingerprinting of 100 top websites.
Score: 0.7400926717561453
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Microarchitectural attacks on CPU structures have been studied in native applications, as well as in web browsers. These attacks continue to be a substantial threat to computing systems at all scales. With the proliferation of heterogeneous systems and integration of hardware accelerators in every computing system, modern web browsers provide the support of GPU-based acceleration for the graphics and rendering processes. Emerging web standards also support the GPU acceleration of general-purpose computation within web browsers. In this paper, we present a new attack vector for microarchitectural attacks in web browsers. We use emerging GPU accelerating APIs in modern browsers (specifically WebGPU) to launch a GPU-based cache side channel attack on the compute stack of the GPU that spies on victim activities on the graphics (rendering) stack of the GPU. Unlike prior works that rely on JavaScript APIs or software interfaces to build timing primitives, we build the timer using GPU hardware resources and develop a cache side channel attack on Intel's integrated GPUs. We leverage the GPU's inherent parallelism at different levels to develop high-resolution parallel attacks. We demonstrate that GPU-based cache attacks can achieve a precision of 90% for website fingerprinting of 100 top websites. We also discuss potential countermeasures against the proposed attack to secure the systems at a critical time when these web standards are being developed and before they are widely deployed.

Related papers

Crypto Miner Attack: GPU Remote Code Execution Attacks [0.0]
Remote Code Execution (RCE) exploits pose a significant threat to AI and ML systems. This paper focuses on RCE attacks leveraging deserialization vulnerabilities and custom layers, such as Lambda layers. We demonstrate an attack that utilizes these vulnerabilities to deploy a crypto miner on a GPU.
arXiv Detail & Related papers (2025-02-09T19:26:47Z)
Deep Optimizer States: Towards Scalable Training of Transformer Models Using Interleaved Offloading [2.8231000588510757]
Transformers and large language models (LLMs) have seen rapid adoption in all domains. Training of transformers is very expensive and often hits a "memory wall". We propose a novel technique to split the LLM into subgroups, whose update phase is scheduled on either the CPU or the GPU.
arXiv Detail & Related papers (2024-10-26T00:43:59Z)
DarthShader: Fuzzing WebGPU Shader Translators & Compilers [19.345967816562364]
A recent trend towards running more demanding web applications has led to the adoption of the WebGPU standard. This opens up a new attack surface: Untrusted web content is passed through to the GPU stack, which traditionally has been optimized for performance instead of security. DarthShader is the first language fuzzer that combines mutators based on an intermediate representation with those using a more traditional abstract syntax tree.
arXiv Detail & Related papers (2024-09-03T12:06:19Z)
Whispering Pixels: Exploiting Uninitialized Register Accesses in Modern GPUs [6.1255640691846285]
We showcase the existence of a vulnerability on products of 3 major vendors - Apple, NVIDIA and Qualcomm. This vulnerability poses unique challenges to an adversary due to opaque scheduling and register remapping algorithms. We implement information leakage attacks on intermediate data of Convolutional Neural Networks (CNNs) and present the attack's capability to leak and reconstruct the output of Large Language Models (LLMs).
arXiv Detail & Related papers (2024-01-16T23:36:48Z)
FusionAI: Decentralized Training and Deploying LLMs with Massive Consumer-Level GPUs [57.12856172329322]
We envision a decentralized system that unlocks the vast untapped potential of consumer-level GPUs. This system faces critical challenges, including limited CPU and GPU memory, low network bandwidth, peer variability, and device heterogeneity.
arXiv Detail & Related papers (2023-09-03T13:27:56Z)
A Study on the Intersection of GPU Utilization and CNN Inference [8.084016058894779]
We show that there is room to improve the inference-time GPU utilization of convolutional neural networks (CNNs), and that knowledge of GPU utilization has the potential to benefit even applications that do not target utilization itself.
arXiv Detail & Related papers (2022-12-15T16:11:40Z)
PLSSVM: A (multi-)GPGPU-accelerated Least Squares Support Vector Machine [68.8204255655161]
Support Vector Machines (SVMs) are widely used in machine learning. However, even modern and optimized implementations do not scale well for large non-trivial dense data sets on cutting-edge hardware. PLSSVM can be used as a drop-in replacement for LIBSVM.
arXiv Detail & Related papers (2022-02-25T13:24:23Z)
Instant Neural Graphics Primitives with a Multiresolution Hash Encoding [67.33850633281803]
We present a versatile new input encoding that permits the use of a smaller network without sacrificing quality. A small neural network is augmented by a multiresolution hash table of trainable feature vectors whose values are optimized through gradient descent. We achieve a combined speedup of several orders of magnitude, enabling training of high-quality neural graphics primitives in a matter of seconds.
arXiv Detail & Related papers (2022-01-16T07:22:47Z)
RTGPU: Real-Time GPU Scheduling of Hard Deadline Parallel Tasks with Fine-Grain Utilization [5.02836935036198]
We propose RTGPU, which can schedule the execution of multiple GPU applications in real-time to meet hard deadlines. Our approach provides superior schedulability compared with previous work, and gives real-time guarantees to meet hard deadlines for multiple GPU applications.
arXiv Detail & Related papers (2021-01-25T22:34:06Z)
Adversarial Attack on Large Scale Graph [58.741365277995044]
Recent studies have shown that graph neural networks (GNNs) are vulnerable to perturbations due to a lack of robustness. Currently, most works on attacking GNNs use gradient information to guide the attack and achieve outstanding performance. We argue that these methods scale poorly because they have to use the whole graph for attacks, so their time and space complexity increases as the data scale grows. We present a practical metric named Degree Assortativity Change (DAC) to measure the impacts of adversarial attacks on graph data.
arXiv Detail & Related papers (2020-09-08T02:17:55Z)
Faster than FAST: GPU-Accelerated Frontend for High-Speed VIO [46.20949184826173]
This work focuses on the applicability of efficient low-level, GPU hardware-specific instructions to improve on existing computer vision algorithms. Non-maxima suppression and the subsequent feature selection, in particular, are prominent contributors to the overall image processing latency.
arXiv Detail & Related papers (2020-03-30T14:16:23Z)
Efficient Video Semantic Segmentation with Labels Propagation and Refinement [138.55845680523908]
This paper tackles the problem of real-time semantic segmentation of high definition videos using a hybrid GPU / CPU approach. We propose an Efficient Video Segmentation (EVS) pipeline that combines: (i) on the CPU, a very fast optical flow method that is used to exploit the temporal aspect of the video and propagate semantic information from one frame to the next. On the popular Cityscapes dataset with high resolution frames (2048 x 1024), the proposed operating points range from 80 to 1000 Hz on a single GPU and CPU.
arXiv Detail & Related papers (2019-12-26T11:45:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.