Related papers: OpenGL GPU-Based Rowhammer Attack (Work in Progress)

URL: http://arxiv.org/abs/2509.19959v1
Date: Wed, 24 Sep 2025 10:11:05 GMT
Title: OpenGL GPU-Based Rowhammer Attack (Work in Progress)
Authors: Antoine Plin, Frédéric Fauberteau, Nga Nguyen
Abstract summary: This paper presents an adaptive, many-sided Rowhammer attack utilizing GPU compute shaders. Our approach employs statistical distributions to optimize row targeting and avoid current mitigations. Experimental results on a Raspberry Pi 4 demonstrate that the GPU-based approach attains a high rate of bit flips compared to traditional CPU-based hammering.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Rowhammer attacks have emerged as a significant threat to modern DRAM-based memory systems, leveraging frequent memory accesses to induce bit flips in adjacent memory cells. This work-in-progress paper presents an adaptive, many-sided Rowhammer attack utilizing GPU compute shaders to systematically achieve high-frequency memory access patterns. Our approach employs statistical distributions to optimize row targeting and avoid current mitigations. The methodology involves initializing memory with known patterns, iteratively hammering victim rows, monitoring for induced errors, and dynamically adjusting parameters to maximize success rates. The proposed attack exploits the parallel processing capabilities of GPUs to accelerate hammering operations, thereby increasing the probability of successful bit flips within a constrained timeframe. By leveraging OpenGL compute shaders, our implementation achieves highly efficient row hammering with minimal software overhead. Experimental results on a Raspberry Pi 4 demonstrate that the GPU-based approach attains a high rate of bit flips compared to traditional CPU-based hammering, confirming its effectiveness in compromising DRAM integrity. Our findings align with existing research on microarchitectural attacks in heterogeneous systems that highlight the susceptibility of GPUs to security vulnerabilities. This study contributes to the understanding of GPU-assisted fault-injection attacks and underscores the need for improved mitigation strategies in future memory architectures.
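
The abstract's loop (initialize memory with known patterns, disturb, monitor for induced errors, adjust parameters) can be illustrated with a purely software simulation. The sketch below is not the authors' implementation and touches no hardware: `simulated_disturbance` is a hypothetical toy fault injector standing in for the hammering step, and `PATTERN`, `intensity`, the flip threshold, and the doubling rule are all illustrative assumptions rather than values from the paper.

```python
import random

PATTERN = 0xAA  # assumed test pattern (alternating bits); not taken from the paper


def init_buffer(size, pattern=PATTERN):
    """Fill a buffer with a known byte pattern."""
    return bytearray([pattern]) * size


def find_flips(buf, pattern=PATTERN):
    """Return (byte_offset, bit_index) for every bit that differs from the pattern."""
    flips = []
    for offset, value in enumerate(buf):
        diff = value ^ pattern  # set bits mark positions that changed
        bit = 0
        while diff:
            if diff & 1:
                flips.append((offset, bit))
            diff >>= 1
            bit += 1
    return flips


def simulated_disturbance(buf, intensity, rng):
    """Hypothetical toy fault injector: flips one random bit per byte with a
    probability proportional to `intensity`. This is NOT a DRAM model."""
    probability = min(1.0, intensity * 1e-3)
    for offset in range(len(buf)):
        if rng.random() < probability:
            buf[offset] ^= 1 << rng.randrange(8)


def adaptive_search(rounds=5, size=4096, min_flips=8, seed=0):
    """Repeat init -> disturb -> scan, raising the parameter when few errors appear."""
    rng = random.Random(seed)
    intensity = 1.0
    history = []
    for _ in range(rounds):
        buf = init_buffer(size)
        simulated_disturbance(buf, intensity, rng)
        flips = find_flips(buf)
        history.append((intensity, len(flips)))
        if len(flips) < min_flips:
            intensity *= 2  # assumed adjustment rule, for illustration only
    return history


if __name__ == "__main__":
    for intensity, count in adaptive_search():
        print(f"intensity={intensity:g} flips={count}")
```

The detection step is an XOR against the expected pattern, so any differing bit is reported with its byte offset and bit index; the adaptive part only shows the feedback structure, not the statistical row-targeting the paper describes.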

Related papers

From Verbatim to Gist: Distilling Pyramidal Multimodal Memory via Semantic Information Bottleneck for Long-Horizon Video Agents [78.30630000529133]
We propose MM-Mem, a pyramidal multimodal memory architecture grounded in Fuzzy-Trace Theory. MM-Mem structures memory hierarchically into a Sensory Buffer, Episodic Stream, and Symbolic. Experiments confirm the effectiveness of MM-Mem on both offline and streaming tasks.
arXiv Detail & Related papers (2026-03-02T05:12:45Z)
GPU-Accelerated Algorithms for Graph Vector Search: Taxonomy, Empirical Study, and Research Directions [54.570944939061555]
We present a comprehensive study of GPU-accelerated graph-based vector search algorithms. We establish a detailed taxonomy of GPU optimization strategies and clarify the mapping between algorithmic tasks and hardware execution units. Our findings offer clear guidelines for designing scalable and robust GPU-powered approximate nearest neighbor search systems.
arXiv Detail & Related papers (2026-02-10T16:18:04Z)
Eliminating Multi-GPU Performance Taxes: A Systems Approach to Efficient Distributed LLMs [61.953548065938385]
We introduce the "Three Taxes" (Bulk Synchronous, Inter-Kernel Data Locality, and Kernel Launch Overhead) as an analytical framework. We propose moving beyond the rigid BSP model to address key inefficiencies in distributed GPU execution. We observe a 10-20% speedup in end-to-end latency over BSP-based approaches.
arXiv Detail & Related papers (2025-11-04T01:15:44Z)
xMem: A CPU-Based Approach for Accurate Estimation of GPU Memory in Deep Learning Training Workloads [2.2991119948183525]
Estimation of how much GPU memory a job will require is fundamental to enabling advanced scheduling and GPU sharing. We propose xMem, a novel framework that leverages CPU-only dynamic analysis to accurately estimate peak GPU memory requirements. The analysis of 5209 runs, which includes ANOVA and Monte Carlo results, highlights xMem's benefits.
arXiv Detail & Related papers (2025-10-23T23:16:27Z)
$ρ$Hammer: Reviving RowHammer Attacks on New Architectures via Prefetching [37.49955872834092]
Rowhammer is a critical vulnerability in dynamic random access memory (DRAM). We present $ρ$Hammer, a new Rowhammer framework that overcomes three core challenges impeding attacks on new architectures. $ρ$Hammer induces up to 200K+ additional bit flips within 2-hour attack pattern fuzzing processes and has a 112x higher flip rate than the load-based hammering baselines.
arXiv Detail & Related papers (2025-10-18T15:40:53Z)
ShadowScope: GPU Monitoring and Validation via Composable Side Channel Signals [6.389108369952326]
GPU kernels are vulnerable to both traditional memory safety issues and emerging microarchitectural threats. We propose ShadowScope, a monitoring and validation framework that leverages a composable golden model. We also introduce ShadowScope+, a hardware-assisted validation mechanism that integrates lightweight on-chip checks into the GPU pipeline.
arXiv Detail & Related papers (2025-08-30T01:38:05Z)
GPUHammer: Rowhammer Attacks on GPU Memories are Practical [3.3625059118072107]
We demonstrate the first successful Rowhammer attack on a discrete GPU. We show how an attacker can use the induced bit flips to tamper with ML models, causing significant accuracy drops (up to 80%).
arXiv Detail & Related papers (2025-07-10T20:57:47Z)
Scaling Probabilistic Circuits via Monarch Matrices [109.65822339230853]
Probabilistic Circuits (PCs) are tractable representations of probability distributions. We propose a novel sparse and structured parameterization for the sum blocks in PCs.
arXiv Detail & Related papers (2025-06-14T07:39:15Z)
Robustness of deep learning classification to adversarial input on GPUs: asynchronous parallel accumulation is a source of vulnerability [4.054484966653432]
A key measure of machine learning (ML) classification models' safety and reliability is their ability to resist small, targeted input perturbations. We show that floating-point non-associativity coupled with asynchronous parallel programming on GPU is sufficient to result in misclassification. We also show that standard adversarial robustness results may be overestimated up to 4.6 when not considering machine-level details.
arXiv Detail & Related papers (2025-03-21T14:19:45Z)
Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss [59.835032408496545]
We propose a tile-based strategy that partitions the contrastive loss calculation into arbitrarily small blocks. We also introduce a multi-level tiling strategy to leverage the hierarchical structure of distributed systems. Compared to SOTA memory-efficient solutions, it achieves a two-order-of-magnitude reduction in memory while maintaining comparable speed.
arXiv Detail & Related papers (2024-10-22T17:59:30Z)
Holographic Global Convolutional Networks for Long-Range Prediction Tasks in Malware Detection [50.7263393517558]
We introduce Holographic Global Convolutional Networks (HGConv) that utilize the properties of Holographic Reduced Representations (HRR). Unlike other global convolutional methods, our method does not require any intricate kernel computation or crafted kernel design. The proposed method has achieved new SOTA results on Microsoft Malware Classification Challenge, Drebin, and EMBER malware benchmarks.
arXiv Detail & Related papers (2024-03-23T15:49:13Z)
Overload: Latency Attacks on Object Detection for Edge Devices [47.9744734181236]
This paper investigates latency attacks on deep learning applications. Unlike common adversarial attacks for misclassification, the goal of latency attacks is to increase the inference time. We use object detection to demonstrate how this kind of attack works.
arXiv Detail & Related papers (2023-04-11T17:24:31Z)
EVEREST: Efficient Masked Video Autoencoder by Removing Redundant Spatiotemporal Tokens [57.354304637367555]
We present EVEREST, a surprisingly efficient MVA approach for video representation learning. It finds tokens containing rich motion features and discards uninformative ones during both pre-training and fine-tuning. Our method significantly reduces the computation and memory requirements of MVA.
arXiv Detail & Related papers (2022-11-19T09:57:01Z)
GPU-Accelerated Primal Learning for Extremely Fast Large-Scale Classification [10.66048003460524]
One of the most efficient methods to solve L2-regularized primal problems, such as logistic regression and linear support vector machine (SVM) classification, is the widely used trust region Newton algorithm, TRON. We show that using judicious GPU-optimization principles, TRON training time for different losses and feature representations may be drastically reduced.
arXiv Detail & Related papers (2020-08-08T03:40:27Z)

This list is automatically generated from the titles and abstracts of the papers on this site.