Related papers: BackCache: Mitigating Contention-Based Cache Timing Attacks by Hiding Cache Line Evictions

BackCache: Mitigating Contention-Based Cache Timing Attacks by Hiding Cache Line Evictions

URL: http://arxiv.org/abs/2304.10268v5
Date: Wed, 12 Jun 2024 02:07:44 GMT
Title: BackCache: Mitigating Contention-Based Cache Timing Attacks by Hiding Cache Line Evictions
Authors: Quancheng Wang, Xige Zhang, Han Wang, Yuzhe Gu, Ming Tang,
Abstract summary: L1 data cache attacks pose a significant privacy and confidentiality threat. BackCache always achieves cache hits instead of cache misses to mitigate contention-based cache timing attacks on the L1 data cache. BackCache places the evicted cache lines from the L1 data cache into a fully-associative backup cache to hide the evictions.
Score: 7.46215723037597
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Caches are used to reduce the speed differential between the CPU and memory to improve the performance of modern processors. However, attackers can use contention-based cache timing attacks to steal sensitive information from victim processes through carefully designed cache eviction sets. And L1 data cache attacks are widely exploited and pose a significant privacy and confidentiality threat. Existing hardware-based countermeasures mainly focus on cache partitioning, randomization, and cache line flushing, which unfortunately either incur high overhead or can be circumvented by sophisticated attacks. In this paper, we propose a novel hardware-software co-design called BackCache with the idea of always achieving cache hits instead of cache misses to mitigate contention-based cache timing attacks on the L1 data cache. BackCache places the evicted cache lines from the L1 data cache into a fully-associative backup cache to hide the evictions. To improve the security of BackCache, we introduce a randomly used replacement policy (RURP) and a dynamic backup cache resizing mechanism. We also present a theoretical security analysis to demonstrate the effectiveness of BackCache. Our evaluation on the gem5 simulator shows that BackCache can degrade the performance by 2.61%, 2.66%, and 3.36% For OS kernel, single-thread, and multi-thread benchmarks.

Related papers

FastCache: Fast Caching for Diffusion Transformer Through Learnable Linear Approximation [46.57781555466333]
Diffusion Transformers (DiT) are powerful generative models but remain computationally intensive due to their iterative structure and deep transformer stacks.<n>FastCache is a hidden-state-level caching and compression framework that accelerates DiT inference.<n> Empirical evaluations across multiple DiT variants demonstrate substantial reductions in latency and memory usage.
arXiv Detail & Related papers (2025-05-26T05:58:49Z)
dKV-Cache: The Cache for Diffusion Language Models [53.85291644298835]
Diffusion Language Models (DLMs) have been seen as a promising competitor for autoregressive language models.<n>We propose a KV-cache-like mechanism, delayed KV-Cache, for the denoising process of DLMs.<n>Our approach is motivated by the observation that different tokens have distinct representation dynamics throughout the diffusion process.
arXiv Detail & Related papers (2025-05-21T17:32:10Z)
EXAM: Exploiting Exclusive System-Level Cache in Apple M-Series SoCs for Enhanced Cache Occupancy Attacks [2.198430261120653]
Cache occupancy attacks exploit the shared nature of cache hierarchies to infer a victim's activities by monitoring overall cache usage. We propose a suite of SLC-cache occupancy attacks, the first of its kind, where an adversary can monitor GPU and other CPU cluster activities from their own CPU cluster.
arXiv Detail & Related papers (2025-04-18T00:21:00Z)
Auditing Prompt Caching in Language Model APIs [77.02079451561718]
We investigate the privacy leakage caused by prompt caching in large language models (LLMs) We detect global cache sharing across users in seven API providers, including OpenAI. We find evidence that OpenAI's embedding model is a decoder-only Transformer, which was previously not publicly known.
arXiv Detail & Related papers (2025-02-11T18:58:04Z)
vCache: Verified Semantic Prompt Caching [75.87215136638828]
This paper proposes vCache, the first verified semantic cache with user-defined error rate guarantees.<n>It employs an online learning algorithm to estimate an optimal threshold for each cached prompt, enabling reliable cache responses without additional training.<n>Our experiments show that vCache consistently meets the specified error bounds while outperforming state-of-the-art static-threshold and fine-tuned embedding baselines.
arXiv Detail & Related papers (2025-02-06T04:16:20Z)
FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality [58.80996741843102]
FasterCache is a training-free strategy designed to accelerate the inference of video diffusion models with high-quality generation. We show that FasterCache can significantly accelerate video generation while keeping video quality comparable to the baseline.
arXiv Detail & Related papers (2024-10-25T07:24:38Z)
RollingCache: Using Runtime Behavior to Defend Against Cache Side Channel Attacks [2.9221371172659616]
We present RollingCache, a cache design that defends against contention attacks by dynamically changing the set of addresses contending for cache sets. RollingCache does not rely on address encryption/decryption, data relocation, or cache partitioning. Our solution does not depend on having defined security domains, and can defend against an attacker running on the same or another core.
arXiv Detail & Related papers (2024-08-16T15:11:12Z)
Efficient Inference of Vision Instruction-Following Models with Elastic Cache [76.44955111634545]
We introduce Elastic Cache, a novel strategy for efficient deployment of instruction-following large vision-language models. We propose an importance-driven cache merging strategy to prune redundancy caches. For instruction encoding, we utilize the frequency to evaluate the importance of caches. Results on a range of LVLMs demonstrate that Elastic Cache not only boosts efficiency but also notably outperforms existing pruning methods in language generation.
arXiv Detail & Related papers (2024-07-25T15:29:05Z)
Hidden Web Caches Discovery [3.9272151228741716]
This paper presents a novel methodology for cache detection using timing analysis. Our approach eliminates the dependency on cache status headers, making it applicable to any web server.
arXiv Detail & Related papers (2024-07-23T08:58:06Z)
CacheSquash: Making caches speculation-aware [11.499924192220274]
Speculation is key to achieving high CPU performance, yet it enables risks like Spectre attacks.<n>We propose a novel mitigation, CacheSquash, that cancels mis-speculated memory accesses.<n>We implement CacheSquash on gem5 and show that it thwarts practical Spectre attacks, with near-zero performance overheads.
arXiv Detail & Related papers (2024-06-17T21:43:39Z)
SEA Cache: A Performance-Efficient Countermeasure for Contention-based Attacks [4.144828482272047]
We extend an existing secure cache design, CEASER-SH cache, and propose the SEA cache. The novel cache configurations in both caches are logical associativity, which allows the cache line to be placed not only in its mapped cache set but also in the subsequent cache sets. Compared to a CEASER-SH cache with logical associativity of 8, an SEA cache with logical associativity of 1 for normal protection users and 16 for high protection users has a Cycles Per Instruction penalty that is about 0.6% less for users under normal protections and provides better security against contention-based attacks
arXiv Detail & Related papers (2024-05-30T13:12:53Z)
CORM: Cache Optimization with Recent Message for Large Language Model Inference [57.109354287786154]
We introduce an innovative method for optimizing the KV cache, which considerably minimizes its memory footprint. CORM, a KV cache eviction policy, dynamically retains essential key-value pairs for inference without the need for model fine-tuning. Our validation shows that CORM reduces the inference memory usage of KV cache by up to 70% with negligible performance degradation across six tasks in LongBench.
arXiv Detail & Related papers (2024-04-24T16:11:54Z)
Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference [78.65321721142624]
We focus on a memory bottleneck imposed by the key-value ( KV) cache. Existing KV cache methods approach this problem by pruning or evicting large swaths of relatively less important KV pairs. We propose LESS, a simple integration of a constant sized cache with eviction-based cache methods.
arXiv Detail & Related papers (2024-02-14T18:54:56Z)
On the Amplification of Cache Occupancy Attacks in Randomized Cache Architectures [11.018866935621045]
We show that MIRAGE, touted to be resilient against eviction-based attacks, amplifies the chances of cache occupancy attack. We leverage MIRAGE's global eviction property to demonstrate covert channel with byte-level granularity. We extend our attack vectors to include side-channel, template-based fingerprinting of workloads in a cross-core setting.
arXiv Detail & Related papers (2023-10-08T14:06:06Z)
Random and Safe Cache Architecture to Defeat Cache Timing Attacks [5.142233612851766]
Caches have been exploited to leak secret information due to the different times they take to handle memory accesses. We present a systematic view of the attack and defense space and show that no existing defense has addressed all cache timing attacks. We propose Random and Safe (RaS) cache architectures to decorrelate cache state changes from memory requests.
arXiv Detail & Related papers (2023-09-28T05:08:16Z)
Accelerating Deep Learning Classification with Error-controlled Approximate-key Caching [72.50506500576746]
We propose a novel caching paradigm, that we named approximate-key caching. While approximate cache hits alleviate DL inference workload and increase the system throughput, they however introduce an approximation error. We analytically model our caching system performance for classic LRU and ideal caches, we perform a trace-driven evaluation of the expected performance, and we compare the benefits of our proposed approach with the state-of-the-art similarity caching.
arXiv Detail & Related papers (2021-12-13T13:49:11Z)
Reinforcement Learning for Caching with Space-Time Popularity Dynamics [61.55827760294755]
caching is envisioned to play a critical role in next-generation networks. To intelligently prefetch and store contents, a cache node should be able to learn what and when to cache. This chapter presents a versatile reinforcement learning based approach for near-optimal caching policy design.
arXiv Detail & Related papers (2020-05-19T01:23:51Z)

This list is automatically generated from the titles and abstracts of the papers in this site.