CacheMamba: Popularity Prediction for Mobile Edge Caching Networks via Selective State Spaces
- URL: http://arxiv.org/abs/2502.15746v1
- Date: Sun, 09 Feb 2025 05:57:59 GMT
- Title: CacheMamba: Popularity Prediction for Mobile Edge Caching Networks via Selective State Spaces
- Authors: Ghazaleh Kianfar, Zohreh Hajiakhondi-Meybodi, Arash Mohammadi
- Abstract summary: Mobile Edge Caching (MEC) plays a pivotal role in mitigating latency in data-intensive services by dynamically caching frequently requested content on edge servers. In this paper, we explore the problem of popularity prediction in MEC by utilizing historical time-series request data of intended files. We propose the CacheMamba model, which employs Mamba, a state-space model (SSM)-based architecture, to identify the top-K files with the highest likelihood of being requested.
- Score: 6.895209729810318
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Mobile Edge Caching (MEC) plays a pivotal role in mitigating latency in data-intensive services by dynamically caching frequently requested content on edge servers. This capability is critical for applications such as Augmented Reality (AR), Virtual Reality (VR), and Autonomous Vehicles (AV), where efficient content caching and accurate popularity prediction are essential for optimizing performance. In this paper, we explore the problem of popularity prediction in MEC by utilizing historical time-series request data of intended files, formulating this problem as a ranking task. To this end, we propose the CacheMamba model, which employs Mamba, a state-space model (SSM)-based architecture, to identify the top-K files with the highest likelihood of being requested. We then benchmark the proposed model against a Transformer-based approach, demonstrating its superior performance in terms of cache-hit rate, Mean Average Precision (MAP), Normalized Discounted Cumulative Gain (NDCG), and Floating-Point Operations Per Second (FLOPS), particularly when dealing with longer sequences.
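As a rough illustration of the ranking task described above (not the paper's actual CacheMamba architecture), the sketch below shows how per-file scores predicted by some sequence model could be turned into a top-K caching decision and evaluated by cache-hit rate; all file ids, scores, and names are hypothetical.

```python
import numpy as np

def top_k_cache(scores: np.ndarray, k: int) -> np.ndarray:
    """Return indices of the K files with the highest predicted request scores."""
    return np.argsort(scores)[::-1][:k]

def cache_hit_rate(cached: np.ndarray, requests: np.ndarray) -> float:
    """Fraction of actual requests served from the cached set."""
    cached_set = set(cached.tolist())
    hits = sum(1 for r in requests if r in cached_set)
    return hits / len(requests)

# Hypothetical example: scores for 10 files, predicted from historical
# time-series requests by some sequence model (Mamba, Transformer, ...).
predicted_scores = np.array([0.9, 0.1, 0.4, 0.8, 0.05, 0.3, 0.7, 0.2, 0.6, 0.15])
next_requests = np.array([0, 3, 3, 6, 1, 8, 0, 2])  # file ids requested next

cached = top_k_cache(predicted_scores, k=4)    # -> files {0, 3, 6, 8}
print(cache_hit_rate(cached, next_requests))   # 6 of 8 requests hit -> 0.75
```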
Related papers
- Graph Federated Learning Based Proactive Content Caching in Edge Computing [5.492113449220096]
This paper proposes a Graph Federated Learning-based Proactive Content Caching scheme that enhances caching efficiency while preserving user privacy. The proposed approach integrates federated learning and graph neural networks, enabling users to locally train Light Graph Convolutional Networks (LightGCN) to capture user-item relationships and predict content popularity.
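As a loose sketch of the federated-aggregation step such a scheme relies on (not the paper's exact protocol), the toy code below averages locally trained parameters weighted by client data size; the clients, parameter shapes, and sample counts are hypothetical.

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """Weighted average of client model parameters (FedAvg-style aggregation)."""
    total = sum(client_sizes)
    return [
        sum(w[i] * (n / total) for w, n in zip(client_weights, client_sizes))
        for i in range(len(client_weights[0]))
    ]

# Hypothetical toy example: three clients each hold two parameter tensors
# of a locally trained model (e.g., LightGCN embeddings); only parameters
# leave the device, raw interaction data stays local.
clients = [
    [np.ones((2, 2)) * 1.0, np.zeros(3)],
    [np.ones((2, 2)) * 2.0, np.ones(3)],
    [np.ones((2, 2)) * 3.0, np.ones(3) * 2],
]
sizes = [100, 200, 100]  # number of local training samples per client

global_params = federated_average(clients, sizes)
print(global_params[0])  # (1*0.25 + 2*0.5 + 3*0.25) = 2.0 everywhere
```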
arXiv Detail & Related papers (2025-02-07T08:48:06Z)
- Efficient Inference of Vision Instruction-Following Models with Elastic Cache [76.44955111634545]
We introduce Elastic Cache, a novel strategy for efficient deployment of instruction-following large vision-language models.
We propose an importance-driven cache merging strategy to prune redundancy caches.
For instruction encoding, we utilize the frequency to evaluate the importance of caches.
Results on a range of LVLMs demonstrate that Elastic Cache not only boosts efficiency but also notably outperforms existing pruning methods in language generation.
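A rough, hypothetical sketch of what frequency-driven cache merging could look like is given below; Elastic Cache's actual importance metric and merging rule are defined in the paper, and all shapes and scores here are illustrative.

```python
import numpy as np

def merge_least_important(cache: np.ndarray, frequency: np.ndarray):
    """Merge the least frequently used cache entry into its neighbour.

    Only a rough illustration of importance-driven cache merging; the
    real policy and importance metric are those described in the paper.
    """
    idx = int(np.argmin(frequency))
    neighbour = idx - 1 if idx > 0 else idx + 1
    merged = cache.copy()
    merged[neighbour] = (merged[neighbour] + merged[idx]) / 2.0  # fuse the two entries
    keep = [i for i in range(len(cache)) if i != idx]
    return merged[keep], frequency[keep]

# Hypothetical cache of 6 positions with hidden size 4 and per-position
# frequency counts gathered during instruction encoding.
cache = np.random.randn(6, 4)
frequency = np.array([9.0, 1.0, 7.0, 5.0, 0.5, 3.0])

cache, frequency = merge_least_important(cache, frequency)
print(cache.shape)  # (5, 4): position 4 was merged into position 3
```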
arXiv Detail & Related papers (2024-07-25T15:29:05Z)
- CORM: Cache Optimization with Recent Message for Large Language Model Inference [57.109354287786154]
We introduce an innovative method for optimizing the KV cache, which considerably minimizes its memory footprint.
CORM, a KV cache eviction policy, dynamically retains essential key-value pairs for inference without the need for model fine-tuning.
Our validation shows that CORM reduces the inference memory usage of KV cache by up to 70% with negligible performance degradation across six tasks in LongBench.
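The minimal sketch below illustrates the general idea of budgeted KV-cache eviction by importance score; it is not CORM's actual policy, and the importance values and tensor shapes are hypothetical.

```python
import numpy as np

def evict_kv_cache(keys: np.ndarray, values: np.ndarray,
                   importance: np.ndarray, budget: int):
    """Keep only the `budget` most important key-value pairs.

    `importance` could be, e.g., recent attention mass received by each
    cached position; CORM's actual scoring rule is given in the paper.
    """
    keep = np.argsort(importance)[::-1][:budget]
    keep = np.sort(keep)  # preserve positional order of the survivors
    return keys[keep], values[keep], keep

# Hypothetical shapes: 8 cached positions, head dimension 4.
keys = np.random.randn(8, 4)
values = np.random.randn(8, 4)
importance = np.array([0.9, 0.1, 0.05, 0.6, 0.02, 0.3, 0.8, 0.01])

k_small, v_small, kept = evict_kv_cache(keys, values, importance, budget=4)
print(kept)  # positions retained: [0 3 5 6]
```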
arXiv Detail & Related papers (2024-04-24T16:11:54Z)
- Attention-Enhanced Prioritized Proximal Policy Optimization for Adaptive Edge Caching [4.2579244769567675]
We introduce a Proximal Policy Optimization (PPO)-based caching strategy that fully considers file attributes like lifetime, size, and priority.
Our method outperforms a recent Deep Reinforcement Learning-based technique.
arXiv Detail & Related papers (2024-02-08T17:17:46Z)
- A Learning-Based Caching Mechanism for Edge Content Delivery [2.412158290827225]
With the rollout of 5G networks and the rise of the Internet of Things (IoT), content delivery is increasingly extending into the network edge.
This shift introduces unique challenges, particularly due to the limited cache storage and the diverse request patterns at the edge.
We introduce HR-Cache, a learning-based caching framework grounded in the principles of Hazard Rate (HR) ordering.
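For intuition only, the sketch below computes a hazard rate under the simplest possible assumption (exponential inter-request times) and ranks objects by it; HR-Cache's actual hazard-rate ordering is more general, and the request traces here are made up.

```python
import numpy as np

def exponential_hazard_rate(request_times: np.ndarray) -> float:
    """Constant hazard rate under an exponential inter-request-time model.

    h = 1 / mean inter-request time. This is only the simplest special
    case of the hazard-rate ordering that HR-Cache builds on.
    """
    gaps = np.diff(np.sort(request_times))
    if len(gaps) == 0 or gaps.mean() == 0:
        return 0.0
    return 1.0 / gaps.mean()

# Hypothetical request timestamps (seconds) for three objects.
traces = {
    "a": np.array([1.0, 3.0, 5.0, 7.0]),       # every 2 s    -> h = 0.5
    "b": np.array([2.0, 10.0, 30.0]),          # every ~14 s  -> h ~ 0.07
    "c": np.array([0.5, 1.0, 1.5, 2.0, 2.5]),  # every 0.5 s  -> h = 2.0
}
ranking = sorted(traces, key=lambda o: exponential_hazard_rate(traces[o]), reverse=True)
print(ranking)  # ['c', 'a', 'b'] -- higher hazard rate => more cache-worthy
```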
arXiv Detail & Related papers (2024-02-05T08:06:03Z)
- Multi-Content Time-Series Popularity Prediction with Multiple-Model Transformers in MEC Networks [34.44384973176474]
Coded/uncoded content placement in Mobile Edge Caching (MEC) has evolved to meet the significant growth of global mobile data traffic.
Most existing data-driven popularity prediction models are not suitable for the coded/uncoded content placement frameworks.
We develop a Multiple-model (hybrid) Transformer-based Edge Caching (MTEC) framework with higher generalization ability.
arXiv Detail & Related papers (2022-10-12T02:24:49Z)
- Accelerating Deep Learning Classification with Error-controlled Approximate-key Caching [72.50506500576746]
We propose a novel caching paradigm that we name approximate-key caching.
While approximate cache hits alleviate the DL inference workload and increase system throughput, they introduce an approximation error.
We analytically model our caching system performance for classic LRU and ideal caches, we perform a trace-driven evaluation of the expected performance, and we compare the benefits of our proposed approach with the state-of-the-art similarity caching.
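A toy illustration of the approximate-key idea (not the paper's system) is sketched below: a lookup counts as a hit when a stored key lies within a distance threshold of the query, trading a bounded approximation error for a higher hit rate; the threshold and feature vectors are hypothetical.

```python
import numpy as np

class ApproximateKeyCache:
    """Return a cached result when a stored key is 'close enough' to the query.

    A toy sketch of approximate-key caching; the real system's key
    construction and error control are described in the paper.
    """

    def __init__(self, threshold: float):
        self.threshold = threshold
        self.keys = []      # feature vectors of past inputs
        self.results = []   # e.g., classifier outputs for those inputs

    def get(self, query: np.ndarray):
        for key, result in zip(self.keys, self.results):
            if np.linalg.norm(query - key) <= self.threshold:
                return result              # approximate hit
        return None                        # miss -> run full DL inference

    def put(self, key: np.ndarray, result) -> None:
        self.keys.append(key)
        self.results.append(result)

cache = ApproximateKeyCache(threshold=0.1)
cache.put(np.array([0.50, 0.20]), "class_A")
print(cache.get(np.array([0.52, 0.21])))   # 'class_A' (approximate hit)
print(cache.get(np.array([0.90, 0.90])))   # None (miss)
```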
arXiv Detail & Related papers (2021-12-13T13:49:11Z)
- Distributed Reinforcement Learning for Privacy-Preserving Dynamic Edge Caching [91.50631418179331]
A privacy-preserving distributed deep policy gradient (P2D3PG) is proposed to maximize the cache hit rates of devices in the MEC networks.
We convert the distributed optimizations into model-free Markov decision process problems and then introduce a privacy-preserving federated learning method for popularity prediction.
arXiv Detail & Related papers (2021-10-20T02:48:27Z)
- Caching Placement and Resource Allocation for Cache-Enabling UAV NOMA Networks [87.6031308969681]
This article investigates cache-enabling unmanned aerial vehicle (UAV) cellular networks with massive access capability supported by non-orthogonal multiple access (NOMA).
We formulate the long-term caching placement and resource allocation optimization problem for content delivery delay minimization as a Markov decision process (MDP).
We propose a Q-learning based caching placement and resource allocation algorithm, where the UAV learns and selects actions with a soft $\varepsilon$-greedy strategy to search for the optimal match between actions and states.
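As a generic illustration of the underlying machinery (not the paper's formulation), the sketch below runs tabular Q-learning with an $\varepsilon$-greedy action rule on a made-up state/action space; the paper's soft $\varepsilon$-greedy variant may assign exploration probabilities differently.

```python
import numpy as np

rng = np.random.default_rng(0)

def epsilon_greedy(q_row: np.ndarray, epsilon: float) -> int:
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_row)))
    return int(np.argmax(q_row))

def q_update(Q, s, a, reward, s_next, alpha=0.1, gamma=0.9):
    """Tabular Q-learning update for one (state, action) transition."""
    Q[s, a] += alpha * (reward + gamma * Q[s_next].max() - Q[s, a])

# Hypothetical toy problem: 4 cache states x 3 placement/allocation actions.
Q = np.zeros((4, 3))
state = 0
for _ in range(100):
    action = epsilon_greedy(Q[state], epsilon=0.1)
    # Hypothetical environment feedback: lower delivery delay => higher reward.
    reward = rng.random()
    next_state = int(rng.integers(4))
    q_update(Q, state, action, reward, next_state)
    state = next_state
```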
arXiv Detail & Related papers (2020-08-12T08:33:51Z)
- Reinforcement Learning for Caching with Space-Time Popularity Dynamics [61.55827760294755]
Caching is envisioned to play a critical role in next-generation networks.
To intelligently prefetch and store contents, a cache node should be able to learn what and when to cache.
This chapter presents a versatile reinforcement learning based approach for near-optimal caching policy design.
arXiv Detail & Related papers (2020-05-19T01:23:51Z)
- PA-Cache: Evolving Learning-Based Popularity-Aware Content Caching in Edge Networks [14.939950326112045]
We propose PA-Cache, an evolving learning-based content caching policy for edge networks.
It adaptively learns time-varying content popularity and determines which contents should be replaced when the cache is full.
We extensively evaluate the performance of our proposed PA-Cache on real-world traces from a large online video-on-demand service provider.
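The minimal sketch below shows only the popularity-aware replacement step (evict the least popular content when the cache is full), with hypothetical popularity scores supplied externally; PA-Cache itself learns these time-varying popularities with an evolving model.

```python
class PopularityAwareCache:
    """When the cache is full, evict the content with the lowest popularity score.

    A minimal sketch of popularity-aware replacement; PA-Cache additionally
    learns time-varying popularity rather than taking scores as given.
    """

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.store = {}          # content_id -> content
        self.popularity = {}     # content_id -> current popularity estimate

    def update_popularity(self, content_id: str, score: float) -> None:
        self.popularity[content_id] = score

    def put(self, content_id: str, content) -> None:
        if content_id not in self.store and len(self.store) >= self.capacity:
            victim = min(self.store, key=lambda c: self.popularity.get(c, 0.0))
            del self.store[victim]
        self.store[content_id] = content

cache = PopularityAwareCache(capacity=2)
cache.update_popularity("a", 0.9)
cache.update_popularity("b", 0.2)
cache.update_popularity("c", 0.7)
cache.put("a", "video_a")
cache.put("b", "video_b")
cache.put("c", "video_c")   # cache full -> evicts 'b' (lowest popularity)
print(sorted(cache.store))  # ['a', 'c']
```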
arXiv Detail & Related papers (2020-02-20T15:38:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.