Online Caching with Optimal Switching Regret
- URL: http://arxiv.org/abs/2101.07043v1
- Date: Mon, 18 Jan 2021 12:47:22 GMT
- Title: Online Caching with Optimal Switching Regret
- Authors: Samrat Mukhopadhyay, Abhishek Sinha
- Abstract summary: A cache of limited storage capacity can hold $C$ files at a time from a large catalog.
In the case of a cache-hit, the policy receives a unit reward and zero rewards otherwise.
The objective is to design a caching policy that incurs minimal regret while considering both the rewards due to cache-hits and the switching cost due to the file fetches.
- Score: 10.447270433913134
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the classical uncoded caching problem from an online learning
point-of-view. A cache of limited storage capacity can hold $C$ files at a time
from a large catalog. A user requests an arbitrary file from the catalog at
each time slot. Before the file request from the user arrives, a caching policy
populates the cache with any $C$ files of its choice. In the case of a
cache-hit, the policy receives a unit reward and zero rewards otherwise. In
addition to that, there is a cost associated with fetching files to the cache,
which we refer to as the switching cost. The objective is to design a caching
policy that incurs minimal regret while considering both the rewards due to
cache-hits and the switching cost due to the file fetches. The main
contribution of this paper is the switching regret analysis of a Follow the
Perturbed Leader-based anytime caching policy, which is shown to have an order
optimal switching regret. In this pursuit, we improve the best-known switching
regret bound for this problem by a factor of $\Theta(\sqrt{C}).$ We conclude
the paper by comparing the performance of different popular caching policies
using a publicly available trace from a commercial CDN server.
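A minimal sketch may help make the policy concrete. The idea behind FTPL caching is to perturb the cumulative request counts with noise and cache the $C$ files with the largest perturbed counts, re-drawing the perturbation at each slot. The Python sketch below is illustrative only: the function name, the $\sqrt{t}$ Gaussian perturbation schedule, and the hit/fetch accounting are assumptions chosen to match the problem setup above, not the paper's exact construction or constants.

```python
import numpy as np

def ftpl_cache_sketch(requests, catalog_size, cache_size, seed=0):
    """Illustrative Follow-the-Perturbed-Leader caching sketch.

    Before each request arrives, perturb the cumulative request counts
    with Gaussian noise whose scale grows like sqrt(t) (an 'anytime'
    schedule, assumed here for illustration), then cache the
    cache_size files with the largest perturbed counts. Returns the
    total cache-hits and file fetches; the switching cost is
    proportional to the fetches.
    """
    rng = np.random.default_rng(seed)
    counts = np.zeros(catalog_size)        # cumulative requests per file
    cache, hits, fetches = set(), 0, 0
    for t, req in enumerate(requests, start=1):
        scores = counts + np.sqrt(t) * rng.standard_normal(catalog_size)
        top = np.argpartition(scores, -cache_size)[-cache_size:]
        new_cache = set(top.tolist())
        fetches += len(new_cache - cache)  # files fetched into the cache
        cache = new_cache
        hits += int(req in cache)          # unit reward on a cache-hit
        counts[req] += 1                   # request revealed after caching
    return hits, fetches
```

On a request trace, comparing `hits` against the hit count of the best fixed cache in hindsight (the `cache_size` most-requested files), with `fetches` weighted by the per-file switching cost, gives an empirical estimate of the switching regret.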
Related papers
- On the Regret of Coded Caching with Adversarial Requests [7.171698704686835]
We study the well-known coded caching problem in an online learning framework, wherein requests arrive sequentially.
We introduce a caching policy based on the Follow-The-Perturbed-Leader principle and show that for any time horizon $T$, it achieves a sub-linear regret of $\mathcal{O}(\sqrt{T})$.
arXiv Detail & Related papers (2024-09-19T01:13:03Z)
- Efficient Inference of Vision Instruction-Following Models with Elastic Cache [76.44955111634545]
We introduce Elastic Cache, a novel strategy for efficient deployment of instruction-following large vision-language models.
We propose an importance-driven cache merging strategy to prune redundant caches.
For instruction encoding, we use frequency to evaluate the importance of caches.
Results on a range of LVLMs demonstrate that Elastic Cache not only boosts efficiency but also notably outperforms existing pruning methods in language generation.
arXiv Detail & Related papers (2024-07-25T15:29:05Z)
- Training-Free Exponential Context Extension via Cascading KV Cache [49.608367376911694]
We introduce a novel mechanism that leverages cascading sub-cache buffers to selectively retain the most relevant tokens.
Our method reduces prefill stage latency by a factor of 6.8 when compared to flash attention on 1M tokens.
arXiv Detail & Related papers (2024-06-24T03:59:17Z)
- Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference [78.65321721142624]
We focus on a memory bottleneck imposed by the key-value (KV) cache.
Existing KV cache methods approach this problem by pruning or evicting large swaths of relatively less important KV pairs.
We propose LESS, a simple integration of a constant-sized cache with eviction-based cache methods.
arXiv Detail & Related papers (2024-02-14T18:54:56Z)
- A Learning-Based Caching Mechanism for Edge Content Delivery [2.412158290827225]
With 5G networks and the rise of the Internet of Things (IoT), content delivery is increasingly extending into the network edge.
This shift introduces unique challenges, particularly due to the limited cache storage and the diverse request patterns at the edge.
We introduce HR-Cache, a learning-based caching framework grounded in the principles of Hazard Rate (HR) ordering.
arXiv Detail & Related papers (2024-02-05T08:06:03Z)
- Optimistic No-regret Algorithms for Discrete Caching [6.182368229968862]
We take a systematic look at the problem of storing whole files in a cache with limited capacity in the context of optimistic learning.
We provide a universal lower bound for prediction-assisted online caching and design a suite of policies with a range of performance-complexity trade-offs.
Our results substantially improve upon all recently-proposed online caching policies, which, being unable to exploit the oracle predictions, offer only $O(\sqrt{T})$ regret.
arXiv Detail & Related papers (2022-08-15T09:18:41Z)
- Accelerating Deep Learning Classification with Error-controlled Approximate-key Caching [72.50506500576746]
We propose a novel caching paradigm that we name approximate-key caching.
While approximate cache hits alleviate the DL inference workload and increase system throughput, they introduce an approximation error.
We analytically model the performance of our caching system for classic LRU and ideal caches, perform a trace-driven evaluation of the expected performance, and compare the benefits of our approach with state-of-the-art similarity caching.
arXiv Detail & Related papers (2021-12-13T13:49:11Z)
- LeadCache: Regret-Optimal Caching in Networks [8.208569626646034]
We propose an efficient online caching policy based on the Follow-the-Perturbed-Leader paradigm.
We show that $\texttt{LeadCache}$ is regret-optimal up to a factor of $\tilde{O}(n^{3/8})$, where $n$ is the number of users.
arXiv Detail & Related papers (2020-09-17T12:13:26Z)
- Reinforcement Learning for Caching with Space-Time Popularity Dynamics [61.55827760294755]
Caching is envisioned to play a critical role in next-generation networks.
To intelligently prefetch and store contents, a cache node should be able to learn what and when to cache.
This chapter presents a versatile reinforcement learning based approach for near-optimal caching policy design.
arXiv Detail & Related papers (2020-05-19T01:23:51Z)
- A Non-Stationary Bandit-Learning Approach to Energy-Efficient Femto-Caching with Rateless-Coded Transmission [98.47527781626161]
We study a resource allocation problem for joint caching and transmission in small cell networks.
We then formulate the problem as selecting a file from the cache, together with a transmission power level, for every broadcast round.
In contrast to the state-of-the-art research, the proposed approach is especially suitable for networks with time-variant statistical properties.
arXiv Detail & Related papers (2020-04-13T09:07:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.