No-Regret Caching via Online Mirror Descent
- URL: http://arxiv.org/abs/2101.12588v5
- Date: Tue, 6 Jun 2023 14:06:09 GMT
- Title: No-Regret Caching via Online Mirror Descent
- Authors: T. Si Salem, G. Neglia and S. Ioannidis
- Abstract summary: We study an online caching problem in which requests can be served by a local cache to avoid retrieval costs from a remote server.
We show that bounds for the regret crucially depend on the diversity of the request process, captured by the diversity ratio R/h.
We also prove that, when the cache must store the entire file, rather than a fraction, OMD strategies can be coupled with a randomized rounding scheme that preserves regret guarantees.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study an online caching problem in which requests can be served by a local
cache to avoid retrieval costs from a remote server. The cache can update its
state after a batch of requests and store an arbitrarily small fraction of each
file. We study no-regret algorithms based on Online Mirror Descent (OMD)
strategies. We show that bounds for the regret crucially depend on the
diversity of the request process, captured by the diversity ratio R/h, where R
is the size of the batch, and h is the maximum multiplicity of a request in a
given batch. We characterize the optimality of OMD caching policies w.r.t.
regret under different diversity regimes. We also prove that, when the cache
must store the entire file, rather than a fraction, OMD strategies can be
coupled with a randomized rounding scheme that preserves regret guarantees,
even when update costs cannot be neglected. We provide a formal
characterization of the rounding problem through optimal transport theory, and
moreover we propose a computationally efficient randomized rounding scheme.
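To make the OMD recipe concrete, below is a minimal sketch of the fractional-caching update with the Euclidean mirror map, i.e., online gradient ascent followed by projection onto the capacity-capped simplex. The paper analyzes a family of mirror maps and tunes the step size via the diversity ratio R/h; the function names, the bisection-based projection, and the fixed step size `eta` here are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def project_capped_simplex(y, k, iters=60):
    # Euclidean projection onto {x : 0 <= x <= 1, sum(x) = k}: bisect on the
    # threshold tau such that sum(clip(y - tau, 0, 1)) = k (monotone in tau).
    lo, hi = y.min() - 1.0, y.max()
    for _ in range(iters):
        tau = 0.5 * (lo + hi)
        if np.clip(y - tau, 0.0, 1.0).sum() > k:
            lo = tau
        else:
            hi = tau
    return np.clip(y - 0.5 * (lo + hi), 0.0, 1.0)

def omd_caching(batches, n_files, k, eta=0.1):
    # Euclidean-OMD (online gradient ascent) on batched requests:
    # `batches` yields length-n_files request-count vectors; x is the
    # fractional cache state (x_i = fraction of file i stored locally).
    x = np.full(n_files, k / n_files)        # uniform feasible start
    for counts in batches:
        served = counts @ x                  # requests served from the cache
        x = project_capped_simplex(x + eta * counts, k)
        yield x, served
```

Swapping the Euclidean mirror map for negative entropy turns this additive update into a multiplicative one; the optimality characterization in the abstract concerns which geometry is preferable in each diversity regime.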
Related papers
- On the Regret of Coded Caching with Adversarial Requests [7.171698704686835]
We study the well-known coded caching problem in an online learning framework, wherein requests arrive sequentially.
We introduce a caching policy based on the Follow-The-Perturbed-Leader principle and show that, for any time horizon T, it achieves a sub-linear regret of $\mathcal{O}(\sqrt{T})$.
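For intuition on the FTPL principle, here is a bare whole-file caching sketch: perturb cumulative request counts with a one-shot Gaussian draw and cache the k highest-scoring files. The `sqrt(t)` noise schedule and the hit-count bookkeeping are common illustrative choices and an assumption here, not the paper's coded-caching algorithm.

```python
import numpy as np

def ftpl_cache(request_stream, n_files, k, seed=0):
    # Follow-The-Perturbed-Leader for whole-file caching: each round, cache
    # the k files whose perturbed cumulative counts are highest.
    rng = np.random.default_rng(seed)
    gamma = rng.standard_normal(n_files)   # sampled once, reused every round
    counts = np.zeros(n_files)
    hits = 0
    for t, f in enumerate(request_stream, start=1):
        noise_scale = np.sqrt(t)           # one standard schedule (assumption)
        cache = np.argpartition(counts + noise_scale * gamma, -k)[-k:]
        hits += int(f in cache)            # count the hit, then learn
        counts[f] += 1
    return hits
```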
arXiv Detail & Related papers (2024-09-19T01:13:03Z) - Efficient Inference of Vision Instruction-Following Models with Elastic Cache [76.44955111634545]
We introduce Elastic Cache, a novel strategy for efficient deployment of instruction-following large vision-language models.
We propose an importance-driven cache merging strategy to prune redundant caches.
For instruction encoding, we use frequency to evaluate the importance of caches.
Results on a range of LVLMs demonstrate that Elastic Cache not only boosts efficiency but also notably outperforms existing pruning methods in language generation.
arXiv Detail & Related papers (2024-07-25T15:29:05Z) - Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference [78.65321721142624]
We focus on a memory bottleneck imposed by the key-value (KV) cache.
Existing KV cache methods approach this problem by pruning or evicting large swaths of relatively less important KV pairs.
We propose LESS, a simple integration of a constant-sized cache with eviction-based cache methods.
arXiv Detail & Related papers (2024-02-14T18:54:56Z) - No-Regret Caching with Noisy Request Estimates [12.603423174002254]
We propose the Noisy-Follow-the-Perturbed-Leader (NFPL) algorithm, a variant of the classic Follow-The-Perturbed-Leader (FPL) algorithm for settings where request estimates are noisy.
We show that the proposed solution has sublinear regret under specific conditions on the request estimator.
arXiv Detail & Related papers (2023-09-05T08:57:35Z) - Unrolled Compressed Blind-Deconvolution [77.88847247301682]
Sparse multichannel blind deconvolution (S-MBD) arises frequently in many engineering applications such as radar/sonar/ultrasound imaging.
We propose a compression method that enables blind recovery from far fewer measurements than the full received signal in time.
arXiv Detail & Related papers (2022-09-28T15:16:58Z) - Optimistic No-regret Algorithms for Discrete Caching [6.182368229968862]
We take a systematic look at the problem of storing whole files in a cache with limited capacity in the context of optimistic learning.
We provide a universal lower bound for prediction-assisted online caching and design a suite of policies with a range of performance-complexity trade-offs.
Our results substantially improve upon all recently proposed online caching policies, which, being unable to exploit the oracle predictions, offer only $O(\sqrt{T})$ regret.
arXiv Detail & Related papers (2022-08-15T09:18:41Z) - Accelerating Deep Learning Classification with Error-controlled Approximate-key Caching [72.50506500576746]
We propose a novel caching paradigm that we name approximate-key caching.
While approximate cache hits alleviate the DL inference workload and increase system throughput, they introduce an approximation error.
We analytically model our caching system's performance for classic LRU and ideal caches, perform a trace-driven evaluation of the expected performance, and compare the benefits of our proposed approach with state-of-the-art similarity caching.
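The general idea can be illustrated with a toy approximate-key LRU: quantize the input features into a coarse key so that nearby inputs hit the same entry, trading approximation error for hit rate. The class below is a hypothetical sketch, not the paper's system; the grid-quantization key is an assumption standing in for its similarity-based lookup.

```python
from collections import OrderedDict
import numpy as np

class ApproxKeyLRU:
    # Approximate-key cache: inputs map to a coarse key by quantization, so
    # nearby feature vectors share a cached prediction. A coarser grid raises
    # the hit rate but also the approximation error.

    def __init__(self, capacity, grid=0.5):
        self.capacity, self.grid = capacity, grid
        self.store = OrderedDict()

    def _key(self, features):
        return tuple(np.round(np.asarray(features) / self.grid).astype(int))

    def get(self, features):
        key = self._key(features)
        if key in self.store:
            self.store.move_to_end(key)    # refresh LRU position
            return self.store[key]
        return None                        # approximate miss

    def put(self, features, prediction):
        key = self._key(features)
        self.store[key] = prediction
        self.store.move_to_end(key)
        if len(self.store) > self.capacity:
            self.store.popitem(last=False) # evict least-recently used entry
```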
arXiv Detail & Related papers (2021-12-13T13:49:11Z) - ARCH: Efficient Adversarial Regularized Training with Caching [91.74682538906691]
Adversarial regularization can improve model generalization in many natural language processing tasks.
We propose a new adversarial regularization method ARCH, where perturbations are generated and cached once every several epochs.
We evaluate our proposed method on a set of neural machine translation and natural language understanding tasks.
arXiv Detail & Related papers (2021-09-15T02:05:37Z) - Cache Replacement as a MAB with Delayed Feedback and Decaying Costs [4.358626952482686]
We propose and solve a new variant of the well-known multi-armed bandit (MAB) problem.
Each arm represents a distinct cache replacement policy, which advises on the page to evict from the cache when needed.
We introduce an adaptive reinforcement learning algorithm EXP4-DFDC that provides a solution to the problem.
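As a simplified illustration of treating replacement policies as arms, here is a bare exponential-weights (EXP3-style) selector; EXP4-DFDC additionally handles expert advice, delayed feedback, and decaying costs, all of which this sketch omits. The `reward_fn` callback and all names are hypothetical.

```python
import numpy as np

def policy_selector(policies, rounds, reward_fn, eta=0.1, seed=0):
    # Exponential-weights bandit over cache replacement policies (the arms).
    # reward_fn(i, t) returns the observed hit/miss reward in [0, 1] of
    # running policy i at round t (assumed immediate, unlike the paper).
    rng = np.random.default_rng(seed)
    w = np.ones(len(policies))
    for t in range(rounds):
        p = w / w.sum()
        i = rng.choice(len(policies), p=p)
        r = reward_fn(i, t)
        w[i] *= np.exp(eta * r / p[i])     # importance-weighted update
    return policies[int(np.argmax(w))]
```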
arXiv Detail & Related papers (2020-09-23T18:26:48Z) - A Non-Stationary Bandit-Learning Approach to Energy-Efficient Femto-Caching with Rateless-Coded Transmission [98.47527781626161]
We study a resource allocation problem for joint caching and transmission in small cell networks.
We then formulate the problem as selecting a file from the cache together with a transmission power level for every broadcast round.
In contrast to the state-of-the-art research, the proposed approach is especially suitable for networks with time-variant statistical properties.
arXiv Detail & Related papers (2020-04-13T09:07:17Z)