A\c{C}AI: Ascent Similarity Caching with Approximate Indexes
- URL: http://arxiv.org/abs/2107.00957v1
- Date: Fri, 2 Jul 2021 10:40:47 GMT
- Title: A\c{C}AI: Ascent Similarity Caching with Approximate Indexes
- Authors: Tareq Si Salem, Giovanni Neglia, Damiano Carra
- Abstract summary: Similarity search is a key operation in multimedia retrieval systems and recommender systems, and it will play an important role also for future machine learning and augmented reality applications.
AcCAI is a new similarity caching policy which improves on the state of the art by using (i) an (approximate) index for the whole catalog to decide which objects to serve locally and which to retrieve from the remote server, and (ii) a mirror ascent algorithm to update the set of local objects with strong guarantees even when the request process does not exhibit any statistical regularity.
- Score: 12.450760567361531
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Similarity search is a key operation in multimedia retrieval systems and
recommender systems, and it will play an important role also for future machine
learning and augmented reality applications. When these systems need to serve
large objects with tight delay constraints, edge servers close to the end-user
can operate as similarity caches to speed up the retrieval. In this paper we
present A\c{C}AI, a new similarity caching policy which improves on the state
of the art by using (i) an (approximate) index for the whole catalog to decide
which objects to serve locally and which to retrieve from the remote server,
and (ii) a mirror ascent algorithm to update the set of local objects with
strong guarantees even when the request process does not exhibit any
statistical regularity.
Related papers
- EPIC: Efficient Position-Independent Context Caching for Serving Large Language Models [19.510078997414606]
EPIC introduces position-independent context caching for large language models.
EPIC delivers up to 8x improvements in TTFT and 7x throughput over existing systems.
arXiv Detail & Related papers (2024-10-20T08:42:29Z) - Attention-Enhanced Prioritized Proximal Policy Optimization for Adaptive Edge Caching [4.2579244769567675]
We introduce a Proximal Policy Optimization (PPO)-based caching strategy that fully considers file attributes like lifetime, size, and priority.
Our method outperforms a recent Deep Reinforcement Learning-based technique.
arXiv Detail & Related papers (2024-02-08T17:17:46Z) - Efficient Reinforcement Learning for Routing Jobs in Heterogeneous Queueing Systems [21.944723061337267]
We consider the problem of efficiently routing jobs that arrive into a central queue to a system of heterogeneous servers.
Unlike homogeneous systems, a threshold policy, that routes jobs to the slow server(s) when the queue length exceeds a certain threshold, is known to be optimal for the one-fast-one-slow two-server system.
We propose ACHQ, an efficient policy gradient based algorithm with a low dimensional soft threshold policy parameterization.
arXiv Detail & Related papers (2024-02-02T05:22:41Z) - Efficient Cloud-edge Collaborative Inference for Object
Re-identification [27.952445808987036]
We pioneer a cloud-edge collaborative inference framework for ReID systems.
We propose a distribution-aware correlation modeling network (DaCM) to make the desired image return to the cloud server.
DaCM embeds the spatial-temporal correlations implicitly included in the timestamps into a graph structure, and it can be applied in the cloud to regulate the size of the upload window.
arXiv Detail & Related papers (2024-01-04T02:56:50Z) - Temporal-aware Hierarchical Mask Classification for Video Semantic
Segmentation [62.275143240798236]
Video semantic segmentation dataset has limited categories per video.
Less than 10% of queries could be matched to receive meaningful gradient updates during VSS training.
Our method achieves state-of-the-art performance on the latest challenging VSS benchmark VSPW without bells and whistles.
arXiv Detail & Related papers (2023-09-14T20:31:06Z) - Accelerating Deep Learning Classification with Error-controlled
Approximate-key Caching [72.50506500576746]
We propose a novel caching paradigm, that we named approximate-key caching.
While approximate cache hits alleviate DL inference workload and increase the system throughput, they however introduce an approximation error.
We analytically model our caching system performance for classic LRU and ideal caches, we perform a trace-driven evaluation of the expected performance, and we compare the benefits of our proposed approach with the state-of-the-art similarity caching.
arXiv Detail & Related papers (2021-12-13T13:49:11Z) - MD-CSDNetwork: Multi-Domain Cross Stitched Network for Deepfake
Detection [80.83725644958633]
Current deepfake generation methods leave discriminative artifacts in the frequency spectrum of fake images and videos.
We present a novel approach, termed as MD-CSDNetwork, for combining the features in the spatial and frequency domains to mine a shared discriminative representation.
arXiv Detail & Related papers (2021-09-15T14:11:53Z) - CREPO: An Open Repository to Benchmark Credal Network Algorithms [78.79752265884109]
Credal networks are imprecise probabilistic graphical models based on, so-called credal, sets of probability mass functions.
A Java library called CREMA has been recently released to model, process and query credal networks.
We present CREPO, an open repository of synthetic credal networks, provided together with the exact results of inference tasks on these models.
arXiv Detail & Related papers (2021-05-10T07:31:59Z) - IRLI: Iterative Re-partitioning for Learning to Index [104.72641345738425]
Methods have to trade between obtaining high accuracy while maintaining load balance and scalability in distributed settings.
We propose a novel approach called IRLI, which iteratively partitions the items by learning the relevant buckets directly from the query-item relevance data.
We mathematically show that IRLI retrieves the correct item with high probability under very natural assumptions and provides superior load balancing.
arXiv Detail & Related papers (2021-03-17T23:13:25Z) - Caching Placement and Resource Allocation for Cache-Enabling UAV NOMA
Networks [87.6031308969681]
This article investigates the cache-enabling unmanned aerial vehicle (UAV) cellular networks with massive access capability supported by non-orthogonal multiple access (NOMA)
We formulate the long-term caching placement and resource allocation optimization problem for content delivery delay minimization as a Markov decision process (MDP)
We propose a Q-learning based caching placement and resource allocation algorithm, where the UAV learns and selects action with emphsoft $varepsilon$-greedy strategy to search for the optimal match between actions and states.
arXiv Detail & Related papers (2020-08-12T08:33:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.