Compressing (Multidimensional) Learned Bloom Filters
- URL: http://arxiv.org/abs/2208.03029v1
- Date: Fri, 5 Aug 2022 07:54:48 GMT
- Title: Compressing (Multidimensional) Learned Bloom Filters
- Authors: Angjela Davitkova, Damjan Gjurovski, Sebastian Michel
- Abstract summary: Querying a Bloom filter reveals either that an element is definitely not in the underlying set or that it is included, subject to a certain error rate.
Deep learning models are used to solve this membership testing problem.
We show that the benefits of learned Bloom filters are apparent only when considering a vast amount of data.
- Score: 7.6058140480517356
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bloom filters are widely used data structures that compactly represent sets
of elements. Querying a Bloom filter reveals if an element is not included in
the underlying set or is included with a certain error rate. This membership
testing can be modeled as a binary classification problem and solved through
deep learning models, leading to what is called learned Bloom filters. We have
identified that the benefits of learned Bloom filters are apparent only when
considering a vast amount of data, and even then, there is a possibility to
further reduce their memory consumption. For that reason, we introduce a
lossless input compression technique that improves the memory consumption of
the learned model while preserving a comparable model accuracy. We evaluate our
approach and show significant memory consumption improvements over learned
Bloom filters.
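To make the membership-as-classification idea concrete, below is a minimal, illustrative sketch of the standard learned Bloom filter construction that this paper builds on: a classifier pre-filter plus a small backup Bloom filter that catches the classifier's false negatives. The toy classifier, threshold, and sizing are assumptions for illustration only; they are not the paper's model or its lossless input compression technique.

```python
# Minimal learned Bloom filter sketch (classifier pre-filter + backup Bloom filter).
# The "model" below is a hash-based stand-in, purely illustrative.
import hashlib
import math


class BloomFilter:
    def __init__(self, n_items: int, fp_rate: float):
        # Standard sizing formulas for m bits and k hash functions.
        self.m = max(1, math.ceil(-n_items * math.log(fp_rate) / (math.log(2) ** 2)))
        self.k = max(1, round(self.m / n_items * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, item: str):
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item: str):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item: str):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


class LearnedBloomFilter:
    """Classifier pre-filter + backup Bloom filter for the classifier's false negatives."""

    def __init__(self, keys, score, threshold, backup_fp_rate=0.01):
        self.score = score          # model: item -> membership score in [0, 1]
        self.threshold = threshold  # scores above this are answered "maybe in set"
        false_negatives = [k for k in keys if score(k) <= threshold]
        self.backup = BloomFilter(max(1, len(false_negatives)), backup_fp_rate)
        for k in false_negatives:
            self.backup.add(k)

    def __contains__(self, item):
        # No false negatives: every key is caught by the model or the backup filter.
        return self.score(item) > self.threshold or item in self.backup


# Toy usage with a stand-in "model" (a hash-based score, illustrative only).
keys = [f"key-{i}" for i in range(1000)]
score = lambda x: (hash(x) % 100) / 100.0
lbf = LearnedBloomFilter(keys, score, threshold=0.5)
assert all(k in lbf for k in keys)   # membership queries never miss true keys
```

The backup filter guarantees no false negatives, while the learned pre-filter means the backup only has to store the keys the model misses; this is where the potential memory savings of learned Bloom filters come from.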
Related papers
- Cascaded Learned Bloom Filter for Optimal Model-Filter Size Balance and Fast Rejection [12.555117983678624]
We propose the Cascaded Learned Bloom Filter (CLBF) to balance model and filter sizes while enabling fast rejection.
Our dynamic programming-based optimization automatically selects configurations that achieve an optimal balance between the model and filter sizes.
Experiments on real-world datasets show that CLBF reduces memory usage by up to 24% and decreases reject time by up to 14 times compared to state-of-the-art learned Bloom filters.
arXiv Detail & Related papers (2025-02-06T01:05:41Z)
- A Privacy Model for Classical & Learned Bloom Filters [0.0]
The Classical Bloom Filter (CBF) is a class of Probabilistic Data Structures (PDS).
The Learned Bloom Filter (LBF) is a recently proposed class of PDS that combines the Classical Bloom Filter with a Learning Model.
arXiv Detail & Related papers (2025-01-27T03:35:25Z)
- Adversary Resilient Learned Bloom Filters [0.4910937238451484]
We define a strong adversarial model for the Learned Bloom Filter.
Using our model, we construct an adversary-resilient variant of the Learned Bloom Filter called Downtown Bodega Filter.
arXiv Detail & Related papers (2024-09-10T14:37:43Z)
- A Critical Analysis of Classifier Selection in Learned Bloom Filters [0.3359875577705538]
"Complexity" of the data used to build the filter might heavily impact on its performance.
We propose a novel methodology, supported by software, for designing, analyzing and implementing Learned Bloom Filters.
Experiments show that the proposed methodology and the supporting software are valid and useful.
arXiv Detail & Related papers (2022-11-28T17:17:18Z)
- Unrolled Compressed Blind-Deconvolution [77.88847247301682]
Sparse multichannel blind deconvolution (S-MBD) arises in many engineering applications such as radar, sonar, and ultrasound imaging.
We propose a compression method that enables blind recovery from far fewer measurements than the full received time-domain signal.
arXiv Detail & Related papers (2022-09-28T15:16:58Z)
- Learning Versatile Convolution Filters for Efficient Visual Recognition [125.34595948003745]
This paper introduces versatile filters to construct efficient convolutional neural networks.
We conduct a theoretical analysis of network complexity and introduce an efficient convolution scheme.
Experimental results on benchmark datasets and neural networks demonstrate that our versatile filters achieve accuracy comparable to that of the original filters.
arXiv Detail & Related papers (2021-09-20T06:07:14Z)
- Unsupervised Outlier Detection using Memory and Contrastive Learning [53.77693158251706]
We think outlier detection can be done in the feature space by measuring the feature distance between outliers and inliers.
We propose a framework, MCOD, using a memory module and a contrastive learning module.
Our proposed MCOD achieves strong performance and outperforms nine state-of-the-art methods.
arXiv Detail & Related papers (2021-07-27T07:35:42Z)
- Data Agnostic Filter Gating for Efficient Deep Networks [72.4615632234314]
Current filter pruning methods mainly leverage feature maps to generate importance scores for filters and prune those with smaller scores.
In this paper, we propose a data agnostic filter pruning method that uses an auxiliary network named Dagger module to induce pruning.
In addition, to help prune filters with certain FLOPs constraints, we leverage an explicit FLOPs-aware regularization to directly promote pruning filters toward target FLOPs.
arXiv Detail & Related papers (2020-10-28T15:26:40Z)
- Training Interpretable Convolutional Neural Networks by Differentiating Class-specific Filters [64.46270549587004]
Convolutional neural networks (CNNs) have been successfully used in a range of tasks.
However, CNNs are often viewed as "black boxes" and lack interpretability.
We propose a novel strategy to train interpretable CNNs by encouraging class-specific filters.
arXiv Detail & Related papers (2020-07-16T09:12:26Z)
- Partitioned Learned Bloom Filter [31.748077944821315]
We show how to frame the problem of optimal model utilization as an optimization problem.
We derive algorithms that can achieve near-optimal performance in many cases; see the query sketch after this list.
arXiv Detail & Related papers (2020-06-05T00:05:32Z)
- Filter Grafting for Deep Neural Networks: Reason, Method, and Cultivation [86.91324735966766]
Filters are the key components of modern convolutional neural networks (CNNs).
In this paper, we introduce filter grafting to improve the representation capability of CNNs by re-activating unimportant filters.
We develop a novel criterion to measure the information of filters and an adaptive weighting strategy to balance the grafted information among networks.
arXiv Detail & Related papers (2020-04-26T08:36:26Z)
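For the Partitioned Learned Bloom Filter entry above, the following is a hedged sketch of the query path implied by partitioning: the classifier's score range is split into regions and each region gets its own backup Bloom filter. It reuses the BloomFilter class and the toy score and keys from the sketch after the abstract; the thresholds and per-region false positive rates are placeholder assumptions, not the optimized values that the paper derives.

```python
# Sketch of a partitioned learned Bloom filter (reuses BloomFilter, keys, and
# score() from the earlier sketch). Thresholds and per-region false positive
# rates are illustrative placeholders, not the paper's optimized solution.
import bisect


class PartitionedLBF:
    def __init__(self, keys, score, thresholds, region_fp_rates):
        # thresholds split the score range into len(thresholds)+1 regions;
        # region_fp_rates[i] is the backup filter's target FPR in region i.
        self.score = score
        self.thresholds = thresholds
        self.filters = []
        for i, fp in enumerate(region_fp_rates):
            region_keys = [k for k in keys if self._region(score(k)) == i]
            bf = BloomFilter(max(1, len(region_keys)), fp)
            for k in region_keys:
                bf.add(k)
            self.filters.append(bf)

    def _region(self, s):
        return bisect.bisect_right(self.thresholds, s)

    def __contains__(self, item):
        # Route the query to the backup filter of the item's score region.
        return item in self.filters[self._region(self.score(item))]


# Placeholder partition: three regions, with looser filters for higher scores.
plbf = PartitionedLBF(keys, score, thresholds=[0.3, 0.7],
                      region_fp_rates=[0.001, 0.01, 0.1])
assert all(k in plbf for k in keys)   # still no false negatives
```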