On the Choice of General Purpose Classifiers in Learned Bloom Filters: An Initial Analysis Within Basic Filters
- URL: http://arxiv.org/abs/2112.06563v1
- Date: Mon, 13 Dec 2021 11:15:41 GMT
- Title: On the Choice of General Purpose Classifiers in Learned Bloom Filters: An Initial Analysis Within Basic Filters
- Authors: Giacomo Fumagalli, Davide Raimondi, Raffaele Giancarlo, Dario Malchiodi, Marco Frasca
- Abstract summary: Several Learned versions of Bloom Filters have been considered, yielding advantages over classic Filters.
Each of them uses a classifier, which is the Learned part of the data structure.
No systematic study of which specific classifier to use in which circumstances is available.
- Score: 0.41998444721319217
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bloom Filters are a fundamental and pervasive data structure. Within the
growing area of Learned Data Structures, several Learned versions of Bloom
Filters have been considered, yielding advantages over classic Filters. Each of
them uses a classifier, which is the Learned part of the data structure.
Although it has a central role in those new filters, and its space footprint as
well as classification time may affect the performance of the Learned Filter,
no systematic study of which specific classifier to use in which circumstances
is available. We report progress in this area here, providing also initial
guidelines on which classifier to choose among five classic classification
paradigms.
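
One common construction for a basic Learned Bloom Filter pairs the classifier with a small backup filter: the classifier screens membership queries, and the keys it would reject are stored in a classic Bloom filter, so no false negative is possible. The Python sketch below is only illustrative of that standard design; the class names, the hash scheme, the featurize helper and the threshold tau are choices made here for exposition, not the paper's setup.

```python
# A minimal sketch of a basic Learned Bloom Filter, assuming the common
# "classifier + backup filter" design: the classifier screens queries, and the
# keys it misses are stored in a classic Bloom filter, so no false negative is
# possible. Class names, hashing, featurize and tau are illustrative choices.
import hashlib
import math


class BloomFilter:
    """Classic Bloom filter: k hash functions over m bits."""

    def __init__(self, n_items: int, fp_rate: float):
        # Standard sizing: m = -n ln(p) / (ln 2)^2, k = (m / n) ln 2.
        self.m = max(1, math.ceil(-n_items * math.log(fp_rate) / math.log(2) ** 2))
        self.k = max(1, round(self.m / max(n_items, 1) * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, item: str):
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


class LearnedBloomFilter:
    """Classifier front end plus a backup Bloom filter for the keys the
    classifier rejects; membership queries therefore never miss a stored key."""

    def __init__(self, classifier, featurize, keys, non_keys, tau=0.5, backup_fp=0.01):
        self.clf = classifier        # any model exposing fit / predict_proba
        self.featurize = featurize   # maps an item to a numeric feature vector
        self.tau = tau               # acceptance threshold on the key score
        X = [featurize(x) for x in keys] + [featurize(x) for x in non_keys]
        y = [1] * len(keys) + [0] * len(non_keys)
        self.clf.fit(X, y)
        # Keys scored below tau would be false negatives; store them in a backup filter.
        scores = self.clf.predict_proba([featurize(x) for x in keys])[:, 1]
        missed = [x for x, s in zip(keys, scores) if s < tau]
        self.backup = BloomFilter(max(len(missed), 1), backup_fp)
        for x in missed:
            self.backup.add(x)

    def __contains__(self, item: str) -> bool:
        score = self.clf.predict_proba([self.featurize(item)])[0][1]
        return score >= self.tau or item in self.backup
```

The classifier passed in is exactly the degree of freedom the paper investigates: its accuracy determines how many keys spill into the backup filter, while its space footprint and classification time add to the overall cost of the Learned Filter.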
Related papers
- Who's in and who's out? A case study of multimodal CLIP-filtering in DataComp [13.749279800238092]
We show that image-text data filtering has biases and is value-laden.
Data relating to several imputed demographic groups are associated with higher rates of exclusion.
Our conclusions point to a need for fundamental changes in dataset creation and filtering practices.
arXiv Detail & Related papers (2024-05-13T21:53:06Z)
- Graph-based Extreme Feature Selection for Multi-class Classification Tasks [7.863638253070439]
This work focuses on a graph-based filter feature selection method suited for multi-class classification tasks.
We aim to drastically reduce the number of selected features, in order to create a sketch of the original data that encodes valuable information for the classification task.
arXiv Detail & Related papers (2023-03-03T09:06:35Z)
- Anomaly Detection using Ensemble Classification and Evidence Theory [62.997667081978825]
We present a novel approach for anomaly detection using ensemble classification and evidence theory.
A pool selection strategy is presented to build a solid ensemble classifier.
We use uncertainty for the anomaly detection approach.
arXiv Detail & Related papers (2022-12-23T00:50:41Z)
- A Critical Analysis of Classifier Selection in Learned Bloom Filters [0.3359875577705538]
"Complexity" of the data used to build the filter might heavily impact on its performance.
We propose a novel methodology, supported by software, for designing, analyzing and implementing Learned Bloom Filters.
Experiments show that the proposed methodology and the supporting software are valid and useful.
arXiv Detail & Related papers (2022-11-28T17:17:18Z)
- Graph Filters for Signal Processing and Machine Learning on Graphs [83.29608206147515]
We provide a comprehensive overview of graph filters, including the different filtering categories, design strategies for each type, and trade-offs between different types of graph filters.
We discuss how to extend graph filters into filter banks and graph neural networks to enhance the representational power.
Our aim is for this article to provide a unifying framework and a common understanding for both beginner and experienced researchers.
arXiv Detail & Related papers (2022-11-16T11:56:45Z)
- Compressing (Multidimensional) Learned Bloom Filters [7.6058140480517356]
A Bloom filter either reveals with certainty that an element is not in the underlying set, or reports it as included subject to a bounded error rate (the usage snippet after this list illustrates these guarantees).
Deep learning models are used to solve this membership testing problem.
We show that the benefits of learned Bloom filters are apparent only when considering a vast amount of data.
arXiv Detail & Related papers (2022-08-05T07:54:48Z)
- Learning Versatile Convolution Filters for Efficient Visual Recognition [125.34595948003745]
This paper introduces versatile filters to construct efficient convolutional neural networks.
We conduct a theoretical analysis of network complexity and introduce an efficient convolution scheme.
Experimental results on benchmark datasets and neural networks demonstrate that our versatile filters achieve accuracy comparable to that of the original filters.
arXiv Detail & Related papers (2021-09-20T06:07:14Z)
- Sequence-Based Filtering for Visual Route-Based Navigation: Analysing the Benefits, Trade-offs and Design Choices [17.48671856442762]
An emerging trend in Visual Place Recognition (VPR) is the use of sequence-based filtering methods on top of single-frame-based place matching techniques.
This paper conducts an in-depth investigation of the relationship between the performance of single-frame-based place matching techniques and the use of sequence-based filtering on top of those methods.
arXiv Detail & Related papers (2021-03-02T19:24:58Z)
- Training Interpretable Convolutional Neural Networks by Differentiating Class-specific Filters [64.46270549587004]
Convolutional neural networks (CNNs) have been successfully used in a range of tasks.
CNNs are often viewed as "black boxes" and lack interpretability.
We propose a novel strategy to train interpretable CNNs by encouraging class-specific filters.
arXiv Detail & Related papers (2020-07-16T09:12:26Z)
- Deep Learning feature selection to unhide demographic recommender systems factors [63.732639864601914]
The matrix factorization model generates factors which do not incorporate semantic knowledge.
DeepUnHide is able to extract demographic information from the users and items factors in collaborative filtering recommender systems.
arXiv Detail & Related papers (2020-06-17T17:36:48Z)
- Dependency Aware Filter Pruning [74.69495455411987]
Pruning a proportion of unimportant filters is an efficient way to mitigate the inference cost.
Previous work prunes filters according to their weight norms or the corresponding batch-norm scaling factors.
We propose a novel mechanism to dynamically control the sparsity-inducing regularization so as to achieve the desired sparsity.
arXiv Detail & Related papers (2020-05-06T07:41:22Z)
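
As a follow-up, the snippet below is again only a sketch: it assumes the BloomFilter/LearnedBloomFilter classes from the example after the abstract are in scope and that scikit-learn is installed, and it uses synthetic data and a toy featurizer invented here. It plugs three off-the-shelf classifiers into the same filter, in the spirit of the comparison the abstract calls for, and checks the two guarantees mentioned above: no false negatives on the stored keys and an empirical false-positive rate on held-out non-keys.

```python
# Illustrative only: synthetic keys, a toy featurizer and three classifier
# choices picked here for the example; the paper's experimental setup differs.
# Assumes the BloomFilter / LearnedBloomFilter sketch above is in scope.
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB


def featurize(item: str):
    # Toy features: string length, dash count and digit count.
    return [len(item), item.count("-"), sum(c.isdigit() for c in item)]


keys = [f"key-{i}" for i in range(2000)]
non_keys = [f"other-{i}" for i in range(2000)]
queries = [f"other-{i}" for i in range(2000, 4000)]  # held-out non-keys

for clf in (LogisticRegression(max_iter=1000),
            RandomForestClassifier(n_estimators=20),
            GaussianNB()):
    lbf = LearnedBloomFilter(clf, featurize, keys, non_keys, tau=0.5, backup_fp=0.01)
    assert all(k in lbf for k in keys)                   # no false negatives, by construction
    fpr = sum(q in lbf for q in queries) / len(queries)  # empirical false-positive rate
    print(f"{type(clf).__name__}: empirical FPR = {fpr:.3f}")
```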