Dynamic Maintenance of Kernel Density Estimation Data Structure: From
Practice to Theory
- URL: http://arxiv.org/abs/2208.03915v2
- Date: Tue, 13 Feb 2024 19:59:56 GMT
- Title: Dynamic Maintenance of Kernel Density Estimation Data Structure: From
Practice to Theory
- Authors: Jiehao Liang, Zhao Song, Zhaozhuo Xu, Junze Yin, Danyang Zhuo
- Abstract summary: Kernel density estimation (KDE) stands out as a challenging task in machine learning.
Recently, there has been a growing trend of using data structures for efficient KDE.
In this work, we focus on the dynamic maintenance of KDE data structures with robustness to adversarial queries.
- Score: 16.460386321121945
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Kernel density estimation (KDE) stands out as a challenging task in machine
learning. The problem is defined in the following way: given a kernel function
$f(x,y)$ and a set of points $\{x_1, x_2, \cdots, x_n \} \subset \mathbb{R}^d$,
we would like to compute $\frac{1}{n}\sum_{i=1}^{n} f(x_i,y)$ for any query
point $y \in \mathbb{R}^d$. Recently, there has been a growing trend of using
data structures for efficient KDE. However, the proposed KDE data structures
focus on static settings. The robustness of KDE data structures under
dynamically changing data distributions has not been addressed. In this work,
we focus on the dynamic maintenance of KDE data structures with robustness to
adversarial queries. Specifically, we provide a theoretical framework for KDE
data structures. In our framework, the KDE data structure requires only
subquadratic space. Moreover, our data structure supports dynamic updates of
the dataset in sublinear time. Furthermore, it can answer adaptive queries from
a potential adversary in sublinear time.
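For concreteness, the following is a minimal Python sketch of the KDE query
defined above, $\frac{1}{n}\sum_{i=1}^{n} f(x_i,y)$, wrapped in a naive dynamic
baseline that supports insertions and deletions. The Gaussian kernel and the
class below are illustrative assumptions rather than the paper's construction:
this baseline uses linear space and linear query time, whereas the data
structure described in the abstract targets subquadratic space with sublinear
updates and sublinear adaptive queries.

```python
import numpy as np


def gaussian_kernel(x, y, bandwidth=1.0):
    # Illustrative choice of kernel f(x, y); any bounded kernel could be plugged in.
    return np.exp(-np.linalg.norm(x - y) ** 2 / (2.0 * bandwidth ** 2))


class NaiveDynamicKDE:
    """Naive baseline: stores the raw points and evaluates the KDE sum directly.

    Queries cost O(n * d) time and updates cost O(d); the paper's data structure
    targets subquadratic space and sublinear query/update time instead.
    """

    def __init__(self, kernel=gaussian_kernel):
        self.kernel = kernel
        self.points = []  # the dataset {x_1, ..., x_n} subset of R^d

    def insert(self, x):
        self.points.append(np.asarray(x, dtype=float))

    def delete(self, index):
        # Remove the point at a given position in the dataset.
        self.points.pop(index)

    def query(self, y):
        # Return (1/n) * sum_i f(x_i, y) for a query point y in R^d.
        y = np.asarray(y, dtype=float)
        if not self.points:
            raise ValueError("dataset is empty")
        return sum(self.kernel(x, y) for x in self.points) / len(self.points)


# Usage: insert 100 random points in R^5, then evaluate the KDE at the origin.
kde = NaiveDynamicKDE()
for x in np.random.randn(100, 5):
    kde.insert(x)
print(kde.query(np.zeros(5)))
```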
Related papers
- Statistical-Computational Trade-offs for Density Estimation [60.81548752871115]
We show that, for a broad class of data structures, their bounds cannot be significantly improved.
This is a novel statistical-computational trade-off for density estimation.
arXiv Detail & Related papers (2024-10-30T15:03:33Z)
- Conditional Density Estimation with Histogram Trees [3.5297361401370044]
Conditional density estimation (CDE) goes beyond regression by modeling the full conditional distribution.
Current methods typically employ kernel-based approaches, using kernel functions directly for kernel density estimation or as basis functions in linear models.
We propose the Conditional Density Tree (CDTree), a fully non-parametric model consisting of a decision tree in which each leaf is formed by a histogram model.
arXiv Detail & Related papers (2024-10-15T09:53:24Z)
- MOKD: Cross-domain Finetuning for Few-shot Classification via Maximizing Optimized Kernel Dependence [97.93517982908007]
In cross-domain few-shot classification, NCC aims to learn representations to construct a metric space where few-shot classification can be performed.
In this paper, we find that there exist high similarities between NCC-learned representations of two samples from different classes.
We propose a bi-level optimization framework, maximizing optimized kernel dependence (MOKD), to learn a set of class-specific representations that match the cluster structures indicated by labeled data.
arXiv Detail & Related papers (2024-05-29T05:59:52Z)
- Deep Equilibrium Based Neural Operators for Steady-State PDEs [100.88355782126098]
We study the benefits of weight-tied neural network architectures for steady-state PDEs.
We propose FNO-DEQ, a deep equilibrium variant of the FNO architecture that directly solves for the solution of a steady-state PDE.
arXiv Detail & Related papers (2023-11-30T22:34:57Z)
- Fast Private Kernel Density Estimation via Locality Sensitive Quantization [10.227538355037554]
We study efficient mechanisms for differentially private kernel density estimation (DP-KDE).
We show how the kernel can privately be approximated in time linear in $d$, making it feasible for high-dimensional data.
arXiv Detail & Related papers (2023-07-04T18:48:04Z)
- Differentiable and Transportable Structure Learning [73.84540901950616]
We introduce D-Struct, which recovers transportability in the discovered structures through a novel architecture and loss function.
Because D-Struct remains differentiable, our method can be easily adopted in existing differentiable architectures.
arXiv Detail & Related papers (2022-06-13T17:50:53Z)
- ISDE: Independence Structure Density Estimation [0.0]
Multidimensional density estimation suffers from the curse of dimensionality.
We propose ISDE (Independence Structure Density Estimation), an algorithm designed to estimate a density and an undirected graphical model.
We show how it performs both quantitatively and qualitatively on real-world datasets.
arXiv Detail & Related papers (2022-03-18T08:01:04Z)
- Generative Adversarial Learning via Kernel Density Discrimination [32.91091065436645]
We introduce Kernel Density Discrimination GAN (KDD GAN), a novel method for generative adversarial learning.
We define the Kernel Density Estimates directly in feature space and forgo the requirement of invertibility of the kernel feature mappings.
We show a boost in the quality of generated samples with respect to FID from 10% to 40% compared to the baseline.
arXiv Detail & Related papers (2021-07-13T15:52:10Z)
- DEANN: Speeding up Kernel-Density Estimation using Approximate Nearest Neighbor Search [8.25574589820305]
We present an algorithm called Density Estimation from Approximate Nearest Neighbors (DEANN).
We apply ANN algorithms as a black-box subroutine to compute an unbiased kernel density estimate (KDE).
We show that our implementation outperforms state-of-the-art implementations on all high-dimensional datasets we considered.
arXiv Detail & Related papers (2021-07-06T17:11:28Z)
- Deep Parametric Continuous Convolutional Neural Networks [92.87547731907176]
Parametric Continuous Convolution is a new learnable operator that operates over non-grid structured data.
Our experiments show significant improvement over the state-of-the-art in point cloud segmentation of indoor and outdoor scenes.
arXiv Detail & Related papers (2021-01-17T18:28:23Z)
- Improved guarantees and a multiple-descent curve for Column Subset Selection and the Nyström method [76.73096213472897]
We develop techniques which exploit spectral properties of the data matrix to obtain improved approximation guarantees.
Our approach leads to significantly better bounds for datasets with known rates of singular value decay.
We show that both our improved bounds and the multiple-descent curve can be observed on real datasets simply by varying the RBF parameter.
arXiv Detail & Related papers (2020-02-21T00:43:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.