Improved Estimation of Concentration Under $\ell_p$-Norm Distance
Metrics Using Half Spaces
- URL: http://arxiv.org/abs/2103.12913v1
- Date: Wed, 24 Mar 2021 01:16:28 GMT
- Title: Improved Estimation of Concentration Under $\ell_p$-Norm Distance
Metrics Using Half Spaces
- Authors: Jack Prescott, Xiao Zhang, David Evans
- Abstract summary: Concentration of measure has been argued to be the fundamental cause of adversarial vulnerability.
We propose a method to estimate the concentration of any empirical dataset under $\ell_p$-norm distance metrics.
Our proposed algorithm is more efficient than Mahloujifar et al.'s, and our experiments on synthetic datasets and image benchmarks demonstrate that it is able to find much tighter intrinsic robustness bounds.
- Score: 14.947511752748005
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Concentration of measure has been argued to be the fundamental cause of
adversarial vulnerability. Mahloujifar et al. presented an empirical way to
measure the concentration of a data distribution using samples, and employed it
to find lower bounds on intrinsic robustness for several benchmark datasets.
However, it remains unclear whether these lower bounds are tight enough to
provide a useful approximation for the intrinsic robustness of a dataset. To
gain a deeper understanding of the concentration of measure phenomenon, we
first extend the Gaussian Isoperimetric Inequality to non-spherical Gaussian
measures and arbitrary $\ell_p$-norms ($p \geq 2$). We leverage these
theoretical insights to design a method that uses half-spaces to estimate the
concentration of any empirical dataset under $\ell_p$-norm distance metrics.
Our proposed algorithm is more efficient than Mahloujifar et al.'s, and our
experiments on synthetic datasets and image benchmarks demonstrate that it is
able to find much tighter intrinsic robustness bounds. These tighter estimates
provide further evidence that rules out intrinsic dataset concentration as a
possible explanation for the adversarial vulnerability of state-of-the-art
classifiers.
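As a concrete illustration of the classical (spherical, $\ell_2$) case that the paper generalizes: if a set $A$ has standard-Gaussian measure $\alpha$, the Gaussian Isoperimetric Inequality lower-bounds the measure of its $\epsilon$-expansion by $\Phi(\Phi^{-1}(\alpha) + \epsilon)$, where $\Phi$ is the standard normal CDF. A minimal numerical sketch of this bound (not the paper's estimation algorithm):

```python
from statistics import NormalDist


def expansion_lower_bound(alpha: float, eps: float) -> float:
    """Gaussian Isoperimetric Inequality, spherical l2 case:
    any set A with standard-Gaussian measure alpha satisfies
    mu(A_eps) >= Phi(Phi^{-1}(alpha) + eps), where A_eps is the
    eps-expansion of A and Phi is the standard normal CDF."""
    nd = NormalDist()  # standard normal: mean 0, sigma 1
    return nd.cdf(nd.inv_cdf(alpha) + eps)


# An error region of measure 1% expands, under an l2 perturbation
# budget of 1.0, to a region of measure at least ~9.2%:
print(expansion_lower_bound(0.01, 1.0))
```

Read as a robustness statement: the larger this lower bound, the more any classifier with a 1%-measure error region must be vulnerable within the given perturbation budget.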
Related papers
- ConjNorm: Tractable Density Estimation for Out-of-Distribution Detection [41.41164637577005]
Post-hoc out-of-distribution (OOD) detection has garnered intensive attention in reliable machine learning.
We propose a novel theoretical framework grounded in Bregman divergence to provide a unified perspective on density-based score design.
We show that our proposed ConjNorm establishes a new state-of-the-art across a variety of OOD detection setups.
arXiv Detail & Related papers (2024-02-27T21:02:47Z)
- Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets [101.5329678997916]
We study episodic two-player zero-sum Markov games (MGs) in the offline setting.
The goal is to find an approximate Nash equilibrium (NE) policy pair based on a dataset collected a priori.
arXiv Detail & Related papers (2022-02-15T15:39:30Z)
- Nyström Kernel Mean Embeddings [92.10208929236826]
We propose an efficient approximation procedure based on the Nyström method.
It yields sufficient conditions on the subsample size to obtain the standard $n^{-1/2}$ rate.
We discuss applications of this result for the approximation of the maximum mean discrepancy and quadrature rules.
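As a rough sketch of the Nyström idea behind this entry (not the paper's procedure; the RBF kernel, bandwidth `gamma`, and landmark subsample `Z` are illustrative assumptions), the kernel mean embedding is approximated by averaging finite-dimensional Nyström features, and the MMD by the distance between the two averaged feature vectors:

```python
import numpy as np


def nystrom_features(X, Z, gamma=1.0):
    """Nystrom feature map phi(x) = K_mm^{-1/2} k(Z, x) for an RBF
    kernel, where Z holds m landmark points subsampled from the data."""
    def rbf(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)

    K_mm = rbf(Z, Z)                      # m x m landmark kernel matrix
    w, V = np.linalg.eigh(K_mm)
    w = np.clip(w, 1e-12, None)           # guard against tiny eigenvalues
    K_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T
    return rbf(X, Z) @ K_inv_sqrt         # n x m feature matrix


def mmd_nystrom(X, Y, Z, gamma=1.0):
    """Approximate MMD(X, Y) as the distance between Nystrom mean embeddings."""
    mu_x = nystrom_features(X, Z, gamma).mean(axis=0)
    mu_y = nystrom_features(Y, Z, gamma).mean(axis=0)
    return np.linalg.norm(mu_x - mu_y)
```

The cost is driven by the $m \times m$ eigendecomposition rather than an $n \times n$ kernel matrix, which is the point of subsampling.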
arXiv Detail & Related papers (2022-01-31T08:26:06Z)
- Uniform Concentration Bounds toward a Unified Framework for Robust Clustering [21.789311405437573]
Recent advances in center-based clustering continue to improve upon the drawbacks of Lloyd's celebrated $k$-means algorithm.
Various methods seek to address poor local minima, sensitivity to outliers, and data that are not well-suited to Euclidean measures of fit.
This paper proposes a cohesive robust framework for center-based clustering under a general class of dissimilarity measures.
arXiv Detail & Related papers (2021-10-27T03:43:44Z)
- Learning to Estimate Without Bias [57.82628598276623]
The Gauss-Markov theorem states that the weighted least squares estimator is the linear minimum variance unbiased estimator (MVUE) in linear models.
In this paper, we take a first step towards extending this result to nonlinear settings via deep learning with bias constraints.
A second motivation for BCE is in applications where multiple estimates of the same unknown are averaged for improved performance.
arXiv Detail & Related papers (2021-10-24T10:23:51Z)
- Density-Based Clustering with Kernel Diffusion [59.4179549482505]
A naive density corresponding to the indicator function of a unit $d$-dimensional Euclidean ball is commonly used in density-based clustering algorithms.
We propose a new kernel diffusion density function, which is adaptive to data of varying local distributional characteristics and smoothness.
arXiv Detail & Related papers (2021-10-11T09:00:33Z)
- Featurized Density Ratio Estimation [82.40706152910292]
In our work, we propose to leverage an invertible generative model to map the two distributions into a common feature space prior to estimation.
This featurization brings the densities closer together in latent space, sidestepping pathological scenarios where the learned density ratios in input space can be arbitrarily inaccurate.
At the same time, the invertibility of our feature map guarantees that the ratios computed in feature space are equivalent to those in input space.
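The invariance claim in the last sentence follows from the change-of-variables formula: for an invertible featurization $z = f(x)$, both pushforward densities pick up the same Jacobian factor, which cancels in the ratio,

```latex
\frac{p_f(z)}{q_f(z)}
  = \frac{p\left(f^{-1}(z)\right)\,\left|\det J_{f^{-1}}(z)\right|}
         {q\left(f^{-1}(z)\right)\,\left|\det J_{f^{-1}}(z)\right|}
  = \frac{p(x)}{q(x)},
```

so the density ratio is preserved exactly in feature space.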
arXiv Detail & Related papers (2021-07-05T18:30:26Z)
- Distributionally Robust Local Non-parametric Conditional Estimation [22.423052432220235]
We propose a new distributionally robust estimator that generates non-parametric local estimates.
We show that despite being generally intractable, the local estimator can be efficiently found via convex optimization.
Experiments with synthetic and MNIST datasets show the competitive performance of this new class of estimators.
arXiv Detail & Related papers (2020-10-12T00:11:17Z)
- Nonparametric Density Estimation from Markov Chains [68.8204255655161]
We introduce a new nonparametric density estimator inspired by Markov chains, generalizing the well-known Kernel Density Estimator.
Our estimator presents several benefits with respect to the usual ones and can be used straightforwardly as a foundation in all density-based algorithms.
arXiv Detail & Related papers (2020-09-08T18:33:42Z)
- Robust Kernel Density Estimation with Median-of-Means principle [0.0]
We introduce a robust nonparametric density estimator combining the popular Kernel Density Estimation method and the Median-of-Means principle (MoM-KDE).
This estimator is shown to achieve robustness to any kind of anomalous data, even in the case of adversarial contamination.
arXiv Detail & Related papers (2020-06-30T08:01:07Z)
- Generalization Bounds in the Presence of Outliers: a Median-of-Means Study [8.905677748354364]
The Median-of-Means (MoM) is an estimator of the mean $\theta$ of a square-integrable random variable $Z$.
Thanks to the high confidence it achieves on heavy-tailed data, MoM has found various applications in machine learning.
A new line of work is now trying to characterize and leverage MoM's ability to deal with corrupted data.
arXiv Detail & Related papers (2020-06-09T13:21:39Z)
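As a sketch of the basic MoM construction that the last two entries build on (the block count and data below are illustrative assumptions, not from either paper):

```python
from statistics import mean, median


def median_of_means(sample, k):
    """Median-of-Means: partition the sample into k equal-size blocks,
    average within each block, and return the median of the block means.
    A few corrupted points can spoil at most a few block means, which
    the median then ignores."""
    n = len(sample)
    block = n // k
    means = [mean(sample[i * block:(i + 1) * block]) for i in range(k)]
    return median(means)


# 99 clean observations plus one gross outlier:
data = [1.0] * 99 + [1e6]
print(mean(data))                 # badly skewed by the outlier
print(median_of_means(data, 10))  # -> 1.0
```

The outlier lands in a single block, so only that one block mean is corrupted and the median of the remaining block means recovers the clean value.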
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.