Weighted least-squares approximation with determinantal point processes and generalized volume sampling
- URL: http://arxiv.org/abs/2312.14057v3
- Date: Thu, 21 Mar 2024 08:29:32 GMT
- Title: Weighted least-squares approximation with determinantal point processes and generalized volume sampling
- Authors: Anthony Nouy, Bertrand Michel,
- Abstract summary: We consider the problem of approximating a function from $L2$ by an element of a given $m$-dimensional space $V_m$.
We show that the approximation is almost surely bounded by the best approximation error measured in the $H$-norm.
- Score: 33.33724208084121
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of approximating a function from $L^2$ by an element of a given $m$-dimensional space $V_m$, associated with some feature map $\varphi$, using evaluations of the function at random points $x_1,\dots,x_n$. After recalling some results on optimal weighted least-squares using independent and identically distributed points, we consider weighted least-squares using projection determinantal point processes (DPP) or volume sampling. These distributions introduce dependence between the points that promotes diversity in the selected features $\varphi(x_i)$. We first provide a generalized version of volume-rescaled sampling yielding quasi-optimality results in expectation with a number of samples $n = O(m\log(m))$, that means that the expected $L^2$ error is bounded by a constant times the best approximation error in $L^2$. Also, further assuming that the function is in some normed vector space $H$ continuously embedded in $L^2$, we further prove that the approximation is almost surely bounded by the best approximation error measured in the $H$-norm. This includes the cases of functions from $L^\infty$ or reproducing kernel Hilbert spaces. Finally, we present an alternative strategy consisting in using independent repetitions of projection DPP (or volume sampling), yielding similar error bounds as with i.i.d. or volume sampling, but in practice with a much lower number of samples. Numerical experiments illustrate the performance of the different strategies.
Related papers
- On the query complexity of sampling from non-log-concave distributions [2.4253233571593547]
We study the problem of sampling from a $d$-dimensional distribution with density $p(x)propto e-f(x)$, which does not necessarily satisfy good isoperimetric conditions.
We show that for a wide range of parameters, sampling is strictly easier than optimization by a super-exponential factor in the dimension $d$.
arXiv Detail & Related papers (2025-02-10T06:54:16Z) - Entangled Mean Estimation in High-Dimensions [36.97113089188035]
We study the task of high-dimensional entangled mean estimation in the subset-of-signals model.
We show that the optimal error (up to polylogarithmic factors) is $f(alpha,N) + sqrtD/(alpha N)$, where the term $f(alpha,N)$ is the error of the one-dimensional problem and the second term is the sub-Gaussian error rate.
arXiv Detail & Related papers (2025-01-09T18:31:35Z) - Polynomial time sampling from log-smooth distributions in fixed dimension under semi-log-concavity of the forward diffusion with application to strongly dissipative distributions [9.48556659249574]
We provide a sampling algorithm with complexity in fixed dimension.
We prove that our algorithm achieves an expected $epsilon$ error in $KL$ divergence.
As an application, we derive an exponential complexity improvement for the problem of sampling from an $L$-log-smooth distribution.
arXiv Detail & Related papers (2024-12-31T17:51:39Z) - Best Policy Identification in Linear MDPs [70.57916977441262]
We investigate the problem of best identification in discounted linear Markov+Delta Decision in the fixed confidence setting under a generative model.
The lower bound as the solution of an intricate non- optimization program can be used as the starting point to devise such algorithms.
arXiv Detail & Related papers (2022-08-11T04:12:50Z) - Approximate Function Evaluation via Multi-Armed Bandits [51.146684847667125]
We study the problem of estimating the value of a known smooth function $f$ at an unknown point $boldsymbolmu in mathbbRn$, where each component $mu_i$ can be sampled via a noisy oracle.
We design an instance-adaptive algorithm that learns to sample according to the importance of each coordinate, and with probability at least $1-delta$ returns an $epsilon$ accurate estimate of $f(boldsymbolmu)$.
arXiv Detail & Related papers (2022-03-18T18:50:52Z) - An $L^2$ Analysis of Reinforcement Learning in High Dimensions with
Kernel and Neural Network Approximation [9.088303226909277]
This paper considers the situation where the function approximation is made using the kernel method or the two-layer neural network model.
We establish an $tildeO(H3|mathcal A|frac14n-frac14)$ bound for the optimal policy with $Hn$ samples.
Even though this result still requires a finite-sized action space, the error bound is independent of the dimensionality of the state space.
arXiv Detail & Related papers (2021-04-15T21:59:03Z) - Finding Global Minima via Kernel Approximations [90.42048080064849]
We consider the global minimization of smooth functions based solely on function evaluations.
In this paper, we consider an approach that jointly models the function to approximate and finds a global minimum.
arXiv Detail & Related papers (2020-12-22T12:59:30Z) - Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = langle X,w* rangle + epsilon$
We propose estimators for this problem under two settings: (i) $X$ is L4-L2 hypercontractive, $mathbbE [XXtop]$ has bounded condition number and $epsilon$ has bounded variance and (ii) $X$ is sub-Gaussian with identity second moment and $epsilon$ is
arXiv Detail & Related papers (2020-07-16T06:44:44Z) - Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and
Variance Reduction [63.41789556777387]
Asynchronous Q-learning aims to learn the optimal action-value function (or Q-function) of a Markov decision process (MDP)
We show that the number of samples needed to yield an entrywise $varepsilon$-accurate estimate of the Q-function is at most on the order of $frac1mu_min (1-gamma)5varepsilon2+ fract_mixmu_min (1-gamma)$ up to some logarithmic factor.
arXiv Detail & Related papers (2020-06-04T17:51:00Z) - Maximizing Determinants under Matroid Constraints [69.25768526213689]
We study the problem of finding a basis $S$ of $M$ such that $det(sum_i in Sv_i v_i v_itop)$ is maximized.
This problem appears in a diverse set of areas such as experimental design, fair allocation of goods, network design, and machine learning.
arXiv Detail & Related papers (2020-04-16T19:16:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.