Outlier detection in non-elliptical data by kernel MRCD
- URL: http://arxiv.org/abs/2008.02046v2
- Date: Mon, 29 Mar 2021 22:23:14 GMT
- Title: Outlier detection in non-elliptical data by kernel MRCD
- Authors: Joachim Schreurs, Iwein Vranckx, Mia Hubert, Johan A.K. Suykens, Peter
J. Rousseeuw
- Abstract summary: The Kernel Minimum Regularized Covariance Determinant (KMRCD) estimator is proposed.
It implicitly computes the MRCD estimates in a kernel induced feature space.
A fast algorithm is constructed that starts from kernel-based initial estimates and exploits the kernel trick to speed up the subsequent computations.
- Score: 10.69910379275607
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The minimum regularized covariance determinant method (MRCD) is a robust
estimator for multivariate location and scatter, which detects outliers by
fitting a robust covariance matrix to the data. Its regularization ensures that
the covariance matrix is well-conditioned in any dimension. The MRCD assumes
that the non-outlying observations are roughly elliptically distributed, but
many datasets are not of that form. Moreover, the computation time of MRCD
increases substantially when the number of variables goes up, and nowadays
datasets with many variables are common. The proposed Kernel Minimum
Regularized Covariance Determinant (KMRCD) estimator addresses both issues. It
is not restricted to elliptical data because it implicitly computes the MRCD
estimates in a kernel induced feature space. A fast algorithm is constructed
that starts from kernel-based initial estimates and exploits the kernel trick
to speed up the subsequent computations. Based on the KMRCD estimates, a rule
is proposed to flag outliers. The KMRCD algorithm performs well in simulations,
and is illustrated on real-life data.
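Below is a minimal sketch of the kernelized idea, assuming an RBF kernel and a simple concentration ("C-step"-like) iteration on feature-space distances to a subset mean. The function names, the plain (non-Mahalanobis) distances, and the quantile cutoff are illustrative simplifications, not the authors' exact KMRCD algorithm, which works with a regularized covariance in the feature space.

```python
import numpy as np

def rbf_kernel(X, Y=None, sigma=1.0):
    """Gaussian RBF kernel matrix k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    Y = X if Y is None else Y
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2 * X @ Y.T
    return np.exp(-sq / (2 * sigma**2))

def kernel_concentration_outliers(X, h_frac=0.75, sigma=1.0, cutoff_quantile=0.975, n_iter=50):
    """Toy kernelized concentration step: keep the h points closest (in feature
    space) to the mean of the current subset, iterate, then flag far points."""
    n = len(X)
    h = int(np.ceil(h_frac * n))
    K = rbf_kernel(X, sigma=sigma)
    diag = np.diag(K)

    # Initial subset: points closest to the overall kernel mean.
    d2 = diag - 2 * K.mean(1) + K.mean()
    subset = np.argsort(d2)[:h]

    for _ in range(n_iter):
        # Squared feature-space distance of every point to the subset mean.
        d2 = diag - 2 * K[:, subset].mean(1) + K[np.ix_(subset, subset)].mean()
        new_subset = np.argsort(d2)[:h]
        if set(new_subset) == set(subset):
            break
        subset = new_subset

    # Flag points whose distance exceeds a quantile of the clean-subset distances.
    cutoff = np.quantile(d2[subset], cutoff_quantile)
    return d2 > cutoff, d2

# Example: a non-elliptical (banana-shaped) inlier cloud plus a few planted outliers.
rng = np.random.default_rng(0)
t = rng.uniform(-3, 3, 200)
inliers = np.c_[t, 0.3 * t**2 + 0.1 * rng.standard_normal(200)]
outliers = rng.uniform(-3, 3, (10, 2)) + np.array([0, 4.0])
flags, _ = kernel_concentration_outliers(np.vstack([inliers, outliers]))
print(flags.sum(), "points flagged")
```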
Related papers
- Universality of kernel random matrices and kernel regression in the quadratic regime [18.51014786894174]
In this work, we extend the study of kernel regression to the quadratic regime.
We establish an operator norm approximation bound for the difference between the original kernel random matrix and a quadratic kernel random matrix.
We characterize the precise training and generalization errors for KRR in the quadratic regime when $n/d^2$ converges to a nonzero constant.
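As a rough numerical illustration (not the paper's equivalent-matrix construction), one can compare an inner-product kernel matrix with its second-order Taylor surrogate when n scales like d^2; the choice of kernel function f and the scaling below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 40
n = d**2 // 4                                   # quadratic regime: n proportional to d^2
X = rng.standard_normal((n, d)) / np.sqrt(d)    # rows have norm roughly 1

f = np.exp                                      # inner-product kernel k(x, y) = f(<x, y>)
G = X @ X.T                                     # Gram matrix; off-diagonal entries are small
K = f(G)

# Quadratic Taylor expansion of f around 0: f(t) ~ f(0) + f'(0) t + f''(0)/2 t^2.
# For f = exp: f(0) = f'(0) = f''(0) = 1.
K_quad = 1.0 + G + 0.5 * G**2

# Operator-norm gap between the kernel matrix and its quadratic surrogate.
err = np.linalg.norm(K - K_quad, 2)
print(f"d={d}, n={n}, ||K - K_quad||_op = {err:.3f}")
```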
arXiv Detail & Related papers (2024-08-02T07:29:49Z) - Rethinking k-means from manifold learning perspective [122.38667613245151]
We present a new clustering algorithm which directly detects clusters of data without mean estimation.
Specifically, we construct a distance matrix between data points using a Butterworth filter.
To fully exploit the complementary information embedded in different views, we leverage tensor Schatten p-norm regularization.
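A hedged guess at the distance-matrix construction mentioned above (the exact recipe is in the paper): apply a Butterworth-style low-pass response to pairwise Euclidean distances so nearby points get similarity close to 1 and far points decay smoothly. The `cutoff` and `order` parameters below are illustrative.

```python
import numpy as np

def butterworth_similarity(X, cutoff=1.0, order=4):
    """Similarity matrix S_ij = 1 / sqrt(1 + (d_ij / cutoff)^(2*order)),
    i.e. the magnitude response of a Butterworth filter applied to distances."""
    sq = (X**2).sum(1)[:, None] + (X**2).sum(1)[None, :] - 2 * X @ X.T
    D = np.sqrt(np.maximum(sq, 0.0))
    return 1.0 / np.sqrt(1.0 + (D / cutoff) ** (2 * order))

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(3, 0.3, (50, 2))])
S = butterworth_similarity(X, cutoff=1.0, order=4)
print(S[:50, :50].mean(), S[:50, 50:].mean())   # within-cluster vs between-cluster similarity
```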
arXiv Detail & Related papers (2023-05-12T03:01:41Z) - Kernel distance measures for time series, random fields and other
structured data [71.61147615789537]
kdiff is a novel kernel-based measure for estimating distances between instances of structured data.
It accounts for both self and cross similarities across the instances and is defined using a lower quantile of the distance distribution.
Some theoretical results are provided for separability conditions using kdiff as a distance measure for clustering and classification problems.
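The following is an illustrative quantile-based distance in the spirit of the summary, not kdiff's exact definition: take a lower quantile of cross-instance pairwise distances and centre it by the corresponding self-distance quantiles. The quantile level and centring are assumptions.

```python
import numpy as np

def quantile_distance(A, B, q=0.1):
    """Toy kdiff-style measure between two structured instances A and B
    (each an array of samples): lower-quantile cross distance, centred by
    the instances' own lower-quantile self distances."""
    def pairwise(U, V):
        return np.sqrt(((U[:, None, :] - V[None, :, :]) ** 2).sum(-1))

    cross = np.quantile(pairwise(A, B), q)
    self_a = np.quantile(pairwise(A, A), q)
    self_b = np.quantile(pairwise(B, B), q)
    return cross - 0.5 * (self_a + self_b)

rng = np.random.default_rng(3)
ts1 = np.cumsum(rng.standard_normal((200, 1)), axis=0)   # a random-walk "time series"
ts2 = np.cumsum(rng.standard_normal((200, 1)), axis=0)
print(quantile_distance(ts1, ts1), quantile_distance(ts1, ts2))
```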
arXiv Detail & Related papers (2021-09-29T22:54:17Z) - Optimal policy evaluation using kernel-based temporal difference methods [78.83926562536791]
We use reproducing kernel Hilbert spaces for estimating the value function of an infinite-horizon discounted Markov reward process.
We derive a non-asymptotic upper bound on the error with explicit dependence on the eigenvalues of the associated kernel operator.
We prove minimax lower bounds over sub-classes of MRPs.
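As context, here is a minimal kernelized LSTD-style fixed-point solve under assumed choices (RBF kernel, a simple ridge term): it enforces V(s_i) = r_i + gamma * V(s_i') at the sampled transitions. This illustrates kernel-based value estimation in general, not the exact estimator or bounds analyzed in the paper.

```python
import numpy as np

def rbf(A, B, sigma=0.5):
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-sq / (2 * sigma**2))

def kernel_td_value(states, next_states, rewards, gamma=0.9, lam=1e-3, sigma=0.5):
    """Solve the kernelized TD fixed point V(s_i) = r_i + gamma * V(s_i'),
    with V(s) = sum_j alpha_j k(s_j, s) and a ridge term for numerical stability."""
    n = len(states)
    K = rbf(states, states, sigma)              # K_ij = k(s_i, s_j)
    K_next = rbf(next_states, states, sigma)    # (K')_ij = k(s_i', s_j)
    alpha = np.linalg.solve(K - gamma * K_next + lam * n * np.eye(n), rewards)
    return lambda s: rbf(np.atleast_2d(s), states, sigma) @ alpha

# Toy MRP: a point drifting toward the origin on [0, 1], reward = -|s|.
rng = np.random.default_rng(4)
s = rng.uniform(0, 1, (500, 1))
s_next = 0.8 * s + 0.05 * rng.standard_normal((500, 1))
r = -np.abs(s).ravel()
V = kernel_td_value(s, s_next, r)
print(V([0.0]), V([1.0]))   # value near the origin should exceed value far away
```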
arXiv Detail & Related papers (2021-09-24T14:48:20Z) - SreaMRAK a Streaming Multi-Resolution Adaptive Kernel Algorithm [60.61943386819384]
Existing implementations of KRR require that all the data be stored in main memory.
We propose StreaMRAK - a streaming version of KRR.
We present a showcase study on two synthetic problems and the prediction of the trajectory of a double pendulum.
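This is not StreaMRAK itself, whose multi-resolution adaptive scheme is more involved; it is only a minimal streaming Nyström-style KRR sketch with a fixed set of landmark points, so each new sample updates small m x m sufficient statistics rather than requiring the full dataset in memory.

```python
import numpy as np

class StreamingNystromKRR:
    """Streaming kernel ridge regression on a fixed set of m landmark points:
    f(x) = k(x, Z) @ alpha, with sufficient statistics updated per sample."""

    def __init__(self, landmarks, lam=1e-2, sigma=0.5):
        self.Z, self.lam, self.sigma = landmarks, lam, sigma
        m = len(landmarks)
        self.A = np.zeros((m, m))   # sum_i k_i k_i^T
        self.b = np.zeros(m)        # sum_i y_i k_i
        self.K_zz = self._k(landmarks, landmarks)

    def _k(self, X, Y):
        sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2 * X @ Y.T
        return np.exp(-sq / (2 * self.sigma**2))

    def partial_fit(self, x, y):
        k = self._k(np.atleast_2d(x), self.Z).ravel()
        self.A += np.outer(k, k)
        self.b += y * k

    def predict(self, X):
        alpha = np.linalg.solve(self.A + self.lam * self.K_zz, self.b)
        return self._k(np.atleast_2d(X), self.Z) @ alpha

rng = np.random.default_rng(5)
Z = np.linspace(0, 1, 30)[:, None]
model = StreamingNystromKRR(Z)
for _ in range(2000):                      # stream samples one at a time
    x = rng.uniform(0, 1, (1, 1))
    model.partial_fit(x, np.sin(6 * x[0, 0]) + 0.1 * rng.standard_normal())
print(model.predict(np.array([[0.25], [0.75]])))
```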
arXiv Detail & Related papers (2021-08-23T21:03:09Z) - Kernel Mean Estimation by Marginalized Corrupted Distributions [96.9272743070371]
Estimating the kernel mean in a reproducing kernel Hilbert space is a critical component in many kernel learning algorithms.
We present a new kernel mean estimator, called the marginalized kernel mean estimator, which estimates the kernel mean under a corrupted distribution.
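For the RBF kernel with isotropic Gaussian corruption, the corruption can be marginalized in closed form (the bandwidth simply inflates), which is a convenient way to illustrate the idea; the corruption model and parameters below are assumptions, not the paper's exact estimator.

```python
import numpy as np

def kernel_mean(X, Y, sigma=1.0):
    """Empirical kernel mean embedding evaluated at points Y:
    mu_hat(y) = (1/n) sum_i exp(-||x_i - y||^2 / (2 sigma^2))."""
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2 * X @ Y.T
    return np.exp(-sq / (2 * sigma**2)).mean(0)

def marginalized_kernel_mean(X, Y, sigma=1.0, tau=0.5):
    """Kernel mean with isotropic Gaussian corruption N(0, tau^2 I) marginalized
    out analytically: the RBF bandwidth inflates from sigma^2 to sigma^2 + tau^2."""
    d = X.shape[1]
    s2 = sigma**2 + tau**2
    scale = (sigma**2 / s2) ** (d / 2)
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2 * X @ Y.T
    return scale * np.exp(-sq / (2 * s2)).mean(0)

rng = np.random.default_rng(6)
X = rng.standard_normal((300, 2))
grid = np.array([[0.0, 0.0], [2.0, 2.0]])
print(kernel_mean(X, grid), marginalized_kernel_mean(X, grid))
```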
arXiv Detail & Related papers (2021-07-10T15:11:28Z) - Faster Kernel Interpolation for Gaussian Processes [30.04235162264955]
A key challenge in scaling Gaussian process (GP) regression to massive datasets is that exact inference requires a dense n x n kernel matrix.
Structured kernel interpolation (SKI) is among the most scalable methods.
We show that the per-iteration cost of SKI can be reduced to O(m log m) after a single O(n)-time precomputation step.
We demonstrate speedups in practice for a wide range of m and n and apply the method to GP inference on a three-dimensional weather radar dataset with over 100 million points.
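A bare-bones 1D illustration of the SKI structure K ≈ W K_grid W^T with sparse linear interpolation weights follows; the real method additionally exploits the Toeplitz structure of K_grid (via FFT) to reach the O(m log m) matrix-vector products mentioned above, which this toy version does not do.

```python
import numpy as np

def interpolation_weights(x, grid):
    """Linear interpolation weights W (n x m): each row has two nonzeros
    placing x between its neighbouring grid points."""
    n, m = len(x), len(grid)
    W = np.zeros((n, m))
    idx = np.clip(np.searchsorted(grid, x) - 1, 0, m - 2)
    t = (x - grid[idx]) / (grid[idx + 1] - grid[idx])
    W[np.arange(n), idx] = 1 - t
    W[np.arange(n), idx + 1] = t
    return W

def rbf(a, b, sigma=0.2):
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * sigma**2))

rng = np.random.default_rng(7)
x = np.sort(rng.uniform(0, 1, 2000))
grid = np.linspace(0, 1, 100)            # m inducing grid points
W = interpolation_weights(x, grid)
K_grid = rbf(grid, grid)                 # Toeplitz for a regular grid

v = rng.standard_normal(len(x))
mvm_ski = W @ (K_grid @ (W.T @ v))       # approximate K v without forming K
mvm_exact = rbf(x, x) @ v
print(np.max(np.abs(mvm_ski - mvm_exact)) / np.max(np.abs(mvm_exact)))
```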
arXiv Detail & Related papers (2021-01-28T00:09:22Z) - Kernel Alignment Risk Estimator: Risk Prediction from Training Data [10.739602293023058]
We study the risk (i.e. generalization error) of Kernel Ridge Regression (KRR) for a kernel $K$ with ridge $\lambda>0$ and i.i.d. observations.
We introduce two objects: the Signal Capture Threshold (SCT) and the Kernel Alignment Risk Estimator (KARE).
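For concreteness, here is the KARE formula as I recall it (worth verifying against the paper): a ratio built from the training labels and the resolvent of the kernel matrix, requiring no held-out data.

```python
import numpy as np

def kare(K, y, lam):
    """Kernel Alignment Risk Estimator (as recalled; see the paper for the
    authoritative definition):
        rho = (1/n) y^T (K/n + lam I)^{-2} y / [ (1/n) tr((K/n + lam I)^{-1}) ]^2
    """
    n = len(y)
    R = np.linalg.inv(K / n + lam * np.eye(n))      # resolvent of K/n
    num = (y @ R @ R @ y) / n
    den = (np.trace(R) / n) ** 2
    return num / den

rng = np.random.default_rng(8)
X = rng.standard_normal((200, 5))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
sq = (X**2).sum(1)[:, None] + (X**2).sum(1)[None, :] - 2 * X @ X.T
K = np.exp(-sq / 2)
for lam in [1e-3, 1e-2, 1e-1]:
    print(lam, kare(K, y, lam))
```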
arXiv Detail & Related papers (2020-06-17T12:00:05Z) - Variance reduction for Random Coordinate Descent-Langevin Monte Carlo [7.464874233755718]
Langevin Monte Carlo (LMC) provides fast convergence but requires the computation of gradients.
In practice one uses finite-difference approximations as surrogates, and the method is expensive in high dimensions.
We introduce a new variance reduction approach, termed Random Coordinates Averaging Descent (RCAD), and incorporate it with both overdamped and underdamped LMC.
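A loose sketch of the idea as I read the summary (not the paper's exact RCAD update): run overdamped LMC but, instead of recomputing the full gradient each step, keep a surrogate gradient vector and refresh only one randomly chosen coordinate per iteration. The step size, target distribution, and refresh rule are illustrative assumptions.

```python
import numpy as np

def coordinate_refresh_lmc(grad_coord, d, n_steps=20000, eta=1e-3, seed=0):
    """Overdamped Langevin iterations driven by a surrogate gradient whose
    coordinates are refreshed one at a time (a variance-reduction flavoured
    sketch, not the paper's exact RCAD scheme)."""
    rng = np.random.default_rng(seed)
    x = np.zeros(d)
    g = np.array([grad_coord(x, i) for i in range(d)])   # initial full gradient
    samples = []
    for _ in range(n_steps):
        i = rng.integers(d)
        g[i] = grad_coord(x, i)                          # refresh one coordinate
        x = x - eta * g + np.sqrt(2 * eta) * rng.standard_normal(d)
        samples.append(x.copy())
    return np.array(samples)

# Target: standard Gaussian in d dimensions, so the gradient of -log density is x.
d = 10
samples = coordinate_refresh_lmc(lambda x, i: x[i], d)
print(samples[5000:].mean(0).round(2), samples[5000:].var(0).round(2))
```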
arXiv Detail & Related papers (2020-06-10T21:08:38Z) - Improved guarantees and a multiple-descent curve for Column Subset
Selection and the Nyström method [76.73096213472897]
We develop techniques which exploit spectral properties of the data matrix to obtain improved approximation guarantees.
Our approach leads to significantly better bounds for datasets with known rates of singular value decay.
We show that both our improved bounds and the multiple-descent curve can be observed on real datasets simply by varying the RBF parameter.
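A quick way to see the bandwidth dependence the summary alludes to: compute a Nyström approximation from a uniformly sampled column subset and track its error while the RBF parameter varies. The dataset, subset size, and uniform sampling are illustrative choices.

```python
import numpy as np

def rbf_kernel(X, Y, sigma):
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2 * X @ Y.T
    return np.exp(-sq / (2 * sigma**2))

def nystrom_error(X, k, sigma, seed=0):
    """Operator-norm error of the Nystrom approximation K ~ C W^+ C^T built
    from k uniformly sampled columns."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=k, replace=False)
    K = rbf_kernel(X, X, sigma)
    C = K[:, idx]
    W = K[np.ix_(idx, idx)]
    K_hat = C @ np.linalg.pinv(W) @ C.T
    return np.linalg.norm(K - K_hat, 2)

rng = np.random.default_rng(9)
X = rng.standard_normal((500, 10))
for sigma in [0.5, 1.0, 2.0, 4.0, 8.0]:
    print(sigma, nystrom_error(X, k=50, sigma=sigma))
```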
arXiv Detail & Related papers (2020-02-21T00:43:06Z) - Kernel Selection for Modal Linear Regression: Optimal Kernel and IRLS
Algorithm [8.571896191090744]
We show that a Biweight kernel is optimal in the sense of minimizing the mean squared error of the resulting MLR parameter estimate.
Secondly, we provide a kernel class for which the iteratively reweighted least-squares (IRLS) algorithm is guaranteed to converge.
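A compact IRLS-style sketch of modal linear regression with a Biweight kernel follows: repeatedly solve a weighted least-squares problem whose weights are kernel values of the current residuals. This is the standard EM/IRLS recipe for modal regression, with an assumed bandwidth and toy data, and not necessarily the exact variant the paper analyzes.

```python
import numpy as np

def biweight(u):
    """Biweight (quartic) kernel, up to normalization: (1 - u^2)^2 on |u| <= 1."""
    return np.where(np.abs(u) <= 1, (1 - u**2) ** 2, 0.0)

def modal_linear_regression(X, y, h=1.0, n_iter=100):
    """IRLS-style modal linear regression: weight each point by the kernel value
    of its current residual, then refit by weighted least squares."""
    Xb = np.c_[np.ones(len(X)), X]                       # add intercept
    beta = np.linalg.lstsq(Xb, y, rcond=None)[0]         # OLS start
    for _ in range(n_iter):
        w = biweight((y - Xb @ beta) / h) + 1e-12        # avoid all-zero weights
        WX = Xb * w[:, None]
        beta = np.linalg.solve(Xb.T @ WX, WX.T @ y)      # weighted normal equations
    return beta

# Data whose conditional mode differs from its mean: a main linear trend
# plus a minority of large positive errors.
rng = np.random.default_rng(10)
x = rng.uniform(0, 5, 400)
noise = np.where(rng.uniform(size=400) < 0.8,
                 rng.normal(0, 0.2, 400), rng.normal(3, 0.5, 400))
y = 1.0 + 2.0 * x + noise
print("OLS:", np.linalg.lstsq(np.c_[np.ones(400), x], y, rcond=None)[0])
print("Modal:", modal_linear_regression(x[:, None], y))
```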
arXiv Detail & Related papers (2020-01-30T03:57:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.