Robust Principal Component Analysis: A Median of Means Approach
- URL: http://arxiv.org/abs/2102.03403v2
- Date: Thu, 20 Jul 2023 05:58:30 GMT
- Title: Robust Principal Component Analysis: A Median of Means Approach
- Authors: Debolina Paul, Saptarshi Chakraborty and Swagatam Das
- Abstract summary: Principal Component Analysis is a tool for data visualization, denoising, and dimensionality reduction.
Recent supervised learning methods have shown great success in dealing with outlying observations.
This paper proposes a PCA procedure based on the MoM principle.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Principal Component Analysis (PCA) is a fundamental tool for data
visualization, denoising, and dimensionality reduction. It is widely popular in
Statistics, Machine Learning, Computer Vision, and related fields. However, PCA
is well-known to fall prey to outliers and often fails to detect the true
underlying low-dimensional structure within the dataset. Following the Median
of Means (MoM) philosophy, recent supervised learning methods have shown great
success in dealing with outlying observations without much compromise to their
large sample theoretical properties. This paper proposes a PCA procedure based
on the MoM principle. Called the Median of Means Principal Component Analysis
(MoMPCA), the proposed
method is not only computationally appealing but also achieves optimal
convergence rates under minimal assumptions. In particular, we explore the
non-asymptotic error bounds of the obtained solution with the aid of
Rademacher complexities while making absolutely no assumptions about the outlying
observations. The derived concentration results are not dependent on the
dimension because the analysis is conducted in a separable Hilbert space, and
the results only depend on the fourth moment of the underlying distribution in
the corresponding norm. The proposal's efficacy is also thoroughly showcased
through simulations and real data applications.
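
The Median of Means idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the MoMPCA objective, so the code below assumes one plausible reading: split the data into blocks, average the squared reconstruction error within each block, and take the median across blocks. The names `median_of_means` and `mom_reconstruction_error` are hypothetical and not taken from the paper; only a single projection direction is handled, and no optimization over directions is attempted.

```python
import math
import random
import statistics


def median_of_means(values, n_blocks, seed=0):
    """Median-of-means estimate of the mean of `values`.

    The values are shuffled, split into `n_blocks` blocks of near-equal
    size, and the median of the block means is returned.
    """
    if not 1 <= n_blocks <= len(values):
        raise ValueError("n_blocks must be between 1 and len(values)")
    rng = random.Random(seed)
    shuffled = list(values)
    rng.shuffle(shuffled)
    # Strided slicing gives n_blocks disjoint blocks covering all values.
    blocks = [shuffled[i::n_blocks] for i in range(n_blocks)]
    return statistics.median(statistics.fmean(block) for block in blocks)


def mom_reconstruction_error(points, direction, n_blocks, seed=0):
    """Assumed MoM-style PCA loss for a single direction (illustrative only).

    For each point x, the squared error of projecting onto the unit vector v
    is ||x||^2 - (x . v)^2. The per-point errors are then aggregated with
    median_of_means instead of a plain average.
    """
    norm = math.sqrt(sum(c * c for c in direction))
    unit = [c / norm for c in direction]
    errors = []
    for x in points:
        proj = sum(a * b for a, b in zip(x, unit))
        errors.append(sum(a * a for a in x) - proj * proj)
    return median_of_means(errors, n_blocks, seed)
```

With an odd number of blocks, a single gross outlier can contaminate at most one block mean, so the median across blocks is unaffected by it, whereas a plain average of the errors would be dominated by that one point.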