Closed-Form, Provable, and Robust PCA via Leverage Statistics and
Innovation Search
- URL: http://arxiv.org/abs/2106.12190v1
- Date: Wed, 23 Jun 2021 06:36:36 GMT
- Title: Closed-Form, Provable, and Robust PCA via Leverage Statistics and
Innovation Search
- Authors: Mostafa Rahmani and Ping Li
- Abstract summary: We study the Innovation Values computed by the Innovation Search algorithm under a quadratic cost function.
It is proved that Innovation Values with the new cost function are equivalent to Leverage Scores.
This interesting connection is utilized to establish several theoretical guarantees for a Leverage Score based robust PCA method.
- Score: 25.229137979402584
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The idea of Innovation Search, which was initially proposed for data
clustering, was recently used for outlier detection. In the application of
Innovation Search for outlier detection, the directions of innovation were
utilized to measure the innovation of the data points. We study the Innovation
Values computed by the Innovation Search algorithm under a quadratic cost
function and it is proved that Innovation Values with the new cost function are
equivalent to Leverage Scores. This interesting connection is utilized to
establish several theoretical guarantees for a Leverage Score based robust PCA
method and to design a new robust PCA method. The theoretical results include
performance guarantees with different models for the distribution of outliers
and the distribution of inliers. In addition, we demonstrate the robustness of
the algorithms against the presence of noise. The numerical and theoretical
studies indicate that while the presented approach is fast and closed-form, it
can outperform most of the existing algorithms.
Related papers
- A Data Balancing and Ensemble Learning Approach for Credit Card Fraud Detection [1.8921747725821432]
This research introduces an innovative method for identifying credit card fraud by combining the SMOTE-KMEANS technique with an ensemble machine learning model.
The proposed model was benchmarked against traditional models such as logistic regression, decision trees, random forests, and support vector machines.
Results demonstrated that the proposed model achieved superior performance, with an AUC of 0.96 when combined with the SMOTE-KMEANS algorithm.
arXiv Detail & Related papers (2025-03-27T04:59:45Z) - QuantFactor REINFORCE: Mining Steady Formulaic Alpha Factors with Variance-bounded REINFORCE [5.560011325936085]
The goal of alpha factor mining is to discover indicative signals of investment opportunities from the historical financial market data of assets.
Recently, a promising framework is proposed for generating formulaic alpha factors using deep reinforcement learning.
arXiv Detail & Related papers (2024-09-08T15:57:58Z) - Deep Uncertainty-Based Explore for Index Construction and Retrieval in Recommendation System [14.45319171869111]
Balancing the relevance and novelty of matching results is a crucial step in the design and optimization of recommendation systems.
This paper proposes the UICR (Uncertainty-based explore for Index Construction and Retrieval) algorithm.
It introduces the concept of uncertainty modeling in the matching stage and achieves multi-task modeling of model uncertainty and index uncertainty.
arXiv Detail & Related papers (2024-07-22T02:27:57Z) - Neural Active Learning Beyond Bandits [69.99592173038903]
We study both stream-based and pool-based active learning with neural network approximations.
We propose two algorithms based on the newly designed exploitation and exploration neural networks for stream-based and pool-based active learning.
arXiv Detail & Related papers (2024-04-18T21:52:14Z) - Improving Sample Efficiency of Model-Free Algorithms for Zero-Sum Markov Games [66.2085181793014]
We show that a model-free stage-based Q-learning algorithm can enjoy the same optimality in the $H$ dependence as model-based algorithms.
Our algorithm features a key novel design of updating the reference value functions as the pair of optimistic and pessimistic value functions.
arXiv Detail & Related papers (2023-08-17T08:34:58Z) - STEERING: Stein Information Directed Exploration for Model-Based
Reinforcement Learning [111.75423966239092]
We propose an exploration incentive in terms of the integral probability metric (IPM) between a current estimate of the transition model and the unknown optimal.
Based on KSD, we develop a novel algorithm algo: textbfSTEin information dirtextbfEcted exploration for model-based textbfReinforcement LearntextbfING.
arXiv Detail & Related papers (2023-01-28T00:49:28Z) - New Machine Learning Techniques for Simulation-Based Inference:
InferoStatic Nets, Kernel Score Estimation, and Kernel Likelihood Ratio
Estimation [4.415977307120616]
We propose a machine-learning approach to model the score and likelihood ratio estimators in cases when the probability density can be sampled but not computed directly.
We introduce new strategies, respectively called Kernel Score Estimation (KSE) and Kernel Likelihood Ratio Estimation (KLRE) to learn the score and the likelihood ratio functions from simulated data.
arXiv Detail & Related papers (2022-10-04T15:22:56Z) - Risk Minimization from Adaptively Collected Data: Guarantees for
Supervised and Policy Learning [57.88785630755165]
Empirical risk minimization (ERM) is the workhorse of machine learning, but its model-agnostic guarantees can fail when we use adaptively collected data.
We study a generic importance sampling weighted ERM algorithm for using adaptively collected data to minimize the average of a loss function over a hypothesis class.
For policy learning, we provide rate-optimal regret guarantees that close an open gap in the existing literature whenever exploration decays to zero.
arXiv Detail & Related papers (2021-06-03T09:50:13Z) - SAMBA: Safe Model-Based & Active Reinforcement Learning [59.01424351231993]
SAMBA is a framework for safe reinforcement learning that combines aspects from probabilistic modelling, information theory, and statistics.
We evaluate our algorithm on a variety of safe dynamical system benchmarks involving both low and high-dimensional state representations.
We provide intuition as to the effectiveness of the framework by a detailed analysis of our active metrics and safety constraints.
arXiv Detail & Related papers (2020-06-12T10:40:46Z) - Stochastic-Sign SGD for Federated Learning with Theoretical Guarantees [49.91477656517431]
Quantization-based solvers have been widely adopted in Federated Learning (FL)
No existing methods enjoy all the aforementioned properties.
We propose an intuitively-simple yet theoretically-simple method based on SIGNSGD to bridge the gap.
arXiv Detail & Related papers (2020-02-25T15:12:15Z) - Inferential Induction: A Novel Framework for Bayesian Reinforcement
Learning [6.16852156844376]
We describe a novel framework, Inferential Induction, for correctly inferring value function distributions from data.
We experimentally demonstrate that the proposed algorithm is competitive with respect to the state of the art.
arXiv Detail & Related papers (2020-02-08T06:19:15Z) - Outlier Detection and Data Clustering via Innovation Search [27.107601048639637]
We present a new discovery that the directions of innovation can be used to design a robust PCA method.
The proposed approach, dubbed iSearch, uses the direction search optimization problem to compute an optimal direction corresponding to each data point.
arXiv Detail & Related papers (2019-12-30T16:29:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.