Related papers: Closed-Form, Provable, and Robust PCA via Leverage Statistics and Innovation Search

Closed-Form, Provable, and Robust PCA via Leverage Statistics and Innovation Search

URL: http://arxiv.org/abs/2106.12190v1
Date: Wed, 23 Jun 2021 06:36:36 GMT
Title: Closed-Form, Provable, and Robust PCA via Leverage Statistics and Innovation Search
Authors: Mostafa Rahmani and Ping Li
Abstract summary: We study the Innovation Values computed by the Innovation Search algorithm under a quadratic cost function. It is proved that Innovation Values with the new cost function are equivalent to Leverage Scores. This interesting connection is utilized to establish several theoretical guarantees for a Leverage Score based robust PCA method.
Score: 25.229137979402584
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The idea of Innovation Search, which was initially proposed for data clustering, was recently used for outlier detection. In the application of Innovation Search for outlier detection, the directions of innovation were utilized to measure the innovation of the data points. We study the Innovation Values computed by the Innovation Search algorithm under a quadratic cost function and it is proved that Innovation Values with the new cost function are equivalent to Leverage Scores. This interesting connection is utilized to establish several theoretical guarantees for a Leverage Score based robust PCA method and to design a new robust PCA method. The theoretical results include performance guarantees with different models for the distribution of outliers and the distribution of inliers. In addition, we demonstrate the robustness of the algorithms against the presence of noise. The numerical and theoretical studies indicate that while the presented approach is fast and closed-form, it can outperform most of the existing algorithms.

Related papers

A Data Balancing and Ensemble Learning Approach for Credit Card Fraud Detection [1.8921747725821432]
This research introduces an innovative method for identifying credit card fraud by combining the SMOTE-KMEANS technique with an ensemble machine learning model. The proposed model was benchmarked against traditional models such as logistic regression, decision trees, random forests, and support vector machines. Results demonstrated that the proposed model achieved superior performance, with an AUC of 0.96 when combined with the SMOTE-KMEANS algorithm.
arXiv Detail & Related papers (2025-03-27T04:59:45Z)
QuantFactor REINFORCE: Mining Steady Formulaic Alpha Factors with Variance-bounded REINFORCE [5.560011325936085]
The goal of alpha factor mining is to discover indicative signals of investment opportunities from the historical financial market data of assets. Recently, a promising framework is proposed for generating formulaic alpha factors using deep reinforcement learning.
arXiv Detail & Related papers (2024-09-08T15:57:58Z)
Deep Uncertainty-Based Explore for Index Construction and Retrieval in Recommendation System [14.45319171869111]
Balancing the relevance and novelty of matching results is a crucial step in the design and optimization of recommendation systems. This paper proposes the UICR (Uncertainty-based explore for Index Construction and Retrieval) algorithm. It introduces the concept of uncertainty modeling in the matching stage and achieves multi-task modeling of model uncertainty and index uncertainty.
arXiv Detail & Related papers (2024-07-22T02:27:57Z)
Neural Active Learning Beyond Bandits [69.99592173038903]
We study both stream-based and pool-based active learning with neural network approximations. We propose two algorithms based on the newly designed exploitation and exploration neural networks for stream-based and pool-based active learning.
arXiv Detail & Related papers (2024-04-18T21:52:14Z)
Improving Sample Efficiency of Model-Free Algorithms for Zero-Sum Markov Games [66.2085181793014]
We show that a model-free stage-based Q-learning algorithm can enjoy the same optimality in the $H$ dependence as model-based algorithms. Our algorithm features a key novel design of updating the reference value functions as the pair of optimistic and pessimistic value functions.
arXiv Detail & Related papers (2023-08-17T08:34:58Z)
STEERING: Stein Information Directed Exploration for Model-Based Reinforcement Learning [111.75423966239092]
We propose an exploration incentive in terms of the integral probability metric (IPM) between a current estimate of the transition model and the unknown optimal. Based on KSD, we develop a novel algorithm algo: textbfSTEin information dirtextbfEcted exploration for model-based textbfReinforcement LearntextbfING.
arXiv Detail & Related papers (2023-01-28T00:49:28Z)
New Machine Learning Techniques for Simulation-Based Inference: InferoStatic Nets, Kernel Score Estimation, and Kernel Likelihood Ratio Estimation [4.415977307120616]
We propose a machine-learning approach to model the score and likelihood ratio estimators in cases when the probability density can be sampled but not computed directly. We introduce new strategies, respectively called Kernel Score Estimation (KSE) and Kernel Likelihood Ratio Estimation (KLRE) to learn the score and the likelihood ratio functions from simulated data.
arXiv Detail & Related papers (2022-10-04T15:22:56Z)
Risk Minimization from Adaptively Collected Data: Guarantees for Supervised and Policy Learning [57.88785630755165]
Empirical risk minimization (ERM) is the workhorse of machine learning, but its model-agnostic guarantees can fail when we use adaptively collected data. We study a generic importance sampling weighted ERM algorithm for using adaptively collected data to minimize the average of a loss function over a hypothesis class. For policy learning, we provide rate-optimal regret guarantees that close an open gap in the existing literature whenever exploration decays to zero.
arXiv Detail & Related papers (2021-06-03T09:50:13Z)
SAMBA: Safe Model-Based & Active Reinforcement Learning [59.01424351231993]
SAMBA is a framework for safe reinforcement learning that combines aspects from probabilistic modelling, information theory, and statistics. We evaluate our algorithm on a variety of safe dynamical system benchmarks involving both low and high-dimensional state representations. We provide intuition as to the effectiveness of the framework by a detailed analysis of our active metrics and safety constraints.
arXiv Detail & Related papers (2020-06-12T10:40:46Z)
Stochastic-Sign SGD for Federated Learning with Theoretical Guarantees [49.91477656517431]
Quantization-based solvers have been widely adopted in Federated Learning (FL) No existing methods enjoy all the aforementioned properties. We propose an intuitively-simple yet theoretically-simple method based on SIGNSGD to bridge the gap.
arXiv Detail & Related papers (2020-02-25T15:12:15Z)
Inferential Induction: A Novel Framework for Bayesian Reinforcement Learning [6.16852156844376]
We describe a novel framework, Inferential Induction, for correctly inferring value function distributions from data. We experimentally demonstrate that the proposed algorithm is competitive with respect to the state of the art.
arXiv Detail & Related papers (2020-02-08T06:19:15Z)
Outlier Detection and Data Clustering via Innovation Search [27.107601048639637]
We present a new discovery that the directions of innovation can be used to design a robust PCA method. The proposed approach, dubbed iSearch, uses the direction search optimization problem to compute an optimal direction corresponding to each data point.
arXiv Detail & Related papers (2019-12-30T16:29:04Z)

This list is automatically generated from the titles and abstracts of the papers in this site.