Understanding the Generalization Performance of Spectral Clustering
Algorithms
- URL: http://arxiv.org/abs/2205.00281v1
- Date: Sat, 30 Apr 2022 14:21:56 GMT
- Title: Understanding the Generalization Performance of Spectral Clustering
Algorithms
- Authors: Shaojie Li, Sheng Ouyang and Yong Liu
- Abstract summary: We study the excess risk bounds of the popular spectral clustering algorithms: relaxed RatioCut and relaxed NCut.
We propose two novel algorithms that can not only penalize this quantity, but also cluster the out-of-sample data without re-eigendecomposition on the overall sample.
- Score: 11.025579607812167
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The theoretical analysis of spectral clustering mainly focuses on
consistency, while there is relatively little research on its generalization
performance. In this paper, we study the excess risk bounds of the popular
spectral clustering algorithms: \emph{relaxed} RatioCut and \emph{relaxed}
NCut. Firstly, we show that their excess risk bounds between the empirical
continuous optimal solution and the population-level continuous optimal
solution have a $\mathcal{O}(1/\sqrt{n})$ convergence rate, where $n$ is the
sample size. Secondly, we identify the fundamental quantity that governs the
excess risk between the empirical discrete optimal solution and the
population-level discrete optimal solution. At the empirical level, algorithms
can be designed to reduce this quantity. Based on our theoretical analysis, we
propose two novel algorithms that can not only penalize this quantity, but also
cluster the out-of-sample data without re-eigendecomposition on the overall
sample. Experiments verify the effectiveness of the proposed algorithms.
Related papers
- SOREL: A Stochastic Algorithm for Spectral Risks Minimization [1.6574413179773761]
Spectral risk has wide applications in machine learning, especially in real-world decision-making.
By assigning different weights to the losses of different sample points, it allows a model's performance to lie between the average performance and the worst-case performance.
We propose SOREL, the first gradient-based algorithm with convergence guarantees for spectral risk minimization.
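As a quick illustration of the objective (my sketch, not SOREL itself): a spectral risk is a weighted sum of sorted losses; with uniform weights it reduces to the mean, and with all mass on the largest loss it becomes the worst case.

```python
# Illustrative only: evaluating a spectral risk (weighted sum of sorted
# losses). SOREL itself is a stochastic minimization algorithm; this just
# shows the objective it targets.
import numpy as np

def spectral_risk(losses, weights):
    """weights must be nondecreasing on sorted losses and sum to 1, which
    places the risk between the mean and the maximum loss."""
    losses_sorted = np.sort(losses)              # ascending
    return float(np.dot(np.sort(weights), losses_sorted))

losses = np.array([0.2, 1.5, 0.7, 3.0])
n = len(losses)
print(spectral_risk(losses, np.full(n, 1 / n)))  # mean: 1.35
print(spectral_risk(losses, np.eye(n)[-1]))      # worst case: 3.0
```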
arXiv Detail & Related papers (2024-07-19T18:20:53Z)
- Best Arm Identification with Fixed Budget: A Large Deviation Perspective [54.305323903582845]
We present sred, a truly adaptive algorithm that can reject arms in any round based on the observed empirical gaps between the rewards of various arms.
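The gap-based rejection idea can be seen in a toy fixed-budget loop like the one below; the threshold rule here is ad hoc, not the paper's sred algorithm or its large-deviation analysis.

```python
# Toy sketch of gap-based arm rejection under a fixed budget; illustrative
# of the idea only, not the paper's sred algorithm or its thresholds.
import numpy as np

rng = np.random.default_rng(0)
true_means = np.array([0.3, 0.5, 0.7, 0.9])
active = list(range(len(true_means)))
pulls = np.zeros(len(true_means))
sums = np.zeros(len(true_means))

budget = 2000
for t in range(budget):
    a = active[t % len(active)]           # round-robin over surviving arms
    sums[a] += rng.binomial(1, true_means[a])
    pulls[a] += 1
    means = np.divide(sums, pulls, out=np.zeros_like(sums), where=pulls > 0)
    if len(active) > 1 and pulls[a] > 30:
        gaps = means[active].max() - means[active]
        worst = int(active[int(np.argmax(gaps))])
        # reject when the empirical gap is large relative to noise (ad hoc)
        if gaps.max() > 2.0 * np.sqrt(np.log(budget) / pulls[worst]):
            active.remove(worst)
print("best arm estimate:", active[int(np.argmax(means[active]))])
```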
arXiv Detail & Related papers (2023-12-19T13:17:43Z)
- Discretize Relaxed Solution of Spectral Clustering via a Non-Heuristic Algorithm [77.53604156112144]
We develop a first-order term to bridge the original problem and the discretization algorithm.
Since the non-heuristic method is aware of the original graph cut problem, the final discrete solution is more reliable.
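For contrast, the classical discretization baseline this line of work improves on is spectral rotation (Yu and Shi style): alternate between hard-assigning rows of the relaxed solution and re-fitting an orthogonal rotation. A minimal sketch, not the paper's non-heuristic algorithm:

```python
# Classical spectral-rotation discretization of a relaxed spectral
# clustering solution U (n x k); shown as the standard baseline, not the
# paper's method.
import numpy as np

def discretize(U, n_iter=50):
    n, k = U.shape
    U = U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)
    R = np.eye(k)                               # rotation, initialized to I
    for _ in range(n_iter):
        # 1) hard-assign each row to its best rotated column
        idx = np.argmax(U @ R, axis=1)
        Y = np.zeros((n, k))
        Y[np.arange(n), idx] = 1.0
        # 2) best rotation aligning U to Y (orthogonal Procrustes via SVD)
        P, _, Qt = np.linalg.svd(Y.T @ U, full_matrices=False)
        R_new = (P @ Qt).T
        if np.allclose(R_new, R):
            break
        R = R_new
    return idx
```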
arXiv Detail & Related papers (2023-10-19T13:57:38Z)
- Best Subset Selection in Reduced Rank Regression [1.4699455652461724]
We show that our algorithm can achieve the reduced-rank estimation with significant probability.
Numerical studies and an application to cancer studies demonstrate its effectiveness and scalability.
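For reference, classical reduced rank regression solves a rank-constrained least-squares problem by truncating an SVD; the paper adds best subset selection on top of this, which the sketch below does not attempt.

```python
# Minimal sketch of classical reduced rank regression; illustrative
# background, not the paper's best-subset algorithm.
import numpy as np

def reduced_rank_regression(X, Y, rank):
    # Unrestricted least-squares coefficient matrix
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    # Project fitted values onto their top-'rank' principal directions
    U, s, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
    V_r = Vt[:rank].T
    return B_ols @ V_r @ V_r.T            # rank-constrained coefficients

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
B_true = rng.normal(size=(10, 2)) @ rng.normal(size=(2, 6))   # rank 2
Y = X @ B_true + 0.1 * rng.normal(size=(200, 6))
B_hat = reduced_rank_regression(X, Y, rank=2)
print(np.linalg.matrix_rank(B_hat))       # 2
```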
arXiv Detail & Related papers (2022-11-29T02:51:15Z)
- Quantization-Based Optimization: Alternative Stochastic Approximation of Global Optimization [0.0]
We propose a global optimization algorithm based on quantizing the energy level of an objective function in an NP-hard problem.
Numerical experiments show that the proposed algorithm outperforms conventional learning methods in solving NP-hard optimization problems.
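One way to read the quantization idea is that coarse energy levels flatten small local ripples so a simple stochastic search can drift across them; the toy below is my illustrative reading, with an ad hoc schedule, not the paper's algorithm.

```python
# Toy illustration: accept moves whose quantized objective (energy) level
# does not increase; ties allow sideways moves across shallow local minima.
# Illustrative reading only, not the paper's algorithm.
import numpy as np

def f(x):                        # rippled objective, global basin near x = 0
    return x**2 + 0.5 * np.sin(20 * x)

def quantized_search(h0=0.5, steps=5000, seed=0):
    rng = np.random.default_rng(seed)
    x = 4.0
    for t in range(steps):
        cand = x + rng.normal(scale=0.3)
        h = h0 / (1 + t / 1000)  # quantization step shrinks over time
        if np.floor(f(cand) / h) <= np.floor(f(x) / h):
            x = cand
    return x

print(quantized_search())        # typically lands in the global basin near 0
```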
arXiv Detail & Related papers (2022-11-08T03:01:45Z)
- Exploring the Algorithm-Dependent Generalization of AUPRC Optimization with List Stability [107.65337427333064]
Optimization of the Area Under the Precision-Recall Curve (AUPRC) is a crucial problem in machine learning.
In this work, we present the first trial on the algorithm-dependent generalization of AUPRC optimization.
Experiments on three image retrieval datasets speak to the effectiveness and soundness of our framework.
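For reference, a from-scratch average-precision computation (the usual finite-sample estimate of AUPRC); the paper studies the generalization of methods that optimize this quantity, not this evaluation code.

```python
# Average precision as the empirical estimate of AUPRC.
import numpy as np

def average_precision(scores, labels):
    order = np.argsort(-scores)                  # rank by descending score
    hits = labels[order].astype(float)
    precision_at_k = np.cumsum(hits) / np.arange(1, len(hits) + 1)
    return float((precision_at_k * hits).sum() / max(hits.sum(), 1))

scores = np.array([0.9, 0.8, 0.7, 0.6, 0.5])
labels = np.array([1, 0, 1, 0, 0])
print(average_precision(scores, labels))         # (1/1 + 2/3) / 2 = 0.833...
```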
arXiv Detail & Related papers (2022-09-27T09:06:37Z)
- First-Order Algorithms for Nonlinear Generalized Nash Equilibrium Problems [88.58409977434269]
We consider the problem of computing an equilibrium in a class of nonlinear generalized Nash equilibrium problems (NGNEPs).
Our contribution is to provide two simple first-order algorithmic frameworks based on the quadratic penalty method and the augmented Lagrangian method.
We provide nonasymptotic theoretical guarantees for these algorithms.
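A minimal sketch of the quadratic-penalty idea on a toy two-player game with a shared constraint (my toy setup; the paper's frameworks and guarantees cover a far more general class):

```python
# Toy quadratic-penalty scheme for a two-player game with the shared
# constraint x + y <= 1; each player descends its penalized cost while the
# penalty weight grows. Illustrative only, not the paper's NGNEP framework.
import numpy as np

def penalized_grads(x, y, rho):
    viol = max(x + y - 1.0, 0.0)          # shared-constraint violation
    gx = 2.0 * (x - 2.0) + rho * viol     # player 1 cost: (x - 2)^2
    gy = 2.0 * (y - 2.0) + rho * viol     # player 2 cost: (y - 2)^2
    return gx, gy

x = y = 0.0
for k in range(2000):
    rho = 1.0 + 0.05 * k                  # slowly increasing penalty weight
    lr = 1.0 / (2.0 + 2.0 * rho)          # step size shrinks for stability
    gx, gy = penalized_grads(x, y, rho)
    x, y = x - lr * gx, y - lr * gy
print(round(x, 2), round(y, 2))           # ~(0.51, 0.51), nearing (0.5, 0.5)
```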
arXiv Detail & Related papers (2022-04-07T00:11:05Z)
- Amortized Implicit Differentiation for Stochastic Bilevel Optimization [53.12363770169761]
We study a class of algorithms for solving bilevel optimization problems in both deterministic and stochastic settings.
We exploit a warm-start strategy to amortize the estimation of the exact gradient.
By using this framework, our analysis shows these algorithms to match the computational complexity of methods that have access to an unbiased estimate of the gradient.
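The general recipe can be seen on a quadratic toy problem: an inexact, warm-started inner solve followed by an implicit-function-theorem hypergradient. This is a sketch of the recipe, not the paper's amortized algorithms or its complexity analysis.

```python
# Implicit-differentiation hypergradient with a warm-started inner solver.
import numpy as np

# Lower level: y*(x) = argmin_y 0.5*y'Ay - x'y, i.e. A y* = x.
# Upper level: f(x) = 0.5 * ||y*(x) - t||^2.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
t = np.array([1.0, -1.0])

def hypergradient(x, y_warm, inner_steps=50, lr=0.2):
    y = y_warm.copy()
    for _ in range(inner_steps):          # inexact inner solve, warm-started
        y -= lr * (A @ y - x)
    # Implicit function theorem: dy*/dx = A^{-1} (A symmetric), so
    # grad f = A^{-1}(y - t); solve the linear system instead of inverting.
    return np.linalg.solve(A, y - t), y   # return y to warm-start next call

x, y = np.zeros(2), np.zeros(2)
for _ in range(200):                      # outer gradient descent
    g, y = hypergradient(x, y)
    x -= 1.0 * g
print(np.linalg.solve(A, x))              # y*(x) approaches t = (1, -1)
```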
arXiv Detail & Related papers (2021-11-29T15:10:09Z)
- Sharp global convergence guarantees for iterative nonconvex optimization: A Gaussian process perspective [30.524043513721168]
We develop a general recipe for analyzing the convergence of iterative algorithms for a class of regression models.
We accurately capture both the convergence rate of the algorithm and the eventual error floor in the finite-sample regime.
We show sharp convergence rates for both higher-order algorithms based on alternating updates and first-order algorithms based on subgradient descent.
arXiv Detail & Related papers (2021-09-20T21:48:19Z)
- Towards Optimal Problem Dependent Generalization Error Bounds in Statistical Learning Theory [11.840747467007963]
We study problem-dependent rates that scale near-optimally with the variance, the effective loss errors, or the norms evaluated at the "best gradient hypothesis".
We introduce a principled framework dubbed "uniform localized convergence".
We show that our framework resolves several fundamental limitations of existing uniform convergence and localization analysis approaches.
arXiv Detail & Related papers (2020-11-12T04:07:29Z)
- Optimal Randomized First-Order Methods for Least-Squares Problems [56.05635751529922]
This class of algorithms encompasses several randomized methods among the fastest solvers for least-squares problems.
We focus on two classical embeddings, namely, Gaussian projections and subsampled Hadamard transforms.
Our resulting algorithm yields the best complexity known for solving least-squares problems with no condition number dependence.
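As a concrete member of this class, here is a sketch-and-precondition solver with a Gaussian embedding (Blendenpik-style preconditioned LSQR); the sketch size and tolerances are illustrative assumptions, not the paper's optimal variant.

```python
# Sketch-and-precondition least squares with a Gaussian embedding: an R
# factor from the sketched matrix preconditions LSQR on the full problem.
import numpy as np
from scipy.sparse.linalg import lsqr, LinearOperator

rng = np.random.default_rng(0)
n, d = 4000, 50
A = rng.normal(size=(n, d)) @ np.diag(np.logspace(0, 4, d))  # ill-conditioned
b = rng.normal(size=n)

m = 4 * d                                   # sketch size, a few times d
S = rng.normal(size=(m, n)) / np.sqrt(m)    # Gaussian embedding
_, R = np.linalg.qr(S @ A)                  # R from the sketched matrix

# Solve min ||A R^{-1} z - b||, then recover x = R^{-1} z; the sketch makes
# A R^{-1} well-conditioned, so LSQR converges in few iterations.
op = LinearOperator(
    (n, d),
    matvec=lambda z: A @ np.linalg.solve(R, z),
    rmatvec=lambda w: np.linalg.solve(R.T, A.T @ w),
)
z = lsqr(op, b, atol=1e-10, btol=1e-10)[0]
x = np.linalg.solve(R, z)
print(np.linalg.norm(A @ x - b))            # least-squares residual at x
```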
arXiv Detail & Related papers (2020-02-21T17:45:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.