Stability-based Generalization Analysis of Randomized Coordinate Descent for Pairwise Learning
- URL: http://arxiv.org/abs/2503.01530v1
- Date: Mon, 03 Mar 2025 13:39:06 GMT
- Title: Stability-based Generalization Analysis of Randomized Coordinate Descent for Pairwise Learning
- Authors: Liang Wu, Ruixi Hu, Yunwen Lei
- Abstract summary: We consider the generalization of RCD for pairwise learning. We measure the on-average argument stability for both convex and strongly convex objective functions.
- Score: 30.557503207329965
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pairwise learning includes various machine learning tasks, with ranking and metric learning serving as the primary representatives. While randomized coordinate descent (RCD) is popular in various learning problems, there is much less theoretical analysis of the generalization behavior of models trained by RCD, especially under the pairwise learning framework. In this paper, we consider the generalization of RCD for pairwise learning. We measure the on-average argument stability for both convex and strongly convex objective functions, based on which we develop generalization bounds in expectation. The early-stopping strategy is adopted to quantify the balance between estimation and optimization. Our analysis further incorporates the low-noise setting into the excess risk bound to achieve an optimistic bound of $O(1/n)$, where $n$ is the sample size.
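To make the setting concrete, below is a minimal Python sketch of RCD applied to a pairwise objective with early stopping. The pairwise squared loss, the linear model, the constant step size `eta`, and the stopping time `T` are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

def partial_pairwise_risk(w, X, y, k):
    """Partial derivative along coordinate k of the empirical pairwise risk
    R(w) = average over pairs (i, j) of (w.(x_i - x_j) - (y_i - y_j))**2,
    an illustrative smooth convex pairwise loss."""
    n, grad_k = X.shape[0], 0.0
    for i in range(n):
        for j in range(i + 1, n):
            dx = X[i] - X[j]
            grad_k += 2.0 * (w @ dx - (y[i] - y[j])) * dx[k]
    return grad_k / (n * (n - 1) / 2)

def rcd_pairwise(X, y, eta=0.05, T=200, seed=0):
    """Randomized coordinate descent: at each step, pick one coordinate
    uniformly at random and move along its partial gradient. Running for
    only T steps plays the role of the early-stopping strategy."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(T):
        k = int(rng.integers(X.shape[1]))  # uniformly random coordinate
        w[k] -= eta * partial_pairwise_risk(w, X, y, k)
    return w
```

In the stability-based analysis, the stopping time T is chosen to balance the optimization error, which shrinks with T, against the generalization gap, which grows with T.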
Related papers
- Stochastic Q-learning for Large Discrete Action Spaces [79.1700188160944]
In complex environments with discrete action spaces, effective decision-making is critical in reinforcement learning (RL).
We present value-based RL approaches which, as opposed to optimizing over the entire set of $n$ actions, only consider a variable set of actions, possibly as small as $\mathcal{O}(\log n)$.
The presented value-based RL methods include, among others, Stochastic Q-learning, StochDQN, and StochDDQN, all of which integrate this stochastic approach for both value-function updates and action selection.
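As a rough illustration of the subset-max idea, here is a hedged sketch under assumed names; the paper's actual estimator also reuses previously selected actions and differs in detail:

```python
import math
import random
from collections import defaultdict

def stoch_max_action(Q, state, actions, rng):
    """Scan only a random subset of about log2(n) actions instead of
    all n when looking for a (near-)greedy action."""
    k = max(1, math.ceil(math.log2(len(actions))))
    return max(rng.sample(actions, k), key=lambda a: Q[(state, a)])

def stoch_q_step(Q, s, a, r, s_next, actions, rng, alpha=0.1, gamma=0.99):
    """Tabular Q-learning update whose bootstrap target uses the
    sampled-subset max rather than a full max over all actions."""
    a_next = stoch_max_action(Q, s_next, actions, rng)
    Q[(s, a)] += alpha * (r + gamma * Q[(s_next, a_next)] - Q[(s, a)])

Q = defaultdict(float)   # state-action values, default 0
rng = random.Random(0)
```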
arXiv Detail & Related papers (2024-05-16T17:58:44Z) - Stability and Generalization of the Decentralized Stochastic Gradient Descent Ascent Algorithm [80.94861441583275]
We investigate the generalization bound of the decentralized stochastic gradient descent ascent (D-SGDA) algorithm.
Our results analyze the impact of different topologies on the generalization of D-SGDA.
We also balance the optimization error with the generalization gap to obtain the optimal population risk in the convex-concave setting.
arXiv Detail & Related papers (2023-10-31T11:27:01Z) - Stability and Generalization of Stochastic Compositional Gradient Descent Algorithms [61.59448949684493]
We provide the stability and generalization analysis of stochastic compositional gradient descent algorithms built from training examples.
We establish the uniform stability results for two popular stochastic compositional gradient descent algorithms, namely SCGD and SCSC.
We derive dimension-independent excess risk bounds for SCGD and SCSC by trading off their stability results and optimization errors.
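For context, a minimal sketch of the SCGD update pattern for minimizing a composition f(E[g(w; z)]): a two-timescale scheme that tracks a running estimate u of the inner expectation and takes a chained gradient step through it. The constant step sizes and the callback interface are assumptions for illustration; SCSC differs mainly in how the inner estimate u is corrected between iterates.

```python
import numpy as np

def scgd(grad_f, g, jac_g, sample, w0, u0, eta=0.01, beta=0.1, T=1000, seed=0):
    """Stochastic compositional gradient descent for min_w f(E[g(w; z)]):
        u <- (1 - beta) * u + beta * g(w; z)      # track the inner expectation
        w <- w - eta * J_g(w; z)^T grad_f(u)      # chain-rule step through u
    Constant step sizes here; the analysis uses decaying schedules."""
    rng = np.random.default_rng(seed)
    w, u = np.array(w0, float), np.array(u0, float)
    for _ in range(T):
        z = sample(rng)                # draw one training example
        u = (1 - beta) * u + beta * g(w, z)
        w = w - eta * jac_g(w, z).T @ grad_f(u)
    return w
```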
arXiv Detail & Related papers (2023-07-07T02:40:09Z) - On the Stability and Generalization of Triplet Learning [55.75784102837832]
Triplet learning, i.e. learning from triplet data, has attracted much attention in computer vision tasks.
This paper investigates the generalization guarantees of triplet learning by leveraging the stability analysis.
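Triplet learning typically minimizes a margin loss over (anchor, positive, negative) triples; the standard hinge form below is generic background, not this paper's specific objective.

```python
import numpy as np

def triplet_margin_loss(f_a, f_p, f_n, margin=1.0):
    """Standard triplet loss: push the anchor-negative distance to exceed
    the anchor-positive distance by at least `margin`."""
    d_ap = np.sum((f_a - f_p) ** 2)   # squared distance to the positive
    d_an = np.sum((f_a - f_n) ** 2)   # squared distance to the negative
    return max(0.0, d_ap - d_an + margin)
```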
arXiv Detail & Related papers (2023-02-20T07:32:50Z) - A Rationale-Centric Framework for Human-in-the-loop Machine Learning [12.793695970529138]
We present a novel rationale-centric framework with human-in-the-loop: Rationales-centric Double-robustness Learning (RDL).
RDL exploits rationales (i.e. phrases that cause the prediction), human interventions and semi-factual augmentations to decouple spurious associations and bias models towards generally applicable underlying distributions.
arXiv Detail & Related papers (2022-03-24T08:12:57Z) - A Unified Framework for Multi-distribution Density Ratio Estimation [101.67420298343512]
Binary density ratio estimation (DRE) provides the foundation for many state-of-the-art machine learning algorithms.
We develop a general framework from the perspective of Bregman divergence minimization.
We show that our framework leads to methods that strictly generalize their counterparts in binary DRE.
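For background, one standard way to cast binary DRE as Bregman divergence minimization (a known formulation assumed here, not quoted from this paper): for a convex generator $f$, minimizing the Bregman divergence between the true ratio $r^*(x) = p(x)/q(x)$ and a model $r(x)$ reduces, up to an additive constant, to an objective estimable from samples of $p$ and $q$:

```latex
\min_{r}\;
\mathbb{E}_{x \sim q}\big[\partial f(r(x))\, r(x) - f(r(x))\big]
- \mathbb{E}_{x \sim p}\big[\partial f(r(x))\big]
```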
arXiv Detail & Related papers (2021-12-07T01:23:20Z) - Learning Rates for Nonconvex Pairwise Learning [7.33244617309908]
Pairwise learning is receiving increasing attention since it covers many important learning tasks, such as ranking and metric learning.
Motivated by this, we provide improved learning rates for nonconvex pairwise learning.
arXiv Detail & Related papers (2021-11-09T16:12:20Z) - Stability and Generalization for Randomized Coordinate Descent [19.687456295228156]
There is no work studying how the models trained by RCD would generalize to test examples.
In this paper, we initiate the generalization analysis of RCD by leveraging the powerful tool of algorithmic stability.
Our analysis shows that RCD enjoys better stability as compared to gradient descent.
arXiv Detail & Related papers (2021-08-17T02:52:50Z) - Improved Learning Rates for Stochastic Optimization: Two Theoretical Viewpoints [7.33244617309908]
The generalization performance of stochastic optimization occupies a central place in machine learning.
In this paper, we investigate excess risk bounds, working toward improved learning rates for two popular approaches to stochastic optimization.
Motivated by these problems, we aim to provide improved rates under milder assumptions in convex learning and derive faster rates in nonconvex learning.
arXiv Detail & Related papers (2021-07-19T08:46:14Z) - Self-supervised Representation Learning with Relative Predictive Coding [102.93854542031396]
Relative Predictive Coding (RPC) is a new contrastive representation learning objective.
RPC maintains a good balance among training stability, minibatch size sensitivity, and downstream task performance.
We empirically verify the effectiveness of RPC on benchmark vision and speech self-supervised learning tasks.
arXiv Detail & Related papers (2021-03-21T01:04:24Z)