On the Stability and Generalization of Triplet Learning
- URL: http://arxiv.org/abs/2302.09815v1
- Date: Mon, 20 Feb 2023 07:32:50 GMT
- Title: On the Stability and Generalization of Triplet Learning
- Authors: Jun Chen, Hong Chen, Xue Jiang, Bin Gu, Weifu Li, Tieliang Gong, Feng
Zheng
- Abstract summary: Triplet learning, i.e. learning from triplet data, has attracted much attention in computer vision tasks.
This paper investigates the generalization guarantees of triplet learning by leveraging the stability analysis.
- Score: 55.75784102837832
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Triplet learning, i.e. learning from triplet data, has attracted much
attention in computer vision tasks with an extremely large number of
categories, e.g., face recognition and person re-identification. Despite rapid
progress in designing and applying triplet learning algorithms, the theoretical
understanding of their generalization performance is still lacking. To fill this
gap, this paper investigates the generalization guarantees of triplet learning
by leveraging stability analysis.
Specifically, we establish the first general high-probability generalization
bound for the triplet learning algorithm satisfying the uniform stability, and
then obtain the excess risk bounds of the order $O(n^{-\frac{1}{2}} \log n)$
for both stochastic gradient descent (SGD) and regularized risk
minimization (RRM), where $2n$ is approximately equal to the number of training
samples. Moreover, an optimistic generalization bound in expectation as fast as
$O(n^{-1})$ is derived for RRM in a low noise case via the on-average stability
analysis. Finally, our results are applied to triplet metric learning to
characterize its theoretical underpinning.
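To make the analyzed setting concrete, below is a minimal sketch of triplet metric learning with a margin-based triplet hinge loss on a linear embedding, trained by SGD on an L2-regularized empirical risk (an RRM-style objective). The linear embedding, helper names, and hyperparameters are illustrative assumptions and are not taken from the paper.
```python
import numpy as np

def triplet_loss_and_grad(W, anchor, positive, negative, margin=1.0):
    """Margin-based triplet hinge loss for a linear embedding x -> W @ x,
    together with its gradient with respect to W."""
    vp = anchor - positive
    vn = anchor - negative
    d_ap = np.sum((W @ vp) ** 2)  # squared anchor-positive distance
    d_an = np.sum((W @ vn) ** 2)  # squared anchor-negative distance
    loss = margin + d_ap - d_an
    if loss <= 0.0:
        return 0.0, np.zeros_like(W)
    # d/dW ||W v||^2 = 2 W v v^T, applied to the active hinge
    grad = 2.0 * (W @ (np.outer(vp, vp) - np.outer(vn, vn)))
    return loss, grad

def sgd_triplet_metric_learning(triplets, dim, embed_dim=8,
                                lr=1e-2, reg=1e-3, epochs=20, seed=0):
    """One SGD pass per epoch over the triplets; reg > 0 adds the L2 penalty
    of a regularized-risk-minimization style objective."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.1, size=(embed_dim, dim))
    for _ in range(epochs):
        for idx in rng.permutation(len(triplets)):
            a, p, n = triplets[idx]
            _, grad = triplet_loss_and_grad(W, a, p, n)
            W -= lr * (grad + reg * W)
    return W

# Toy usage: two Gaussian classes in 5 dimensions.
rng = np.random.default_rng(1)
x0 = rng.normal(size=(20, 5))          # class 0
x1 = rng.normal(size=(20, 5)) + 2.0    # class 1
triplets = [(x0[i], x0[j], x1[k])      # (anchor, positive, negative)
            for i in range(5) for j in range(5) for k in range(5) if i != j]
W = sgd_triplet_metric_learning(triplets, dim=5)
```
In this sketch each update touches a single triplet; the stability arguments in the paper bound how much the learned parameters can change when a training sample is replaced, which in turn controls the generalization gap.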
Related papers
- Tri-Level Navigator: LLM-Empowered Tri-Level Learning for Time Series OOD Generalization [9.95894026392039]
We investigate time series OOD generalization via pre-trained Large Language Models.
We first propose a novel tri-level learning framework for time series OOD generalization, termed TTSO.
We develop a stratified localization algorithm tailored for this tri-level optimization problem, theoretically demonstrating the guaranteed convergence of the proposed algorithm.
arXiv Detail & Related papers (2024-10-09T16:00:21Z) - Stability and Generalization for Stochastic Recursive Momentum-based Algorithms for (Strongly-)Convex One to $K$-Level Stochastic Optimizations [20.809499420384256]
STORM-based algorithms have been widely developed to solve one to $K$-level ($K \geq 3$) optimization problems.
This paper provides a comprehensive analysis of three representative STORM-based algorithms.
arXiv Detail & Related papers (2024-07-07T07:07:04Z) - Towards Understanding the Generalizability of Delayed Stochastic
Gradient Descent [63.43247232708004]
Stochastic gradient descent performed in an asynchronous manner plays a crucial role in training large-scale machine learning models.
Existing generalization error bounds are rather pessimistic and cannot reveal the correlation between asynchronous delays and generalization.
Our theoretical results indicate that asynchronous delays reduce the generalization error of the delayed SGD algorithm.
arXiv Detail & Related papers (2023-08-18T10:00:27Z) - Stability and Generalization of Stochastic Compositional Gradient
Descent Algorithms [61.59448949684493]
We provide the stability and generalization analysis of stochastic compositional gradient descent algorithms built from training examples.
We establish the uniform stability results for two popular compositional gradient descent algorithms, namely SCGD and SCSC.
We derive excess risk bounds for SCGD and SCSC by trading off their stability results and optimization errors.
arXiv Detail & Related papers (2023-07-07T02:40:09Z) - Understanding Generalization of Federated Learning via Stability:
Heterogeneity Matters [1.4502611532302039]
Generalization performance is a key metric in evaluating machine learning models when applied to real-world applications.
arXiv Detail & Related papers (2023-06-06T16:12:35Z) - Generalization Analysis for Contrastive Representation Learning [80.89690821916653]
Existing generalization error bounds depend linearly on the number $k$ of negative examples.
We establish novel generalization bounds for contrastive learning which do not depend on $k$, up to logarithmic terms.
arXiv Detail & Related papers (2023-02-24T01:03:56Z) - Benign Underfitting of Stochastic Gradient Descent [72.38051710389732]
We study to what extent stochastic gradient descent (SGD) may be understood as a "conventional" learning rule that achieves generalization performance by obtaining a good fit to the training data.
We analyze the closely related with-replacement SGD, for which an analogous phenomenon does not occur and prove that its population risk does in fact converge at the optimal rate.
arXiv Detail & Related papers (2022-02-27T13:25:01Z) - Learning Rates for Nonconvex Pairwise Learning [7.33244617309908]
Pairwise learning is receiving increasing attention since it covers many important learning tasks.
Motivated by this, we provide improved learning rates for nonconvex pairwise learning.
arXiv Detail & Related papers (2021-11-09T16:12:20Z) - Learning with Multiclass AUC: Theory and Algorithms [141.63211412386283]
Area under the ROC curve (AUC) is a well-known ranking metric for problems such as imbalanced learning and recommender systems.
In this paper, we start an early trial to consider the problem of learning multiclass scoring functions via optimizing multiclass AUC metrics.
arXiv Detail & Related papers (2021-07-28T05:18:10Z) - Improved Learning Rates for Stochastic Optimization: Two Theoretical
Viewpoints [7.33244617309908]
Generalization performance of stochastic optimization occupies a central place in machine learning.
In this paper, we investigate the excess risk and work towards improved learning rates for two popular approaches to stochastic optimization.
Motivated by these problems, we aim to provide improved rates under milder assumptions in convex learning and derive faster rates in nonconvex learning.
arXiv Detail & Related papers (2021-07-19T08:46:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.