Finite-Sum Coupled Compositional Stochastic Optimization: Theory and
Applications
- URL: http://arxiv.org/abs/2202.12396v7
- Date: Mon, 12 Jun 2023 17:11:56 GMT
- Title: Finite-Sum Coupled Compositional Stochastic Optimization: Theory and
Applications
- Authors: Bokun Wang and Tianbao Yang
- Abstract summary: This paper provides a comprehensive convergence analysis of a simple stochastic algorithm for both non-convex and convex objectives.
Our analysis also yields new insights for improving the practical implementation by sampling batches of equal size for the outer and inner levels.
- Score: 43.48388050033774
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper studies stochastic optimization for a sum of compositional
functions, where the inner-level function of each summand is coupled with the
corresponding summation index. We refer to this family of problems as
finite-sum coupled compositional optimization (FCCO). It has broad applications
in machine learning for optimizing non-convex or convex compositional
measures/objectives such as average precision (AP), p-norm push, listwise
ranking losses, neighborhood component analysis (NCA), deep survival analysis,
deep latent variable models, etc., which deserve finer analysis. Yet, existing
algorithms and analyses are restricted in one aspect or another. The
contribution of this paper is to provide a comprehensive convergence analysis
of a simple stochastic algorithm for both non-convex and convex objectives. Our
key result is the improved oracle complexity with the parallel speed-up by
using the moving-average based estimator with mini-batching. Our theoretical
analysis also yields new insights for improving the practical implementation
by sampling batches of equal size for the outer and inner levels. Numerical
experiments on AP maximization, NCA, and p-norm push corroborate some aspects
of the theory.
Related papers
- Nonconvex Federated Learning on Compact Smooth Submanifolds With Heterogeneous Data [23.661713049508375]
We propose an algorithm that learns over a compact smooth submanifold in the federated setting with heterogeneous client data.
We show that the proposed algorithm converges sublinearly to a neighborhood of a first-order optimal solution by using a novel analysis.
arXiv Detail & Related papers (2024-06-12T17:53:28Z) - Federated Conditional Stochastic Optimization [110.513884892319]
Conditional stochastic optimization has found applications in a wide range of machine learning tasks, such as invariant learning, AUPRC maximization, and MAML.
This paper proposes conditional stochastic optimization algorithms for distributed federated learning.
arXiv Detail & Related papers (2023-10-04T01:47:37Z) - Linearization Algorithms for Fully Composite Optimization [61.20539085730636]
This paper studies first-order algorithms for solving fully composite optimization problems over convex compact sets.
We leverage the structure of the objective by handling the differentiable and non-differentiable parts separately, linearizing only the smooth parts.
arXiv Detail & Related papers (2023-02-24T18:41:48Z) - A theoretical and empirical study of new adaptive algorithms with
additional momentum steps and shifted updates for stochastic non-convex
optimization [0.0]
It is thought that adaptive optimization algorithms represent a key pillar behind the success of the deep learning field.
In this paper we introduce adaptive momentum techniques for different non-smooth objective problems.
arXiv Detail & Related papers (2021-10-16T09:47:57Z) - Efficient Methods for Structured Nonconvex-Nonconcave Min-Max
Optimization [98.0595480384208]
We propose a generalization of the extragradient method which converges to a stationary point.
The algorithm applies not only to Euclidean spaces, but also to general $\ell_p$-normed finite-dimensional vector spaces.
arXiv Detail & Related papers (2020-10-31T21:35:42Z) - Bilevel Optimization: Convergence Analysis and Enhanced Design [63.64636047748605]
Bilevel optimization is a tool for many machine learning problems.
We propose a novel, sample-efficient stochastic gradient estimator named stocBiO.
arXiv Detail & Related papers (2020-10-15T18:09:48Z) - Convergence of Meta-Learning with Task-Specific Adaptation over Partial
Parameters [152.03852111442114]
Although model-agnostic meta-learning (MAML) is a very successful algorithm in meta-learning practice, it can have high computational complexity.
Our paper shows that such complexity can significantly affect the overall convergence performance of ANIL.
arXiv Detail & Related papers (2020-06-16T19:57:48Z) - Stochastic Recursive Variance Reduction for Efficient Smooth Non-Convex
Compositional Optimization [20.410012564838933]
Compositional optimization arises in many important machine learning tasks such as value function evaluation in reinforcement learning and portfolio management.
In this paper, we investigate general compositional optimization in the smooth non-convex setting.
arXiv Detail & Related papers (2019-12-31T18:59:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.