Efficient Optimization Methods for Extreme Similarity Learning with
Nonlinear Embeddings
- URL: http://arxiv.org/abs/2010.13511v2
- Date: Tue, 15 Jun 2021 11:53:13 GMT
- Title: Efficient Optimization Methods for Extreme Similarity Learning with
Nonlinear Embeddings
- Authors: Bowen Yuan, Yu-Sheng Li, Pengrui Quan, Chih-Jen Lin
- Abstract summary: We study the problem of learning similarity by using nonlinear embedding models from all possible pairs.
This paper aims to extend results developed for linear embeddings to general nonlinear embeddings.
- Score: 12.119973339250848
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of learning similarity by using nonlinear embedding
models (e.g., neural networks) from all possible pairs. This problem is well
known for the difficulty of training with an extreme number of pairs. For
the special case of using linear embeddings, many studies have addressed this
issue of handling all pairs by considering certain loss functions and
developing efficient optimization algorithms. This paper aims to extend results
for general nonlinear embeddings. First, we complete detailed derivations and
provide clean formulations for efficiently calculating the building blocks of
optimization algorithms, such as the function value, the gradient, and
Hessian-vector products. These results enable the use of many optimization
methods for extreme similarity learning with nonlinear embeddings. Second, we study
some optimization methods in detail. Because nonlinear embeddings are used,
implementation issues that differ from the linear case are addressed. In the end,
some methods are shown to be highly efficient for extreme similarity learning
with nonlinear embeddings.
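To make the computational issue concrete, below is a minimal, hedged sketch of one common all-pairs objective: a squared loss with target zero over every one of the m*n pairs, weighted by a constant, plus a correction on the observed pairs. The dense term is evaluated through the standard Gram-matrix identity sum_{i,j} (p_i^T q_j)^2 = trace((P^T P)(Q^T Q)), which costs O((m+n)k^2) instead of O(mnk), and the gradient and Hessian-vector product come from (double) backpropagation. This illustrates the building blocks named above, not the paper's exact formulation; the network names and the weight alpha are placeholders.

```python
import torch

# Hedged sketch (NOT the paper's exact formulation): squared loss with target 0
# over ALL m*n pairs, weighted by alpha, plus a correction on observed pairs.
# The dense term never materializes the m x n prediction matrix; it uses
#   sum_{i,j} (p_i^T q_j)^2 = trace((P^T P)(Q^T Q)).

torch.manual_seed(0)
m, n, d_u, d_v, k = 500, 800, 32, 32, 16

# Hypothetical nonlinear embedding networks for the two sides (placeholder names).
f_user = torch.nn.Sequential(torch.nn.Linear(d_u, 64), torch.nn.Tanh(), torch.nn.Linear(64, k))
g_item = torch.nn.Sequential(torch.nn.Linear(d_v, 64), torch.nn.Tanh(), torch.nn.Linear(64, k))

U = torch.randn(m, d_u)                      # user-side features
V = torch.randn(n, d_v)                      # item-side features
rows = torch.randint(0, m, (5000,))          # indices of observed pairs (synthetic)
cols = torch.randint(0, n, (5000,))
y = torch.ones(5000)                         # targets for observed pairs
alpha = 0.1                                  # weight on the all-pairs term

def loss_fn():
    P = f_user(U)                            # m x k embeddings
    Q = g_item(V)                            # n x k embeddings
    # Dense term over all m*n pairs in O((m+n)k^2) time via the k x k Gram matrices.
    dense = ((P.T @ P) * (Q.T @ Q)).sum()
    # Sparse term over the observed pairs only.
    pred = (P[rows] * Q[cols]).sum(dim=1)
    sparse = ((pred - y) ** 2).sum()
    return sparse + alpha * dense

params = list(f_user.parameters()) + list(g_item.parameters())

# Function value, gradient, and Hessian-vector product (double backpropagation):
# the building blocks needed by truncated Newton / conjugate-gradient methods.
loss = loss_fn()
grads = torch.autograd.grad(loss, params, create_graph=True)
v = [torch.randn_like(p) for p in params]    # arbitrary direction for the HVP
gv = sum((g * vi).sum() for g, vi in zip(grads, v))
hvp = torch.autograd.grad(gv, params)        # H v, without ever forming H
print(loss.item(), sum(h.norm() for h in hvp).item())
```

With these three quantities available at this cost, optimizers that need Hessian-vector products become practical even though the number of pairs is m*n.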
Related papers
- Efficient Optimization Algorithms for Linear Adversarial Training [9.933836677441684]
Adversarial training can be used to learn models that are robust against perturbations.
We propose tailored optimization algorithms for the adversarial training of linear models.
arXiv Detail & Related papers (2024-10-16T15:41:08Z)
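As a side note on why the linear case is special: for a linear model the inner maximization of adversarial training has a well-known closed form, so no attack loop is needed. The sketch below shows this for logistic loss under an l-infinity-bounded perturbation (the worst case shifts the margin by eps times the l1 norm of w); it is a generic illustration, not the tailored algorithms proposed in the paper above.

```python
import numpy as np

# Generic illustration (not the paper's algorithm): for a linear model and a
# loss that decreases in the margin y * (w^T x + b), the inner maximization of
# adversarial training over ||delta||_inf <= eps has the closed form
#   max_delta loss(y * (w^T (x + delta) + b)) = loss(y * (w^T x + b) - eps * ||w||_1),
# so the robust objective can be minimized directly with (sub)gradient steps.

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.sign(X[:, 0] + 0.1 * rng.normal(size=200))   # labels in {-1, +1}
w, b, eps, lr = np.zeros(5), 0.0, 0.1, 0.5

for _ in range(200):
    margin = y * (X @ w + b) - eps * np.linalg.norm(w, 1)
    dldm = -1.0 / (1.0 + np.exp(np.clip(margin, -30, 30)))   # d(logistic loss)/d(margin)
    grad_w = X.T @ (dldm * y) / len(y) - eps * np.sign(w) * dldm.mean()
    grad_b = float(np.mean(dldm * y))
    w -= lr * grad_w
    b -= lr * grad_b

robust_loss = np.mean(np.logaddexp(0.0, -(y * (X @ w + b) - eps * np.linalg.norm(w, 1))))
print("worst-case logistic loss:", robust_loss)
```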
- Learning to optimize with convergence guarantees using nonlinear system theory [0.4143603294943439]
We propose an unconstrained parametrization of algorithms for smooth objective functions.
Notably, our framework is directly compatible with automatic differentiation tools.
arXiv Detail & Related papers (2024-03-14T13:40:26Z)
- Linearization Algorithms for Fully Composite Optimization [61.20539085730636]
This paper studies first-order algorithms for solving fully composite optimization problems over convex compact sets.
We leverage the structure of the objective by handling the differentiable and non-differentiable parts separately, linearizing only the smooth parts.
arXiv Detail & Related papers (2023-02-24T18:41:48Z)
- Simple Stochastic and Online Gradient Descent Algorithms for Pairwise Learning [65.54757265434465]
Pairwise learning refers to learning tasks where the loss function depends on a pair of instances.
Online gradient descent (OGD) is a popular approach to handling streaming data in pairwise learning.
In this paper, we propose simple stochastic and online gradient descent methods for pairwise learning.
arXiv Detail & Related papers (2021-11-23T18:10:48Z)
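A minimal illustration of the pairwise-learning setting referenced above: because the loss couples two instances, a stochastic step samples a pair rather than a single example. The sketch below runs plain SGD on a pairwise hinge loss for AUC-style bipartite ranking; it is a generic example of the setting, not the specific algorithms of that paper.

```python
import numpy as np

# Generic pairwise-learning sketch: each stochastic step samples one positive
# and one negative instance and takes a subgradient step on the pairwise
# hinge loss max(0, 1 - w^T (x_pos - x_neg)), a standard AUC surrogate.

rng = np.random.default_rng(0)
d = 10
X_pos = rng.normal(loc=+0.5, size=(500, d))   # synthetic positive class
X_neg = rng.normal(loc=-0.5, size=(500, d))   # synthetic negative class

w = np.zeros(d)
lr = 0.01
for _ in range(5000):
    xp = X_pos[rng.integers(len(X_pos))]      # sample a pair of instances
    xn = X_neg[rng.integers(len(X_neg))]
    if w @ (xp - xn) < 1.0:                   # subgradient of the hinge loss
        w += lr * (xp - xn)

# Empirical AUC: fraction of positive-negative pairs ranked correctly.
auc = np.mean((X_pos @ w)[:, None] > (X_neg @ w)[None, :])
print("empirical AUC:", auc)
```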
- Divide and Learn: A Divide and Conquer Approach for Predict+Optimize [50.03608569227359]
The predict+optimize problem combines machine learning of problem coefficients with an optimization problem that uses the predicted coefficients.
We show how to directly express the loss of the optimization problem in terms of the predicted coefficients as a piecewise linear function.
We propose a novel divide-and-conquer algorithm to tackle optimization problems without this restriction and to predict their coefficients using the optimization loss.
arXiv Detail & Related papers (2020-12-04T00:26:56Z)
- Kernel methods through the roof: handling billions of points efficiently [94.31450736250918]
Kernel methods provide an elegant and principled approach to nonparametric learning, but so far they could hardly be used in large-scale problems.
Recent advances have shown the benefits of a number of algorithmic ideas, for example combining optimization, numerical linear algebra and random projections.
Here, we push these efforts further to develop and test a solver that takes full advantage of GPU hardware.
arXiv Detail & Related papers (2020-06-18T08:16:25Z)
- Learning to Guide Random Search [111.71167792453473]
We consider derivative-free optimization of a high-dimensional function that lies on a latent low-dimensional manifold.
We develop an online learning approach that learns this manifold while performing the optimization.
We empirically evaluate the method on continuous optimization benchmarks and high-dimensional continuous control problems.
arXiv Detail & Related papers (2020-04-25T19:21:14Z)
- Learning with Differentiable Perturbed Optimizers [54.351317101356614]
We propose a systematic method to transform optimizers into operations that are differentiable and never locally constant.
Our approach relies on stochastically perturbed optimizers, and can be used readily together with existing solvers.
We show how this framework can be connected to a family of losses developed in structured prediction, and give theoretical guarantees for their use in learning tasks.
arXiv Detail & Related papers (2020-02-20T11:11:32Z)
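For intuition on the perturbation idea in the last entry: a hard argmax is piecewise constant, so its derivative is zero almost everywhere, while the expectation of the argmax under perturbed scores is smooth. The sketch below estimates that expectation by Monte Carlo and, since Gumbel noise makes it equal softmax(theta/eps), checks it against the closed form. It is a simplified, hedged illustration of the general construction, not the paper's full framework.

```python
import numpy as np

# Simplified perturbed-optimizer sketch: the perturbed map
#   y_eps(theta) = E_Z[ argmax_y <y, theta + eps * Z> ]
# over one-hot vectors is smooth and differentiable, unlike the hard argmax.
# With Gumbel noise Z it equals softmax(theta / eps) (Gumbel-max trick),
# which we use here as a correctness check of the Monte Carlo estimate.

rng = np.random.default_rng(0)

def perturbed_argmax_mc(theta, eps=0.5, n_samples=100_000):
    """Monte Carlo estimate of E[argmax(theta + eps * Z)] with Gumbel noise Z."""
    z = rng.gumbel(size=(n_samples, len(theta)))
    idx = np.argmax(theta + eps * z, axis=1)
    return np.bincount(idx, minlength=len(theta)) / n_samples

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

theta = np.array([1.0, 0.5, -0.2])
eps = 0.5
print(perturbed_argmax_mc(theta, eps))   # smooth surrogate for the hard argmax
print(softmax(theta / eps))              # closed form under Gumbel perturbations
```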
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.