Efficient Optimization Methods for Extreme Similarity Learning with
Nonlinear Embeddings
- URL: http://arxiv.org/abs/2010.13511v2
- Date: Tue, 15 Jun 2021 11:53:13 GMT
- Title: Efficient Optimization Methods for Extreme Similarity Learning with
Nonlinear Embeddings
- Authors: Bowen Yuan, Yu-Sheng Li, Pengrui Quan, Chih-Jen Lin
- Abstract summary: We study the problem of learning similarity by using nonlinear embedding models from all possible pairs.
This paper aims to extend results developed for linear embeddings to general nonlinear embeddings.
- Score: 12.119973339250848
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of learning similarity by using nonlinear embedding
models (e.g., neural networks) from all possible pairs. This problem is well
known for the difficulty of training with an extreme number of pairs. For
the special case of using linear embeddings, many studies have addressed this
issue of handling all pairs by considering certain loss functions and
developing efficient optimization algorithms. This paper aims to extend results
for general nonlinear embeddings. First, we complete detailed derivations and
provide clean formulations for efficiently calculating the building blocks of
optimization algorithms, such as the function value, the gradient, and
Hessian-vector products. These results enable the use of many optimization
methods for extreme similarity learning with nonlinear embeddings. Second, we study
some optimization methods in detail. Because nonlinear embeddings are used,
implementation issues that differ from the linear case are addressed. In the end,
some methods are shown to be highly efficient for extreme similarity learning
with nonlinear embeddings.
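To make the computational issue concrete, below is a minimal, hedged sketch of one common all-pairs objective: a squared loss with target zero over every one of the m*n pairs, weighted by a constant, plus a correction on the observed pairs. The dense term is evaluated through the standard Gram-matrix identity sum_{i,j} (p_i^T q_j)^2 = trace((P^T P)(Q^T Q)), which costs O((m+n)k^2) instead of O(mnk), and the gradient and Hessian-vector product come from (double) backpropagation. This illustrates the building blocks named above, not the paper's exact formulation; the network names and the weight alpha are placeholders.

```python
import torch

# Hedged sketch (NOT the paper's exact formulation): squared loss with target 0
# over ALL m*n pairs, weighted by alpha, plus a correction on observed pairs.
# The dense term never materializes the m x n prediction matrix; it uses
#   sum_{i,j} (p_i^T q_j)^2 = trace((P^T P)(Q^T Q)).

torch.manual_seed(0)
m, n, d_u, d_v, k = 500, 800, 32, 32, 16

# Hypothetical nonlinear embedding networks for the two sides (placeholder names).
f_user = torch.nn.Sequential(torch.nn.Linear(d_u, 64), torch.nn.Tanh(), torch.nn.Linear(64, k))
g_item = torch.nn.Sequential(torch.nn.Linear(d_v, 64), torch.nn.Tanh(), torch.nn.Linear(64, k))

U = torch.randn(m, d_u)                      # user-side features
V = torch.randn(n, d_v)                      # item-side features
rows = torch.randint(0, m, (5000,))          # indices of observed pairs (synthetic)
cols = torch.randint(0, n, (5000,))
y = torch.ones(5000)                         # targets for observed pairs
alpha = 0.1                                  # weight on the all-pairs term

def loss_fn():
    P = f_user(U)                            # m x k embeddings
    Q = g_item(V)                            # n x k embeddings
    # Dense term over all m*n pairs in O((m+n)k^2) time via the k x k Gram matrices.
    dense = ((P.T @ P) * (Q.T @ Q)).sum()
    # Sparse term over the observed pairs only.
    pred = (P[rows] * Q[cols]).sum(dim=1)
    sparse = ((pred - y) ** 2).sum()
    return sparse + alpha * dense

params = list(f_user.parameters()) + list(g_item.parameters())

# Function value, gradient, and Hessian-vector product (double backpropagation):
# the building blocks needed by truncated Newton / conjugate-gradient methods.
loss = loss_fn()
grads = torch.autograd.grad(loss, params, create_graph=True)
v = [torch.randn_like(p) for p in params]    # arbitrary direction for the HVP
gv = sum((g * vi).sum() for g, vi in zip(grads, v))
hvp = torch.autograd.grad(gv, params)        # H v, without ever forming H
print(loss.item(), sum(h.norm() for h in hvp).item())
```

With these three quantities available at this cost, optimizers that need Hessian-vector products become practical even though the number of pairs is m*n.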
Related papers
- Efficient Optimization Algorithms for Linear Adversarial Training [9.933836677441684]
Adversarial training can be used to learn models that are robust against perturbations.
We propose tailored optimization algorithms for the adversarial training of linear models.
arXiv Detail & Related papers (2024-10-16T15:41:08Z)
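As a side note on why the linear case is special: for a linear model the inner maximization of adversarial training has a well-known closed form, so no attack loop is needed. The sketch below shows this for logistic loss under an l-infinity-bounded perturbation (the worst case shifts the margin by eps times the l1 norm of w); it is a generic illustration, not the tailored algorithms proposed in the paper above.

```python
import numpy as np

# Generic illustration (not the paper's algorithm): for a linear model and a
# loss that decreases in the margin y * (w^T x + b), the inner maximization of
# adversarial training over ||delta||_inf <= eps has the closed form
#   max_delta loss(y * (w^T (x + delta) + b)) = loss(y * (w^T x + b) - eps * ||w||_1),
# so the robust objective can be minimized directly with (sub)gradient steps.

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.sign(X[:, 0] + 0.1 * rng.normal(size=200))   # labels in {-1, +1}
w, b, eps, lr = np.zeros(5), 0.0, 0.1, 0.5

for _ in range(200):
    margin = y * (X @ w + b) - eps * np.linalg.norm(w, 1)
    dldm = -1.0 / (1.0 + np.exp(np.clip(margin, -30, 30)))   # d(logistic loss)/d(margin)
    grad_w = X.T @ (dldm * y) / len(y) - eps * np.sign(w) * dldm.mean()
    grad_b = float(np.mean(dldm * y))
    w -= lr * grad_w
    b -= lr * grad_b

robust_loss = np.mean(np.logaddexp(0.0, -(y * (X @ w + b) - eps * np.linalg.norm(w, 1))))
print("worst-case logistic loss:", robust_loss)
```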
- Learning to optimize with convergence guarantees using nonlinear system theory [0.4143603294943439]
We propose an unconstrained parametrization of algorithms for smooth objective functions.
Notably, our framework is directly compatible with automatic differentiation tools.
arXiv Detail & Related papers (2024-03-14T13:40:26Z)
- Linearization Algorithms for Fully Composite Optimization [61.20539085730636]
This paper studies first-order algorithms for solving fully composite optimization problems over convex compact sets.
We leverage the structure of the objective by handling the differentiable and non-differentiable parts separately, linearizing only the smooth parts.
arXiv Detail & Related papers (2023-02-24T18:41:48Z)
- Simple Stochastic and Online Gradient Descent Algorithms for Pairwise Learning [65.54757265434465]
Pairwise learning refers to learning tasks where the loss function depends on a pair of instances.
Online gradient descent (OGD) is a popular approach to handling streaming data in pairwise learning.
In this paper, we propose simple stochastic and online gradient descent methods for pairwise learning.
arXiv Detail & Related papers (2021-11-23T18:10:48Z)
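A minimal illustration of the pairwise-learning setting referenced above: because the loss couples two instances, a stochastic step samples a pair rather than a single example. The sketch below runs plain SGD on a pairwise hinge loss for AUC-style bipartite ranking; it is a generic example of the setting, not the specific algorithms of that paper.

```python
import numpy as np

# Generic pairwise-learning sketch: each stochastic step samples one positive
# and one negative instance and takes a subgradient step on the pairwise
# hinge loss max(0, 1 - w^T (x_pos - x_neg)), a standard AUC surrogate.

rng = np.random.default_rng(0)
d = 10
X_pos = rng.normal(loc=+0.5, size=(500, d))   # synthetic positive class
X_neg = rng.normal(loc=-0.5, size=(500, d))   # synthetic negative class

w = np.zeros(d)
lr = 0.01
for _ in range(5000):
    xp = X_pos[rng.integers(len(X_pos))]      # sample a pair of instances
    xn = X_neg[rng.integers(len(X_neg))]
    if w @ (xp - xn) < 1.0:                   # subgradient of the hinge loss
        w += lr * (xp - xn)

# Empirical AUC: fraction of positive-negative pairs ranked correctly.
auc = np.mean((X_pos @ w)[:, None] > (X_neg @ w)[None, :])
print("empirical AUC:", auc)
```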
- Divide and Learn: A Divide and Conquer Approach for Predict+Optimize [50.03608569227359]
The predict+optimize problem combines machine learning of problem coefficients with an optimization problem that uses the predicted coefficients.
We show how to directly express the loss of the optimization problem in terms of the predicted coefficients as a piecewise linear function.
We propose a novel divide-and-conquer algorithm to tackle optimization problems without this restriction and to predict their coefficients using the optimization loss.
arXiv Detail & Related papers (2020-12-04T00:26:56Z)
- Kernel methods through the roof: handling billions of points efficiently [94.31450736250918]
Kernel methods provide an elegant and principled approach to nonparametric learning, but so far they could hardly be used in large-scale problems.
Recent advances have shown the benefits of a number of algorithmic ideas, for example combining optimization, numerical linear algebra and random projections.
Here, we push these efforts further to develop and test a solver that takes full advantage of GPU hardware.
arXiv Detail & Related papers (2020-06-18T08:16:25Z)
- Learning to Guide Random Search [111.71167792453473]
We consider derivative-free optimization of a high-dimensional function that lies on a latent low-dimensional manifold.
We develop an online learning approach that learns this manifold while performing the optimization.
We empirically evaluate the method on continuous optimization benchmarks and high-dimensional continuous control problems.
arXiv Detail & Related papers (2020-04-25T19:21:14Z)
- Learning with Differentiable Perturbed Optimizers [54.351317101356614]
We propose a systematic method to transform optimizers into operations that are differentiable and never locally constant.
Our approach relies on stochastically perturbed optimizers, and can be used readily together with existing solvers.
We show how this framework can be connected to a family of losses developed in structured prediction, and give theoretical guarantees for their use in learning tasks.
arXiv Detail & Related papers (2020-02-20T11:11:32Z)
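For intuition on the perturbation idea in the last entry: a hard argmax is piecewise constant, so its derivative is zero almost everywhere, while the expectation of the argmax under perturbed scores is smooth. The sketch below estimates that expectation by Monte Carlo and, since Gumbel noise makes it equal softmax(theta/eps), checks it against the closed form. It is a simplified, hedged illustration of the general construction, not the paper's full framework.

```python
import numpy as np

# Simplified perturbed-optimizer sketch: the perturbed map
#   y_eps(theta) = E_Z[ argmax_y <y, theta + eps * Z> ]
# over one-hot vectors is smooth and differentiable, unlike the hard argmax.
# With Gumbel noise Z it equals softmax(theta / eps) (Gumbel-max trick),
# which we use here as a correctness check of the Monte Carlo estimate.

rng = np.random.default_rng(0)

def perturbed_argmax_mc(theta, eps=0.5, n_samples=100_000):
    """Monte Carlo estimate of E[argmax(theta + eps * Z)] with Gumbel noise Z."""
    z = rng.gumbel(size=(n_samples, len(theta)))
    idx = np.argmax(theta + eps * z, axis=1)
    return np.bincount(idx, minlength=len(theta)) / n_samples

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

theta = np.array([1.0, 0.5, -0.2])
eps = 0.5
print(perturbed_argmax_mc(theta, eps))   # smooth surrogate for the hard argmax
print(softmax(theta / eps))              # closed form under Gumbel perturbations
```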
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.