Out-of-Sample Embedding with Proximity Data: Projection versus Restricted Reconstruction
- URL: http://arxiv.org/abs/2505.06756v1
- Date: Sat, 10 May 2025 21:11:30 GMT
- Title: Out-of-Sample Embedding with Proximity Data: Projection versus Restricted Reconstruction
- Authors: Michael W. Trosset, Kaiyi Tan, Minh Tang, Carey E. Priebe,
- Abstract summary: The problem of "adding a point to a vector diagram" was first studied by J.C. Gower in 1968.<n>Various circumstances may warrant either projection or restricted reconstruction.
- Score: 12.260736438853604
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The problem of using proximity (similarity or dissimilarity) data for the purpose of "adding a point to a vector diagram" was first studied by J.C. Gower in 1968. Since then, a number of methods -- mostly kernel methods -- have been proposed for solving what has come to be called the problem of *out-of-sample embedding*. We survey the various kernel methods that we have encountered and show that each can be derived from one or the other of two competing strategies: *projection* or *restricted reconstruction*. Projection can be analogized to a well-known formula for adding a point to a principal component analysis. Restricted reconstruction poses a different challenge: how to best approximate redoing the entire multivariate analysis while holding fixed the vector diagram that was previously obtained. This strategy results in a nonlinear optimization problem that can be simplified to a unidimensional search. Various circumstances may warrant either projection or restricted reconstruction.
Related papers
- The Differentiable Feasibility Pump [49.55771920271201]
This paper shows that the traditional feasibility pump and many of its follow-ups can be seen as gradient-descent algorithms with specific parameters.
A central aspect of this reinterpretation is observing that the traditional algorithm differentiates the solution of the linear relaxation with respect to its cost.
arXiv Detail & Related papers (2024-11-05T22:26:51Z) - A Sample Efficient Alternating Minimization-based Algorithm For Robust Phase Retrieval [56.67706781191521]
In this work, we present a robust phase retrieval problem where the task is to recover an unknown signal.
Our proposed oracle avoids the need for computationally spectral descent, using a simple gradient step and outliers.
arXiv Detail & Related papers (2024-09-07T06:37:23Z) - Disentangled Representation Learning with the Gromov-Monge Gap [65.73194652234848]
Learning disentangled representations from unlabelled data is a fundamental challenge in machine learning.
We introduce a novel approach to disentangled representation learning based on quadratic optimal transport.
We demonstrate the effectiveness of our approach for quantifying disentanglement across four standard benchmarks.
arXiv Detail & Related papers (2024-07-10T16:51:32Z) - Optimizing Solution-Samplers for Combinatorial Problems: The Landscape
of Policy-Gradient Methods [52.0617030129699]
We introduce a novel theoretical framework for analyzing the effectiveness of DeepMatching Networks and Reinforcement Learning methods.
Our main contribution holds for a broad class of problems including Max-and Min-Cut, Max-$k$-Bipartite-Bi, Maximum-Weight-Bipartite-Bi, and Traveling Salesman Problem.
As a byproduct of our analysis we introduce a novel regularization process over vanilla descent and provide theoretical and experimental evidence that it helps address vanishing-gradient issues and escape bad stationary points.
arXiv Detail & Related papers (2023-10-08T23:39:38Z) - Linearization Algorithms for Fully Composite Optimization [61.20539085730636]
This paper studies first-order algorithms for solving fully composite optimization problems convex compact sets.
We leverage the structure of the objective by handling differentiable and non-differentiable separately, linearizing only the smooth parts.
arXiv Detail & Related papers (2023-02-24T18:41:48Z) - $\ell_1$-norm constrained multi-block sparse canonical correlation
analysis via proximal gradient descent [0.0]
We propose a proximal gradient descent algorithm to solve the multi-block CCA problem.
We show that the resulting estimate is rate-optimal under suitable assumptions.
We also describe an easy-to-implement deflation procedure to estimate multiple eigenvectors sequentially.
arXiv Detail & Related papers (2022-01-14T03:35:01Z) - Probabilistic methods for approximate archetypal analysis [8.829245587252435]
Archetypal analysis is an unsupervised learning method for exploratory data analysis.
We introduce two preprocessing techniques to reduce the dimension and representation cardinality of the data.
We demonstrate the usefulness of our results by applying our method to summarize several moderately large-scale datasets.
arXiv Detail & Related papers (2021-08-12T14:27:11Z) - Analysis of Truncated Orthogonal Iteration for Sparse Eigenvector
Problems [78.95866278697777]
We propose two variants of the Truncated Orthogonal Iteration to compute multiple leading eigenvectors with sparsity constraints simultaneously.
We then apply our algorithms to solve the sparse principle component analysis problem for a wide range of test datasets.
arXiv Detail & Related papers (2021-03-24T23:11:32Z) - New Methods for Detecting Concentric Objects With High Accuracy [0.0]
Fitting geometric objects to digitized data is an important problem in many areas such as iris detection, autonomous navigation, and industrial robotics operations.
There are two common approaches to fitting geometric shapes to data: the geometric (iterative) approach and algebraic (non-iterative) approach.
We develop new estimators, which can be used as reliable initial guesses for other iterative methods.
arXiv Detail & Related papers (2021-02-16T08:19:18Z) - Efficient Algorithms for Multidimensional Segmented Regression [42.046881924063044]
We study the fundamental problem of fixed design em multidimensional regression.
We provide the first sample and computationally efficient algorithm for this problem in any fixed dimension.
Our algorithm relies on a simple merging iterative approach, which is novel in the multidimensional setting.
arXiv Detail & Related papers (2020-03-24T19:39:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.