Probabilistic methods for approximate archetypal analysis
- URL: http://arxiv.org/abs/2108.05767v2
- Date: Mon, 16 Aug 2021 14:15:28 GMT
- Title: Probabilistic methods for approximate archetypal analysis
- Authors: Ruijian Han, Braxton Osting, Dong Wang, Yiming Xu
- Abstract summary: Archetypal analysis is an unsupervised learning method for exploratory data analysis.
We introduce two preprocessing techniques to reduce the dimension and representation cardinality of the data.
We demonstrate the usefulness of our results by applying our method to summarize several moderately large-scale datasets.
- Score: 8.829245587252435
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Archetypal analysis is an unsupervised learning method for exploratory data
analysis. One major challenge that limits the applicability of archetypal
analysis in practice is the inherent computational complexity of the existing
algorithms. In this paper, we provide a novel approximation approach to
partially address this issue. Utilizing probabilistic ideas from
high-dimensional geometry, we introduce two preprocessing techniques to reduce
the dimension and representation cardinality of the data, respectively. We
prove that, provided the data is approximately embedded in a low-dimensional
linear subspace and the convex hull of the corresponding representations is
well approximated by a polytope with a few vertices, our method can effectively
reduce the scaling of archetypal analysis. Moreover, the solution of the
reduced problem is near-optimal in terms of prediction errors. Our approach can
be combined with other acceleration techniques to further mitigate the
intrinsic complexity of archetypal analysis. We demonstrate the usefulness of
our results by applying our method to summarize several moderately large-scale
datasets.
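For orientation, archetypal analysis seeks stochastic matrices A (n x p) and B (p x n), with rows on the probability simplex, minimizing ||X - ABX||_F^2, so that the p archetypes Z = BX lie in the convex hull of the data. The abstract does not spell out its two preprocessing steps, so the sketch below is only a minimal illustration of the "reduce, then solve" idea under stated assumptions: a Gaussian (Johnson-Lindenstrauss-style) random projection stands in for the dimension-reduction step, a random-direction heuristic for keeping likely convex-hull vertices stands in for the cardinality-reduction step, and a plain alternating projected-gradient method stands in for the archetypal analysis solver. All function names are hypothetical; this is not the authors' implementation.
```python
import numpy as np

def random_projection(X, k, rng):
    """Dimension reduction (assumed step): Gaussian JL-style sketch, d -> k."""
    G = rng.standard_normal((X.shape[1], k)) / np.sqrt(k)
    return X @ G

def approximate_hull_vertices(X, m, rng, oversample=10):
    """Cardinality reduction (assumed heuristic, not the paper's exact step):
    the maximizer of a random linear functional is almost surely a vertex of
    the convex hull, so keep the winners over many random directions."""
    W = rng.standard_normal((X.shape[1], oversample * m))
    winners = np.unique(np.argmax(X @ W, axis=0))
    return winners if winners.size <= m else rng.choice(winners, m, replace=False)

def project_rows_to_simplex(V):
    """Euclidean projection of each row of V onto the probability simplex."""
    n, p = V.shape
    U = np.sort(V, axis=1)[:, ::-1]
    css = np.cumsum(U, axis=1) - 1.0
    cond = U - css / np.arange(1, p + 1) > 0
    rho = p - 1 - np.argmax(cond[:, ::-1], axis=1)  # last index where cond holds
    theta = css[np.arange(n), rho] / (rho + 1)
    return np.maximum(V - theta[:, None], 0.0)

def archetypal_analysis(X, p, n_iter=300, rng=None):
    """Alternating projected gradient for min ||X - A B X||_F^2 with the rows
    of A (n x p) and B (p x n) constrained to the probability simplex."""
    rng = np.random.default_rng() if rng is None else rng
    n = X.shape[0]
    A = project_rows_to_simplex(rng.random((n, p)))
    B = project_rows_to_simplex(rng.random((p, n)))
    for _ in range(n_iter):
        Z = B @ X                                     # current archetypes, p x k
        LA = np.linalg.norm(Z, 2) ** 2 + 1e-12        # 1/L gradient step in A
        A = project_rows_to_simplex(A - ((A @ Z - X) @ Z.T) / LA)
        R = A @ (B @ X) - X                           # residual after A update
        LB = (np.linalg.norm(A, 2) * np.linalg.norm(X, 2)) ** 2 + 1e-12
        B = project_rows_to_simplex(B - (A.T @ R @ X.T) / LB)
    return A, B

# Reduce first, then solve: the expensive AA iterations touch only m points in
# k dimensions instead of n points in d dimensions.
rng = np.random.default_rng(0)
X = rng.standard_normal((5000, 300))                  # toy stand-in for a dataset
Y = random_projection(X, k=20, rng=rng)               # dimension reduction
keep = approximate_hull_vertices(Y, m=200, rng=rng)   # representation reduction
A, B = archetypal_analysis(Y[keep], p=5, rng=rng)
archetypes = B @ Y[keep]                              # p archetypes in sketch space
```
The scaling benefit comes from the reduction itself: the solver runs on the m retained points in k sketched dimensions rather than on the full data, which mirrors the paper's claim that, when the data is nearly low-dimensional and its convex hull is well approximated by a polytope with few vertices, the reduced problem is much cheaper while its solution remains near-optimal in prediction error.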
Related papers
- Efficient Fairness-Performance Pareto Front Computation [51.558848491038916]
We show that optimal fair representations possess several useful structural properties.
We then show that these approximation problems can be solved efficiently via concave programming methods.
arXiv Detail & Related papers (2024-09-26T08:46:48Z)
- Mixed Matrix Completion in Complex Survey Sampling under Heterogeneous Missingness [6.278498348219109]
We propose a fast and scalable estimation algorithm that achieves sublinear convergence.
The proposed method is applied to analyze the National Health and Nutrition Examination Survey data.
arXiv Detail & Related papers (2024-02-06T12:26:58Z)
- Optimizing Solution-Samplers for Combinatorial Problems: The Landscape of Policy-Gradient Methods [52.0617030129699]
We introduce a novel theoretical framework for analyzing the effectiveness of DeepMatching Networks and Reinforcement Learning methods.
Our main contribution holds for a broad class of problems including Max- and Min-Cut, Max-$k$-CSP, Maximum-Weight-Bipartite-Matching, and the Traveling Salesman Problem.
As a byproduct of our analysis we introduce a novel regularization process over vanilla descent and provide theoretical and experimental evidence that it helps address vanishing-gradient issues and escape bad stationary points.
arXiv Detail & Related papers (2023-10-08T23:39:38Z)
- Linearization Algorithms for Fully Composite Optimization [61.20539085730636]
This paper studies first-order algorithms for solving fully composite optimization problems over convex compact sets.
We leverage the structure of the objective by handling its differentiable and non-differentiable parts separately, linearizing only the smooth components.
arXiv Detail & Related papers (2023-02-24T18:41:48Z)
- Entropic Descent Archetypal Analysis for Blind Hyperspectral Unmixing [45.82374977939355]
We introduce a new algorithm based on archetypal analysis for blind hyperspectral unmixing.
By using six standard real datasets, we show that our approach outperforms state-of-the-art matrix factorization and recent deep learning methods.
arXiv Detail & Related papers (2022-09-22T13:34:21Z)
- On Riemannian Approach for Constrained Optimization Model in Extreme Classification Problems [2.7436792484073638]
A constrained optimization problem is formulated as an optimization problem on a matrix manifold.
The proposed approach is tested on several real world large scale multi-label datasets.
arXiv Detail & Related papers (2021-09-30T11:28:35Z)
- Analysis of Truncated Orthogonal Iteration for Sparse Eigenvector Problems [78.95866278697777]
We propose two variants of the Truncated Orthogonal Iteration to compute multiple leading eigenvectors with sparsity constraints simultaneously.
We then apply our algorithms to solve the sparse principal component analysis problem for a wide range of test datasets.
arXiv Detail & Related papers (2021-03-24T23:11:32Z)
- A Forward Backward Greedy approach for Sparse Multiscale Learning [0.0]
We propose a feature driven Reproducing Kernel Hilbert space (RKHS) for which the associated kernel has a weighted multiscale structure.
For generating approximations in this space, we provide a practical forward-backward algorithm that is shown to greedily construct a set of basis functions having a multiscale structure.
We analyze the performance of the approach on a variety of simulation and real data sets.
arXiv Detail & Related papers (2021-02-14T04:22:52Z)
- Optimal oracle inequalities for solving projected fixed-point equations [53.31620399640334]
We study methods that use a collection of random observations to compute approximate solutions by searching over a known low-dimensional subspace of the Hilbert space.
We show how our results precisely characterize the error of a class of temporal difference learning methods for the policy evaluation problem with linear function approximation.
arXiv Detail & Related papers (2020-12-09T20:19:32Z)
- Convex Geometry and Duality of Over-parameterized Neural Networks [70.15611146583068]
We develop a convex analytic approach to analyze finite width two-layer ReLU networks.
We show that an optimal solution to the regularized training problem can be characterized as extreme points of a convex set.
In higher dimensions, we show that the training problem can be cast as a finite dimensional convex problem with infinitely many constraints.
arXiv Detail & Related papers (2020-02-25T23:05:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.