Principal Component Analysis in Space Forms
- URL: http://arxiv.org/abs/2301.02750v2
- Date: Tue, 9 Jul 2024 19:09:28 GMT
- Title: Principal Component Analysis in Space Forms
- Authors: Puoya Tabaghi, Michael Khanzadeh, Yusu Wang, Siavash Mirarab
- Abstract summary: We study Principal Component Analysis (PCA) in space forms.
Finding the optimal low-dimensional affine subspace for given points in a space form amounts to dimensionality reduction.
We propose proper cost functions that enjoy two properties: (1) their optimal affine subspace is the solution to an eigenequation, and (2) optimal affine subspaces of different dimensions form a nested set.
- Score: 7.822210329345704
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Principal Component Analysis (PCA) is a workhorse of modern data science. While PCA assumes the data conforms to Euclidean geometry, for specific data types, such as hierarchical and cyclic data structures, other spaces are more appropriate. We study PCA in space forms; that is, those with constant curvatures. At a point on a Riemannian manifold, we can define a Riemannian affine subspace based on a set of tangent vectors. Finding the optimal low-dimensional affine subspace for given points in a space form amounts to dimensionality reduction. Our Space Form PCA (SFPCA) seeks the affine subspace that best represents a set of manifold-valued points with the minimum projection cost. We propose proper cost functions that enjoy two properties: (1) their optimal affine subspace is the solution to an eigenequation, and (2) optimal affine subspaces of different dimensions form a nested set. These properties provide advances over existing methods, which are mostly iterative algorithms with slow convergence and weaker theoretical guarantees. We evaluate the proposed SFPCA on real and simulated data in spherical and hyperbolic spaces. We show that it outperforms alternative methods in estimating true subspaces (in simulated data) with respect to convergence speed or accuracy, often both.
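The two properties claimed for the cost functions already hold for classical Euclidean PCA (the zero-curvature space form), which makes them concrete. The sketch below is an illustrative Euclidean analogue on synthetic data, not the authors' SFPCA: the optimal affine subspace solves an eigenequation on the covariance matrix, nestedness follows from the eigenvector ordering, and the projection cost shrinks monotonically as the subspace dimension grows.

```python
import numpy as np

# Hedged Euclidean sketch (zero-curvature space form only; synthetic data):
# property (1): the optimal affine subspace solves an eigenequation C v = l v;
# property (2): optimal subspaces of different dimensions are nested.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ np.diag([3.0, 2.0, 1.0, 0.5, 0.1])

mu = X.mean(axis=0)                      # base point of the affine subspace
Xc = X - mu
C = Xc.T @ Xc / len(X)                   # sample covariance matrix
eigvals, eigvecs = np.linalg.eigh(C)     # eigh returns ascending eigenvalues
V = eigvecs[:, ::-1]                     # columns by decreasing eigenvalue

# The optimal k-dim subspace is spanned by the top-k columns of V, so the
# (k-1)-dim solution is contained in the k-dim one: nested by construction.
costs = []
for k in range(6):
    P = V[:, :k] @ V[:, :k].T            # orthogonal projector onto subspace
    costs.append(float(np.sum((Xc - Xc @ P) ** 2)))  # residual projection cost

# Projection cost can only shrink as the subspace dimension grows.
assert all(a >= b for a, b in zip(costs, costs[1:]))
```

In curved space forms the projection and base point are no longer a simple mean and linear projector, which is where the paper's proper cost functions come in; the eigenequation structure is what both settings share.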
Related papers
- Optimal Bound for PCA with Outliers using Higher-Degree Voronoi Diagrams [0.0]
We introduce new algorithms for Principal Component Analysis (PCA) with outliers.
We navigate to the optimal subspace for PCA even in the presence of outliers.
This approach achieves an optimal solution with a time complexity of $n^{d+\mathcal{O}(1)}\text{poly}(n,d)$.
arXiv Detail & Related papers (2024-08-13T13:05:36Z) - Information Entropy Initialized Concrete Autoencoder for Optimal Sensor Placement and Reconstruction of Geophysical Fields [58.720142291102135]
We propose a new approach to the optimal placement of sensors for reconstructing geophysical fields from sparse measurements.
We demonstrate our method on the two examples: (a) temperature and (b) salinity fields around the Barents Sea and the Svalbard group of islands.
We find out that the obtained optimal sensor locations have clear physical interpretation and correspond to the boundaries between sea currents.
arXiv Detail & Related papers (2022-06-28T12:43:38Z) - First-Order Algorithms for Min-Max Optimization in Geodesic Metric Spaces [93.35384756718868]
Min-max algorithms have been analyzed in the Euclidean setting.
We prove that the Riemannian corrected extragradient (RCEG) method achieves last-iterate convergence at a linear rate.
arXiv Detail & Related papers (2022-06-04T18:53:44Z) - Provably Accurate and Scalable Linear Classifiers in Hyperbolic Spaces [39.71927912296049]
We propose a unified framework for learning scalable and simple hyperbolic linear classifiers.
The gist of our approach is to focus on Poincaré ball models and formulate the classification problems using tangent space formalisms.
The excellent performance of the Poincaré second-order and strategic perceptrons shows that the proposed framework can be extended to general machine learning problems in hyperbolic spaces.
arXiv Detail & Related papers (2022-03-07T21:36:21Z) - Measuring dissimilarity with diffeomorphism invariance [94.02751799024684]
We introduce DID, a pairwise dissimilarity measure applicable to a wide range of data spaces.
We prove that DID enjoys properties which make it relevant for theoretical study and practical use.
arXiv Detail & Related papers (2022-02-11T13:51:30Z) - Highly Scalable and Provably Accurate Classification in Poincare Balls [40.82908295137667]
We establish a unified framework for learning scalable and simple hyperbolic linear classifiers with provable performance guarantees.
Our results include a new hyperbolic and second-order perceptron algorithm as well as an efficient and highly accurate convex optimization setup for hyperbolic support vector machine classifiers.
We demonstrate their performance on synthetic data sets comprising millions of points, as well as on complex real-world data sets such as single-cell RNA-seq expression measurements, CIFAR10, Fashion-MNIST, and mini-ImageNet.
arXiv Detail & Related papers (2021-09-08T16:59:39Z) - Linear Classifiers in Mixed Constant Curvature Spaces [40.82908295137667]
We address the problem of linear classification in a product space form -- a mix of Euclidean, spherical, and hyperbolic spaces.
We prove that linear classifiers in $d$-dimensional constant curvature spaces can shatter exactly $d+1$ points.
We describe a novel perceptron classification algorithm, and establish rigorous convergence results.
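The shattering claim admits a simple sanity check in the Euclidean (zero-curvature) case with $d = 2$: three points in general position admit all $2^3$ label patterns under an affine linear classifier, because the bias-augmented point matrix is invertible. The points below are hypothetical and this is not the paper's constant-curvature construction, only an illustrative sketch.

```python
import numpy as np
from itertools import product

# Euclidean sanity check of the d+1 shattering bound for d = 2 (hypothetical
# points in general position, not the paper's curved-space construction).
pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
A = np.hstack([pts, np.ones((3, 1))])    # augment each point with a bias term

shattered = True
for labels in product([-1.0, 1.0], repeat=3):
    y = np.array(labels)
    w = np.linalg.solve(A, y)            # exact interpolation: A is invertible
    shattered = shattered and bool(np.all(np.sign(A @ w) == y))

# All 8 labelings are realized, i.e. the 3 points are shattered.
assert shattered
```

The paper's contribution is that the same bound $d+1$ holds exactly in spherical and hyperbolic spaces of dimension $d$, not just in the Euclidean case checked here.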
arXiv Detail & Related papers (2021-02-19T23:29:03Z) - Joint and Progressive Subspace Analysis (JPSA) with Spatial-Spectral Manifold Alignment for Semi-Supervised Hyperspectral Dimensionality Reduction [48.73525876467408]
We propose a novel technique for hyperspectral subspace analysis.
The technique is called joint and progressive subspace analysis (JPSA).
Experiments are conducted to demonstrate the superiority and effectiveness of the proposed JPSA on two widely-used hyperspectral datasets.
arXiv Detail & Related papers (2020-09-21T16:29:59Z) - Is an Affine Constraint Needed for Affine Subspace Clustering? [27.00532615975731]
In computer vision applications, the subspaces are linear and subspace clustering methods can be applied directly.
In motion segmentation, the subspaces are affine and an additional affine constraint on the coefficients is often enforced.
This paper shows, both theoretically and empirically, that when the dimension of the ambient space is high relative to the sum of the dimensions of the affine subspaces, the affine constraint has a negligible effect on clustering performance.
arXiv Detail & Related papers (2020-05-08T07:52:17Z) - Learnable Subspace Clustering [76.2352740039615]
We develop a learnable subspace clustering paradigm to efficiently solve the large-scale subspace clustering problem.
The key idea is to learn a parametric function to partition the high-dimensional subspaces into their underlying low-dimensional subspaces.
To the best of our knowledge, this paper is the first work to efficiently cluster millions of data points among the subspace clustering methods.
arXiv Detail & Related papers (2020-04-09T12:53:28Z) - Ellipsoidal Subspace Support Vector Data Description [98.67884574313292]
We propose a novel method for transforming data into a low-dimensional space optimized for one-class classification.
We provide both linear and non-linear formulations for the proposed method.
The proposed method is noticed to converge much faster than recently proposed Subspace Support Vector Data Description.
arXiv Detail & Related papers (2020-03-20T21:31:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.