Parallelized Computation and Backpropagation Under Angle-Parametrized
Orthogonal Matrices
- URL: http://arxiv.org/abs/2106.00003v1
- Date: Sun, 30 May 2021 00:47:03 GMT
- Title: Parallelized Computation and Backpropagation Under Angle-Parametrized
Orthogonal Matrices
- Authors: Firas Hamze
- Abstract summary: We show how an apparently sequential elementary rotation parametrization can be restructured into blocks of commutative operations.
We discuss parametric restrictions of interest to generative modeling and present promising performance results with a prototype GPU implementation.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a methodology for parallel acceleration of learning in the
presence of matrix orthogonality and unitarity constraints of interest in
several branches of machine learning. We show how an apparently sequential
elementary rotation parametrization can be restructured into blocks of
commutative operations using a well-known tool for coloring the edges of
complete graphs, in turn widely applied to schedule round-robin
(all-against-all) sports tournaments. The resulting decomposition admits an
algorithm to compute a fully-parametrized orthogonal matrix from its rotation
parameters in $O(n)$ sequential steps and one to compute the gradient of a
training loss with respect to its parameters in $O(n\log n)$ steps. We discuss
parametric restrictions of interest to generative modeling and present
promising performance results with a prototype GPU implementation.
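To make the scheduling idea concrete, the following is a minimal NumPy sketch of the commuting-block construction (an illustration under stated assumptions, not the paper's GPU prototype: the function names are ours and $n$ is assumed even). The circle method used to schedule round-robin tournaments partitions the $n(n-1)/2$ coordinate pairs into $n-1$ rounds of $n/2$ disjoint pairs; Givens rotations acting on disjoint planes commute, so each round can be applied as one parallel batch.

```python
import numpy as np

def round_robin_rounds(n):
    """Circle-method schedule: n - 1 rounds, each splitting the n indices
    (n even) into n // 2 disjoint pairs. Rotations within a round act on
    disjoint coordinate planes and therefore commute."""
    idx = list(range(n))
    rounds = []
    for _ in range(n - 1):
        rounds.append([(idx[i], idx[n - 1 - i]) for i in range(n // 2)])
        idx = [idx[0]] + [idx[-1]] + idx[1:-1]  # fix one index, rotate the rest
    return rounds

def orthogonal_from_angles(theta, n):
    """Build an n x n orthogonal matrix from n * (n - 1) // 2 rotation
    angles, one Givens rotation per coordinate pair, applied round by
    round. Each round touches every row exactly once, so its n // 2
    rotations form one parallelizable batch."""
    Q = np.eye(n)
    k = 0
    for pairs in round_robin_rounds(n):
        for i, j in pairs:  # independent updates; batchable on a GPU
            c, s = np.cos(theta[k]), np.sin(theta[k])
            qi, qj = Q[i].copy(), Q[j].copy()
            Q[i], Q[j] = c * qi - s * qj, s * qi + c * qj
            k += 1
    return Q
```

Only the outer loop over the $n-1$ rounds is inherently sequential, which is the source of the $O(n)$ sequential depth quoted in the abstract; a GPU implementation can apply each round as one batched two-row update.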
Related papers
- Learning Unorthogonalized Matrices for Rotation Estimation [83.94986875750455]
Estimating 3D rotations is a common procedure for 3D computer vision.
One form of representation -- rotation matrices -- is popular due to its continuity.
We propose unorthogonalized 'pseudo' rotation matrices (PRoM).
arXiv Detail & Related papers (2023-12-01T09:56:29Z)
- Givens Coordinate Descent Methods for Rotation Matrix Learning in Trainable Embedding Indexes [19.716527782586788]
We propose a family of block Givens coordinate descent algorithms to learn rotation matrices.
Compared to the state-of-the-art SVD method, the Givens algorithms are much more parallelizable (a toy coordinate-descent step is sketched after this list).
arXiv Detail & Related papers (2022-03-09T22:58:56Z)
- High-Dimensional Sparse Bayesian Learning without Covariance Matrices [66.60078365202867]
We introduce a new inference scheme that avoids explicit construction of the covariance matrix.
Our approach couples a little-known diagonal estimation result from numerical linear algebra with the conjugate gradient algorithm.
On several simulations, our method scales better than existing approaches in computation time and memory.
arXiv Detail & Related papers (2022-02-25T16:35:26Z)
- Unfolding Projection-free SDP Relaxation of Binary Graph Classifier via GDPA Linearization [59.87663954467815]
Algorithm unfolding creates an interpretable and parsimonious neural network architecture by implementing each iteration of a model-based algorithm as a neural layer.
In this paper, leveraging a recent linear algebraic theorem called Gershgorin disc perfect alignment (GDPA), we unroll a projection-free algorithm for semi-definite programming relaxation (SDR) of a binary graph classifier.
Experimental results show that our unrolled network outperformed pure model-based graph classifiers and achieved comparable performance to pure data-driven networks while using far fewer parameters.
arXiv Detail & Related papers (2021-09-10T07:01:15Z)
- Random Matrix Based Extended Target Tracking with Orientation: A New Model and Inference [0.0]
We propose a novel extended target tracking algorithm which is capable of representing the extent of dynamic objects as an ellipsoid with a time-varying orientation angle.
A diagonal positive semi-definite matrix is defined to model objects' extent within the random matrix framework.
It is not possible to find a closed-form analytical expression for the true posterior because of the absence of conjugacy.
arXiv Detail & Related papers (2020-10-17T16:33:06Z)
- Solution Path Algorithm for Twin Multi-class Support Vector Machine [6.97711662470035]
The paper is devoted to the fast regularization parameter tuning algorithm for the twin multi-class support vector machine.
A new sample dataset division method is adopted and the Lagrangian multipliers are proved to be piecewise linear.
The proposed method can achieve good classification performance while reducing the computational cost of the grid-search method from an exponential level to a constant level.
arXiv Detail & Related papers (2020-05-30T14:05:46Z)
- CWY Parametrization: a Solution for Parallelized Optimization of Orthogonal and Stiefel Matrices [41.57234424773276]
We introduce an efficient approach for optimization over orthogonal groups on highly parallel computation units such as GPUs or TPUs.
We further develop a novel Truncated CWY (or T-CWY) approach for Stiefel manifold parametrization.
We apply our methods to train recurrent neural network architectures in the tasks of neural machine translation and video prediction.
arXiv Detail & Related papers (2020-04-18T17:58:43Z)
- Stochastic Flows and Geometric Optimization on the Orthogonal Group [52.50121190744979]
We present a new class of geometrically-driven optimization algorithms on the orthogonal group $O(d)$.
We show that our methods can be applied in various fields of machine learning including deep, convolutional and recurrent neural networks, reinforcement learning, flows and metric learning.
arXiv Detail & Related papers (2020-03-30T15:37:50Z)
- Accelerating Feedforward Computation via Parallel Nonlinear Equation Solving [106.63673243937492]
Feedforward computation, such as evaluating a neural network or sampling from an autoregressive model, is ubiquitous in machine learning.
We frame the task of feedforward computation as solving a system of nonlinear equations. We then propose to find the solution using a Jacobi or Gauss-Seidel fixed-point method, as well as hybrid methods combining the two (a minimal Jacobi sketch appears after this list).
Our method is guaranteed to give exactly the same values as the original feedforward computation with a reduced (or equal) number of parallelizable iterations, and hence reduced time given sufficient parallel computing power.
arXiv Detail & Related papers (2020-02-10T10:11:31Z)
- Optimal Iterative Sketching with the Subsampled Randomized Hadamard Transform [64.90148466525754]
We study the performance of iterative sketching for least-squares problems.
We show that the convergence rates for Haar and randomized Hadamard matrices are identical, and asymptotically improve upon random projections.
These techniques may be applied to other algorithms that employ randomized dimension reduction.
arXiv Detail & Related papers (2020-02-03T16:17:50Z)
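As noted in the Givens coordinate descent entry above, here is a toy sketch of a single coordinate-descent step (a hypothetical illustration: the angle grid search stands in for whatever exact one-dimensional update that paper derives, and the function names are ours).

```python
import numpy as np

def givens(n, i, j, theta):
    """n x n Givens rotation acting in the (i, j) coordinate plane."""
    G = np.eye(n)
    c, s = np.cos(theta), np.sin(theta)
    G[i, i] = G[j, j] = c
    G[i, j], G[j, i] = -s, s
    return G

def coordinate_step(R, loss, i, j, grid=np.linspace(-np.pi, np.pi, 128)):
    """One block-coordinate step: pick the angle in plane (i, j) that most
    reduces loss(G @ R), holding all other coordinates fixed. The grid
    search is a stand-in for an exact one-dimensional update."""
    n = R.shape[0]
    best = min(grid, key=lambda t: loss(givens(n, i, j, t) @ R))
    return givens(n, i, j, best) @ R
```

Because updates in disjoint planes touch different rows, blocks of such steps can be scheduled in parallel, exactly as in the round-robin construction sketched under the main abstract.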
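Similarly, here is a minimal sketch of the Jacobi fixed-point idea from the feedforward-acceleration entry (again illustrative; `jacobi_feedforward` is our name, not that paper's API). A depth-$T$ pass $h_t = f_t(h_{t-1})$ is read as $T$ simultaneous equations; every sweep updates all layers in parallel from the previous sweep, and after at most $T$ sweeps the result equals the sequential pass exactly.

```python
import numpy as np

def jacobi_feedforward(fs, h0, n_sweeps=None):
    """Solve h[t + 1] = fs[t](h[t]) by Jacobi fixed-point iteration.
    Each sweep reads only the previous sweep's values, so its len(fs)
    layer updates are independent and parallelizable. After sweep k the
    entries h[0..k] already match the sequential pass, so at most
    T = len(fs) sweeps are needed; fewer may suffice in practice."""
    T = len(fs)
    n_sweeps = T if n_sweeps is None else n_sweeps
    h = [h0] * (T + 1)  # arbitrary initialization beyond h[0]
    for _ in range(n_sweeps):
        h = [h0] + [fs[t](h[t]) for t in range(T)]  # parallel across t
    return h[-1]

# Example: a 3-layer tanh network; the result matches the sequential pass.
rng = np.random.default_rng(0)
fs = [lambda x, W=rng.standard_normal((4, 4)): np.tanh(W @ x) for _ in range(3)]
out = jacobi_feedforward(fs, rng.standard_normal(4))
```

The sequential baseline is `functools.reduce(lambda x, f: f(x), fs, h0)`; the Jacobi sweep trades extra total work for parallelism across layers, matching that entry's guarantee of equal values with a reduced (or equal) number of parallelizable iterations.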
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.