Towards Flexible Sparsity-Aware Modeling: Automatic Tensor Rank Learning
Using The Generalized Hyperbolic Prior
- URL: http://arxiv.org/abs/2009.02472v2
- Date: Tue, 29 Mar 2022 07:22:08 GMT
- Title: Towards Flexible Sparsity-Aware Modeling: Automatic Tensor Rank Learning
Using The Generalized Hyperbolic Prior
- Authors: Lei Cheng, Zhongtao Chen, Qingjiang Shi, Yik-Chung Wu, and Sergios
Theodoridis
- Abstract summary: Tensor rank learning for canonical polyadic decomposition (CPD) has long been deemed an essential yet challenging problem.
The optimal determination of a tensor rank is known to be a non-deterministic polynomial-time hard (NP-hard) task.
In this paper, we introduce a more advanced generalized hyperbolic (GH) prior into the probabilistic CPD model, which is more flexible in adapting to different levels of sparsity.
- Score: 24.848237413017937
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tensor rank learning for canonical polyadic decomposition (CPD) has long been
deemed an essential yet challenging problem. In particular, since the tensor
rank controls the complexity of the CPD model, its inaccurate learning would
cause overfitting to noise or underfitting to the signal sources, and even
destroy the interpretability of model parameters. However, the optimal
determination of a tensor rank is known to be a non-deterministic
polynomial-time hard (NP-hard) task. Rather than exhaustively searching for the
best tensor rank via trial-and-error experiments, Bayesian inference under the
Gaussian-gamma prior was introduced in the context of probabilistic CPD
modeling, and it was shown to be an effective strategy for automatic tensor
rank determination. This triggered flourishing research on other structured
tensor CPDs with automatic tensor rank learning. On the other side of the coin,
these research works also reveal that the Gaussian-gamma model does not perform
well for high-rank tensors and/or low signal-to-noise ratios (SNRs). To
overcome these drawbacks, in this paper, we introduce a more advanced
generalized hyperbolic (GH) prior into the probabilistic CPD model, which not
only includes the Gaussian-gamma model as a special case but is also more
flexible in adapting to different levels of sparsity. Based on this novel
probabilistic model, an algorithm is developed under the framework of
variational inference, where each update is obtained in closed form.
Extensive numerical results, using synthetic data and real-world datasets,
demonstrate the significantly improved performance of the proposed method in
learning both low and high tensor ranks, even in low-SNR cases.
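The extra flexibility of the GH prior comes from the fact that a GH distribution can be written as a Gaussian scale mixture whose mixing density is a generalized inverse Gaussian (GIG), whereas the Gaussian-gamma (ARD-type) construction corresponds to a more restrictive mixing choice. Below is a minimal illustrative sketch of this scale-mixture view, not the authors' inference algorithm; the parameter names (lam, chi, psi) and the specific values are assumptions chosen only to contrast a heavy-tailed and a near-Gaussian setting.

```python
# Hedged sketch: sample from a zero-mean symmetric GH prior via its
# Gaussian scale-mixture form, z ~ GIG(lam, chi, psi), x | z ~ N(0, z).
# Parameter names and values are illustrative, not taken from the paper.
import numpy as np
from scipy.stats import geninvgauss

def sample_symmetric_gh(lam, chi, psi, size, seed=None):
    rng = np.random.default_rng(seed)
    # scipy's geninvgauss(p, b) has density ~ x**(p-1) * exp(-b*(x + 1/x)/2);
    # GIG(lam, chi, psi) maps to p=lam, b=sqrt(chi*psi), scale=sqrt(chi/psi).
    z = geninvgauss.rvs(lam, np.sqrt(chi * psi), scale=np.sqrt(chi / psi),
                        size=size, random_state=rng)
    return rng.normal(0.0, np.sqrt(z))

# Heavier-tailed vs. near-Gaussian settings of the same prior family.
x_heavy = sample_symmetric_gh(lam=-0.5, chi=0.1, psi=1.0, size=200_000, seed=0)
x_light = sample_symmetric_gh(lam=5.0, chi=10.0, psi=10.0, size=200_000, seed=0)
for name, x in [("heavy-tailed GH", x_heavy), ("near-Gaussian GH", x_light)]:
    kurt = np.mean((x - x.mean()) ** 4) / np.var(x) ** 2 - 3.0
    print(f"{name}: excess kurtosis ~ {kurt:.2f}")
```

In sparsity-aware CPD modeling, such a prior would be placed on the columns of the factor matrices, so that columns whose scales collapse toward zero are effectively pruned and the surviving columns determine the learned rank.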
Related papers
- Model-Based Reparameterization Policy Gradient Methods: Theory and
Practical Algorithms [88.74308282658133]
Reparameterization (RP) Policy Gradient Methods (PGMs) have been widely adopted for continuous control tasks in robotics and computer graphics.
Recent studies have revealed that, when applied to long-term reinforcement learning problems, model-based RP PGMs may experience chaotic and non-smooth optimization landscapes.
We propose a spectral normalization method to mitigate the exploding variance issue caused by long model unrolls.
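As a rough, generic illustration of the spectral-normalization idea mentioned above (not the paper's estimator, architecture, or training setup), dividing a learned transition matrix by an estimate of its largest singular value keeps its spectral norm at one, so a long unroll of the model cannot amplify the state without bound:

```python
# Hedged sketch: power iteration estimates the top singular value of a toy
# transition matrix; rescaling by it prevents blow-up over long unrolls.
import numpy as np

def spectral_normalize(W, n_power_iter=50, seed=0):
    u = np.random.default_rng(seed).normal(size=W.shape[0])
    for _ in range(n_power_iter):
        v = W.T @ u
        v /= np.linalg.norm(v)
        u = W @ v
        u /= np.linalg.norm(u)
    sigma = u @ W @ v  # estimate of the largest singular value
    return W / sigma

rng = np.random.default_rng(1)
W = 0.5 * rng.normal(size=(32, 32))      # a toy learned transition matrix
x = rng.normal(size=32)
for name, A in [("raw", W), ("spectrally normalized", spectral_normalize(W))]:
    h = x.copy()
    for _ in range(100):                  # a 100-step model unroll
        h = A @ h
    print(f"{name}: state norm after 100 steps = {np.linalg.norm(h):.3e}")
```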
arXiv Detail & Related papers (2023-10-30T18:43:21Z)
- Parallel and Limited Data Voice Conversion Using Stochastic Variational Deep Kernel Learning [2.5782420501870296]
This paper proposes a voice conversion method that works with limited data.
It is based on stochastic variational deep kernel learning (SVDKL).
This makes it possible to estimate non-smooth and more complex functions.
arXiv Detail & Related papers (2023-09-08T16:32:47Z)
- Statistical and computational rates in high rank tensor estimation [11.193504036335503]
Higher-order tensor datasets arise commonly in recommendation systems, neuroimaging, and social networks.
We consider a generative latent variable tensor model that incorporates both high rank and low rank models.
We show that the statistical-computational gap emerges only for latent variable tensors of order 3 or higher.
arXiv Detail & Related papers (2023-04-08T15:34:26Z)
- Truncated tensor Schatten p-norm based approach for spatiotemporal traffic data imputation with complicated missing patterns [77.34726150561087]
We introduce four complicated missing patterns, including element-wise missing and three fiber-like missing cases according to the mode-driven fibers.
Despite the nonconvexity of the objective function in our model, we derive the optimal solutions by integrating the alternating direction method of multipliers (ADMM).
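The core proximal step behind Schatten-norm-based imputation is singular-value shrinkage of an unfolded tensor; in the truncated variant the largest singular values are kept intact and only the tail is shrunk. The snippet below is a hedged, matrix-only illustration of that single step, not the paper's ADMM algorithm; the sizes, rank, noise level, and threshold are made up.

```python
# Hedged sketch of truncated singular-value thresholding on a noisy low-rank
# matrix (a stand-in for one unfolding of a traffic tensor).
import numpy as np

def truncated_svt(M, tau, r_keep):
    """Soft-threshold all but the top r_keep singular values of M by tau."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_shrunk = s.copy()
    s_shrunk[r_keep:] = np.maximum(s[r_keep:] - tau, 0.0)
    return (U * s_shrunk) @ Vt

rng = np.random.default_rng(0)
signal = rng.normal(size=(60, 5)) @ rng.normal(size=(5, 80))   # rank-5 signal
noisy = signal + 0.5 * rng.normal(size=signal.shape)
denoised = truncated_svt(noisy, tau=5.0, r_keep=5)
print("error before:", np.linalg.norm(noisy - signal))
print("error after :", np.linalg.norm(denoised - signal))
```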
arXiv Detail & Related papers (2022-05-19T08:37:56Z)
- Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation, so it can be integrated with neural networks seamlessly.
arXiv Detail & Related papers (2021-12-07T11:26:41Z)
- Scaling and Scalability: Provable Nonconvex Low-Rank Tensor Estimation from Incomplete Measurements [30.395874385570007]
A fundamental task is to faithfully recover tensors from highly incomplete measurements.
We develop an algorithm to directly recover the tensor factors in the Tucker decomposition.
We show that it provably converges at a linear rate independent of the condition number of the ground truth tensor for two canonical problems.
arXiv Detail & Related papers (2021-04-29T17:44:49Z)
- A Distributed Optimisation Framework Combining Natural Gradient with Hessian-Free for Discriminative Sequence Training [16.83036203524611]
This paper presents a novel natural gradient and Hessian-free (NGHF) optimisation framework for neural network training.
It relies on the linear conjugate gradient (CG) algorithm to combine the natural gradient (NG) method with local curvature information from Hessian-free (HF) or other second-order methods.
Experiments are reported on the multi-genre broadcast data set for a range of different acoustic model types.
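As a hedged, generic illustration of the Hessian-free ingredient referenced above (not the NGHF recipe itself): linear conjugate gradient can solve a damped curvature system (B + λI)d = -g using only matrix-vector products with B, so the curvature matrix never has to be formed explicitly. The toy curvature matrix, damping value, and dimensions below are assumptions.

```python
# Hedged sketch: conjugate gradient with a matrix-free curvature product.
import numpy as np

def conjugate_gradient(matvec, b, max_iter=100, tol=1e-8):
    """Solve A x = b given only a function computing A @ v (A assumed SPD)."""
    x = np.zeros_like(b)
    r = b - matvec(x)
    p = r.copy()
    rs_old = r @ r
    for _ in range(max_iter):
        Ap = matvec(p)
        alpha = rs_old / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

rng = np.random.default_rng(0)
M = rng.normal(size=(50, 50))
B = M @ M.T + 1e-2 * np.eye(50)    # a toy positive-definite curvature matrix
g = rng.normal(size=50)            # a toy gradient at the current parameters
damping = 0.1
step = conjugate_gradient(lambda v: B @ v + damping * v, -g)
print("residual norm:", np.linalg.norm(B @ step + damping * step + g))
```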
arXiv Detail & Related papers (2021-03-12T22:18:34Z)
- Provable Model-based Nonlinear Bandit and Reinforcement Learning: Shelve Optimism, Embrace Virtual Curvature [61.22680308681648]
We show that global convergence is statistically intractable even for a one-layer neural net bandit with a deterministic reward.
For both nonlinear bandit and RL, the paper presents a model-based algorithm, Virtual Ascent with Online Model Learner (ViOL).
arXiv Detail & Related papers (2021-02-08T12:41:56Z)
- Multiplicative noise and heavy tails in stochastic optimization [62.993432503309485]
Stochastic optimization is central to modern machine learning, but the precise role of the stochasticity in its success is still unclear.
We show that heavy-tailed behavior commonly arises in the parameters due to multiplicative noise in the updates.
A detailed analysis is conducted describing how key factors, including step size and data properties, shape this behavior, with similar results exhibited by state-of-the-art neural network models.
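A hedged toy simulation of the mechanism referred to above (not the paper's analysis or models): an SGD-like linear recursion with a random multiplicative factor, w <- a_t * w + b_t, is a Kesten-type process whose stationary distribution can be heavy-tailed even though a_t and b_t are themselves light-tailed. The step size, noise distributions, and tail diagnostic are illustrative choices.

```python
# Hedged sketch: multiplicative noise in a linear SGD-like recursion
# produces heavy-tailed iterates.
import numpy as np

rng = np.random.default_rng(0)
n_chains, n_steps, eta = 50_000, 1_000, 0.9
w = np.zeros(n_chains)
for _ in range(n_steps):
    h = rng.exponential(1.0, size=n_chains)   # toy random per-step curvature
    b = rng.normal(0.0, 0.1, size=n_chains)   # toy additive gradient noise
    w = (1.0 - eta * h) * w + b               # multiplicative factor 1 - eta*h

abs_w = np.abs(w)
ratio = np.percentile(abs_w, 99.9) / np.median(abs_w)
# For a Gaussian, the 99.9th-percentile-to-median ratio of |w| is about 4.9;
# a much larger ratio indicates heavy tails.
print("99.9th percentile / median of |w|:", round(ratio, 1))
```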
arXiv Detail & Related papers (2020-06-11T09:58:01Z)
- Stochasticity in Neural ODEs: An Empirical Study [68.8204255655161]
Regularization of neural networks (e.g. dropout) is a widespread technique in deep learning that allows for better generalization.
We show that data augmentation during training improves the performance of both the deterministic and stochastic versions of the same model.
However, the improvements obtained by data augmentation alone completely eliminate the empirical gains from stochastic regularization, making the difference in performance between neural ODE and neural SDE negligible.
arXiv Detail & Related papers (2020-02-22T22:12:56Z)