Inference in Multi-Layer Networks with Matrix-Valued Unknowns
- URL: http://arxiv.org/abs/2001.09396v1
- Date: Sun, 26 Jan 2020 04:00:24 GMT
- Title: Inference in Multi-Layer Networks with Matrix-Valued Unknowns
- Authors: Parthe Pandit, Mojtaba Sahraee-Ardakan, Sundeep Rangan, Philip
Schniter, Alyson K. Fletcher
- Abstract summary: We consider the problem of inferring the input and hidden variables of a multi-layer neural network from an observation of the output.
A unified approximation algorithm for both MAP and MMSE inference is proposed.
It is shown that the performance of the proposed Multi-Layer Matrix VAMP (ML-Mat-VAMP) algorithm can be exactly predicted in a certain random large-system limit.
- Score: 32.635971570510755
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of inferring the input and hidden variables of a
stochastic multi-layer neural network from an observation of the output. The
hidden variables in each layer are represented as matrices. This problem
applies to signal recovery via deep generative prior models, multi-task and
mixed regression and learning certain classes of two-layer neural networks. A
unified approximation algorithm for both MAP and MMSE inference is proposed by
extending a recently-developed Multi-Layer Vector Approximate Message Passing
(ML-VAMP) algorithm to handle matrix-valued unknowns. It is shown that the
performance of the proposed Multi-Layer Matrix VAMP (ML-Mat-VAMP) algorithm can
be exactly predicted in a certain random large-system limit, where the
dimensions $N\times d$ of the unknown quantities grow as $N\rightarrow\infty$
with $d$ fixed. In the two-layer neural-network learning problem, this scaling
corresponds to the case where the number of input features and training samples
grow to infinity but the number of hidden nodes stays fixed. The analysis
enables a precise prediction of the parameter and test error of the learning.
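To make the problem setup concrete, the following sketch simulates the kind of stochastic multi-layer network with matrix-valued unknowns described in the abstract: a matrix-valued input is pushed through alternating linear maps and componentwise nonlinearities, and a noisy output is observed. The layer sizes, the ReLU nonlinearity, and the Gaussian noise level are illustrative assumptions, not the paper's exact setup, and the sketch only sets up the inference problem; it does not implement the ML-Mat-VAMP algorithm itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Each unknown is an N x d matrix; the paper's large-system limit
# takes N -> infinity with d held fixed.
N, d = 500, 3
hidden_rows = [500, 500]   # per-layer row dimensions (assumed)
noise_std = 0.1            # observation noise level (assumed)

# Matrix-valued input to be inferred.
Z0 = rng.standard_normal((N, d))

# Forward pass through a stochastic multi-layer network:
# alternating known linear maps and componentwise nonlinearities.
Z = Z0
for n_out in hidden_rows:
    W = rng.standard_normal((n_out, Z.shape[0])) / np.sqrt(Z.shape[0])
    Z = np.maximum(W @ Z, 0.0)   # ReLU nonlinearity (assumed)

# Noisy observation of the network output.
Y = Z + noise_std * rng.standard_normal(Z.shape)

# Inference task considered in the paper: given Y, the linear maps, and the
# nonlinearities, estimate Z0 and the hidden matrices via MAP or MMSE.
print("observation shape:", Y.shape, "unknown input shape:", Z0.shape)
```

In the two-layer learning interpretation from the abstract, N plays the role of the number of input features and training samples, which grow large, while d corresponds to the number of hidden nodes, which stays fixed.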
Related papers
- A Nonoverlapping Domain Decomposition Method for Extreme Learning Machines: Elliptic Problems [0.0]
Extreme learning machine (ELM) is a methodology for solving partial differential equations (PDEs) using a single hidden layer feed-forward neural network.
In this paper, we propose a nonoverlapping domain decomposition method (DDM) for ELMs that not only reduces the training time of ELMs, but is also suitable for parallel computation.
arXiv Detail & Related papers (2024-06-22T23:25:54Z) - Sliding down the stairs: how correlated latent variables accelerate learning with neural networks [8.107431208836426]
We show that correlations between latent variables along directions encoded in different input cumulants speed up learning from higher-order correlations.
Our results are confirmed in simulations of two-layer neural networks.
arXiv Detail & Related papers (2024-04-12T17:01:25Z) - An Efficient Algorithm for Clustered Multi-Task Compressive Sensing [60.70532293880842]
Clustered multi-task compressive sensing is a hierarchical model that solves multiple compressive sensing tasks.
The existing inference algorithm for this model is computationally expensive and does not scale well in high dimensions.
We propose a new algorithm that substantially accelerates model inference by avoiding explicit computation of the covariance matrices in the hierarchical model.
arXiv Detail & Related papers (2023-09-30T15:57:14Z) - Optimization Guarantees of Unfolded ISTA and ADMM Networks With Smooth
Soft-Thresholding [57.71603937699949]
We study optimization guarantees, i.e., achieving near-zero training loss as the number of training epochs increases.
We show that the required threshold on the number of training samples increases with the network width.
arXiv Detail & Related papers (2023-09-12T13:03:47Z) - Algorithms for Efficiently Learning Low-Rank Neural Networks [12.916132936159713]
We study algorithms for learning low-rank neural networks.
We present a provably efficient algorithm which learns an optimal low-rank approximation to a single-hidden-layer ReLU network.
We propose a novel low-rank framework for training low-rank deep networks.
arXiv Detail & Related papers (2022-02-02T01:08:29Z) - Memory-Efficient Backpropagation through Large Linear Layers [107.20037639738433]
In modern neural networks like Transformers, linear layers require significant memory to store activations during the backward pass.
This study proposes a memory reduction approach to perform backpropagation through linear layers.
arXiv Detail & Related papers (2022-01-31T13:02:41Z) - The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU network with standard Gaussian weights and uniformly distributed biases can separate two classes of data with high probability.
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
arXiv Detail & Related papers (2021-07-31T10:25:26Z) - The Rate of Convergence of Variation-Constrained Deep Neural Networks [35.393855471751756]
We show that a class of variation-constrained neural networks can achieve a near-parametric rate $n^{-1/2+\delta}$ for an arbitrarily small constant $\delta$.
The result indicates that the neural function space needed for approximating smooth functions may not be as large as what is often perceived.
arXiv Detail & Related papers (2021-06-22T21:28:00Z) - Solving Sparse Linear Inverse Problems in Communication Systems: A Deep
Learning Approach With Adaptive Depth [51.40441097625201]
We propose an end-to-end trainable deep learning architecture for sparse signal recovery problems.
The proposed method learns how many layers to execute to emit an output, and the network depth is dynamically adjusted for each task in the inference phase.
arXiv Detail & Related papers (2020-10-29T06:32:53Z) - Communication-Efficient Distributed Stochastic AUC Maximization with
Deep Neural Networks [50.42141893913188]
We study distributed stochastic AUC maximization with deep neural networks for large-scale data.
Our method requires far fewer communication rounds in practice and comes with a theoretical bound on the number of rounds.
Experiments on several datasets demonstrate the method's effectiveness and corroborate the theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z) - Multiple Angles of Arrival Estimation using Neural Networks [2.233624388203003]
We propose a neural network that estimates azimuth and elevation angles from the correlation matrix extracted from the received data.
Results show that the network achieves accurate estimates at low SNR and can handle multiple signals.
arXiv Detail & Related papers (2020-02-03T02:37:43Z)