Hardness of Learning Neural Networks with Natural Weights
- URL: http://arxiv.org/abs/2006.03177v2
- Date: Tue, 13 Oct 2020 23:18:45 GMT
- Title: Hardness of Learning Neural Networks with Natural Weights
- Authors: Amit Daniely and Gal Vardi
- Abstract summary: We show that for depth-$2$ networks, and many "natural" weight distributions such as the normal and the uniform distribution, most networks are hard to learn.
Namely, there is no efficient learning algorithm that is provably successful for most weights, and every input distribution.
- Score: 36.32177840361928
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural networks are nowadays highly successful despite strong hardness
results. The existing hardness results focus on the network architecture, and
assume that the network's weights are arbitrary. A natural approach to settle
the discrepancy is to assume that the network's weights are "well-behaved" and
possess some generic properties that may allow efficient learning. This approach
is supported by the intuition that the weights in real-world networks are not
arbitrary, but exhibit some "random-like" properties with respect to some
"natural" distributions. We prove negative results in this regard, and show
that for depth-$2$ networks, and many "natural" weight distributions such as
the normal and the uniform distribution, most networks are hard to learn.
Namely, there is no efficient learning algorithm that is provably successful
for most weights, and every input distribution. This implies that there is no
generic property that holds with high probability in such random networks and
allows efficient learning.
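As a rough illustration of the learning target this hardness result concerns, the sketch below samples a depth-$2$ (one-hidden-layer) network whose weights are drawn i.i.d. from the normal or the uniform distribution and evaluates it on an input. This is an assumption-laden sketch, not code from the paper: the ReLU activation, the function names, and the dimensions are chosen only for illustration.

```python
# Illustrative sketch (assumptions, not the paper's construction): a depth-2
# network with weights drawn i.i.d. from a "natural" distribution such as the
# standard normal or the uniform distribution. The paper shows that, for most
# such draws, no efficient algorithm can learn the resulting function under
# every input distribution.
import numpy as np

rng = np.random.default_rng(0)

def sample_depth2_network(d, k, dist="normal"):
    """Draw hidden-layer weights W (k x d) and output weights v (k,) i.i.d."""
    if dist == "normal":
        W = rng.standard_normal((k, d))
        v = rng.standard_normal(k)
    elif dist == "uniform":
        W = rng.uniform(-1.0, 1.0, size=(k, d))
        v = rng.uniform(-1.0, 1.0, size=k)
    else:
        raise ValueError(f"unknown distribution: {dist}")
    return W, v

def forward(x, W, v):
    """Depth-2 network with ReLU activation (assumed here): x -> v . relu(W x)."""
    return v @ np.maximum(W @ x, 0.0)

# Example: a random target on 100-dimensional inputs with 50 hidden units.
W, v = sample_depth2_network(d=100, k=50, dist="normal")
x = rng.standard_normal(100)
print(forward(x, W, v))
```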
Related papers
- Coding schemes in neural networks learning classification tasks [52.22978725954347]
We investigate fully-connected, wide neural networks learning classification tasks.
We show that the networks acquire strong, data-dependent features.
Surprisingly, the nature of the internal representations depends crucially on the neuronal nonlinearity.
arXiv Detail & Related papers (2024-06-24T14:50:05Z)
- Neural Redshift: Random Networks are not Random Functions [28.357640341268745]
We show that NNs do not have an inherent "simplicity bias".
Alternative architectures can be built with a bias for any level of complexity.
It points to promising avenues for controlling the solutions implemented by trained models.
arXiv Detail & Related papers (2024-03-04T17:33:20Z) - Beyond IID weights: sparse and low-rank deep Neural Networks are also Gaussian Processes [3.686808512438363]
We extend the proof of Matthews et al. to a larger class of initial weight distributions.
We show that fully-connected and convolutional networks with PSEUDO-IID distributions are all effectively equivalent up to their variance.
Using our results, one can identify the Edge-of-Chaos for a broader class of neural networks and tune them at criticality in order to enhance their training.
arXiv Detail & Related papers (2023-10-25T12:38:36Z) - Computational Complexity of Learning Neural Networks: Smoothness and
Degeneracy [52.40331776572531]
We show that learning depth-$3$ ReLU networks under the Gaussian input distribution is hard even in the smoothed-analysis framework.
Our results are under a well-studied assumption on the existence of local pseudorandom generators.
arXiv Detail & Related papers (2023-02-15T02:00:26Z) - You Can Have Better Graph Neural Networks by Not Training Weights at
All: Finding Untrained GNNs Tickets [105.24703398193843]
Untrained subnetworks in graph neural networks (GNNs) still remain mysterious.
We show that the found untrained subnetworks can substantially mitigate the GNN over-smoothing problem.
We also observe that such sparse untrained subnetworks have appealing performance in out-of-distribution detection and robustness to input perturbations.
arXiv Detail & Related papers (2022-11-28T14:17:36Z) - Bit-wise Training of Neural Network Weights [4.56877715768796]
We introduce an algorithm where the individual bits representing the weights of a neural network are learned.
This method allows training weights with integer values on arbitrary bit-depths and naturally uncovers sparse networks.
We show better results than standard training for fully connected networks, and performance comparable to standard training for convolutional and residual networks.
arXiv Detail & Related papers (2022-02-19T10:46:54Z) - The Unreasonable Effectiveness of Random Pruning: Return of the Most
Naive Baseline for Sparse Training [111.15069968583042]
Random pruning is arguably the most naive way to attain sparsity in neural networks, but it has been deemed uncompetitive compared with both post-training pruning and sparse training.
We empirically demonstrate that sparsely training a randomly pruned network from scratch can match the performance of its dense equivalent.
Our results strongly suggest there is larger-than-expected room for sparse training at scale, and that the benefits of sparsity may extend beyond carefully designed pruning (see the random-pruning sketch after this list).
arXiv Detail & Related papers (2022-02-05T21:19:41Z)
- How Powerful are Shallow Neural Networks with Bandlimited Random Weights? [25.102870584507244]
We investigate the expressive power of depth-2 bandlimited random neural networks.
A random net is a neural network whose hidden-layer parameters are frozen with random bandlimited weights.
arXiv Detail & Related papers (2020-08-19T13:26:12Z)
- Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks [100.14670789581811]
We train a graph convolutional network to fit the performance of sampled sub-networks.
With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates.
arXiv Detail & Related papers (2020-04-17T19:12:39Z)
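The random-pruning sketch referenced above (for "The Unreasonable Effectiveness of Random Pruning") is a minimal, assumption-laden illustration rather than that paper's code: a binary mask is drawn uniformly at random at a chosen sparsity and applied to a weight matrix, and only the surviving weights are then trained from scratch. The function name and layer sizes are invented for the example.

```python
# Minimal sketch of random pruning (illustrative assumptions, not the paper's code):
# keep a (1 - sparsity) fraction of weights chosen uniformly at random and
# train only those weights ("sparse training from scratch").
import numpy as np

rng = np.random.default_rng(0)

def random_prune_mask(shape, sparsity):
    """Return a 0/1 mask keeping a (1 - sparsity) fraction of entries, chosen uniformly at random."""
    n = int(np.prod(shape))
    keep = int(round((1.0 - sparsity) * n))
    mask = np.zeros(n, dtype=np.float32)
    mask[rng.choice(n, size=keep, replace=False)] = 1.0
    return mask.reshape(shape)

# Example: prune 90% of a 256 x 128 weight matrix before training starts.
W = rng.standard_normal((256, 128)).astype(np.float32)
mask = random_prune_mask(W.shape, sparsity=0.9)
W_sparse = W * mask  # during sparse training, gradients would also be masked
print(f"kept {int(mask.sum())} of {mask.size} weights")
```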