The committee machine: Computational to statistical gaps in learning a
two-layers neural network
- URL: http://arxiv.org/abs/1806.05451v3
- Date: Thu, 29 Feb 2024 11:10:45 GMT
- Title: The committee machine: Computational to statistical gaps in learning a
two-layers neural network
- Authors: Benjamin Aubin, Antoine Maillard, Jean Barbier, Florent Krzakala,
Nicolas Macris and Lenka Zdeborová
- Abstract summary: Heuristic tools from statistical physics have been used to locate the phase transitions and compute the optimal learning and generalization errors in the teacher-student scenario.
We introduce a version of the approximate message passing (AMP) algorithm for the committee machine that allows one to perform optimal learning in polynomial time for a large set of parameters.
We find that there are regimes in which a low generalization error is information-theoretically achievable while the AMP algorithm fails to deliver it, strongly suggesting that no efficient algorithm exists for those cases.
- Score: 29.86621613621785
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Heuristic tools from statistical physics have been used in the past to locate
the phase transitions and compute the optimal learning and generalization
errors in the teacher-student scenario in multi-layer neural networks. In this
contribution, we provide a rigorous justification of these approaches for a
two-layers neural network model called the committee machine. We also introduce
a version of the approximate message passing (AMP) algorithm for the committee
machine that allows one to perform optimal learning in polynomial time for a large
set of parameters. We find that there are regimes in which a low generalization
error is information-theoretically achievable while the AMP algorithm fails to
deliver it, strongly suggesting that no efficient algorithm exists for those
cases, and unveiling a large computational gap.
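
To make the setup concrete, here is a minimal sketch (Python/NumPy) of the committee machine teacher-student model the abstract describes, assuming sign activations and an unweighted second layer; function names such as `generate_teacher_data` are illustrative, and the AMP algorithm itself, which relies on model-specific denoising functions, is not reproduced here.

```python
import numpy as np

def committee_output(W, X):
    # Committee machine: majority vote (sign of the sum) of K
    # sign-activated hidden units, y = sign( sum_k sign(w_k . x) ).
    return np.sign(np.sign(X @ W.T).sum(axis=1))

def generate_teacher_data(n, d, K, rng=None):
    # Teacher-student scenario: a fixed teacher with i.i.d. Gaussian
    # weights labels i.i.d. Gaussian inputs; the student sees only
    # (X, y) and tries to reach a low generalization error.
    rng = np.random.default_rng(rng)
    W_teacher = rng.standard_normal((K, d))   # hidden teacher weights
    X = rng.standard_normal((n, d))           # input patterns
    y = committee_output(W_teacher, X)        # teacher labels
    return X, y, W_teacher

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, K = 200, 3
    X, y, W_star = generate_teacher_data(n=10_000, d=d, K=K, rng=0)
    print("teacher label balance:", y.mean())
    # A hypothetical noisy student, just to exercise the error estimate.
    W_student = W_star + 0.5 * rng.standard_normal((K, d))
    # Generalization error: probability of disagreeing with the teacher
    # on fresh inputs, estimated by Monte Carlo.
    X_test = rng.standard_normal((5_000, d))
    eps_g = np.mean(committee_output(W_student, X_test)
                    != committee_output(W_star, X_test))
    print(f"estimated generalization error: {eps_g:.3f}")
```

The sign of the sum of signs is a majority vote over the K hidden units; an odd K avoids ties.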
Related papers
- On the Convergence of Distributed Stochastic Bilevel Optimization
Algorithms over a Network [55.56019538079826]
Bilevel optimization has been applied to a wide variety of machine learning models.
Most existing algorithms are restricted to the single-machine setting and cannot handle distributed data.
We develop novel decentralized bilevel optimization algorithms based on a gradient tracking communication mechanism and two different gradient estimators.
arXiv Detail & Related papers (2022-06-30T05:29:52Z) - Scalable computation of prediction intervals for neural networks via
matrix sketching [79.44177623781043]
Existing algorithms for uncertainty estimation require modifying the model architecture and training procedure.
This work proposes a new algorithm that can be applied to a given trained neural network and produces approximate prediction intervals.
arXiv Detail & Related papers (2022-05-06T13:18:31Z) - Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z) - Analytically Tractable Inference in Deep Neural Networks [0.0]
The Tractable Approximate Gaussian Inference (TAGI) algorithm was shown to be a viable and scalable alternative to backpropagation for shallow fully-connected neural networks.
We demonstrate that TAGI matches or exceeds the performance of backpropagation for training classic deep neural network architectures.
arXiv Detail & Related papers (2021-03-09T14:51:34Z) - Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem, which estimates integer signals from linear models.
The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning.
We propose a general hyper-accelerated tree search (HATS) algorithm that employs a deep neural network to estimate the optimal heuristic for the underlying simplified memory-bounded A* algorithm (a generic sketch of the tree-search problem appears after this list).
arXiv Detail & Related papers (2021-01-07T08:00:02Z) - Iterative Algorithm Induced Deep-Unfolding Neural Networks: Precoding
Design for Multiuser MIMO Systems [59.804810122136345]
We propose a framework for deep-unfolding, where a general form of iterative algorithm induced deep-unfolding neural network (IAIDNN) is developed.
An efficient IAIDNN based on the structure of the classic weighted minimum mean-square error (WMMSE) iterative algorithm is developed.
We show that the proposed IAIDNN efficiently achieves the performance of the iterative WMMSE algorithm with reduced computational complexity.
arXiv Detail & Related papers (2020-06-15T02:57:57Z) - Exponentially improved detection and correction of errors in
experimental systems using neural networks [0.0]
We introduce the use of two machine learning algorithms to create an empirical model of an experimental apparatus.
This is able to reduce the number of measurements necessary for generic optimisation tasks exponentially.
We demonstrate both algorithms on the example of detecting and compensating for stray electric fields in an ion trap.
arXiv Detail & Related papers (2020-05-18T22:42:11Z) - Communication-Efficient Distributed Stochastic AUC Maximization with
Deep Neural Networks [50.42141893913188]
We study distributed algorithms for large-scale AUC maximization with a deep neural network as the predictive model.
Our algorithm requires far fewer communication rounds and still achieves a linear speedup in theory.
Our experiments on several datasets show the effectiveness of our algorithm and also confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)