Exponential Lower Bounds for Threshold Circuits of Sub-Linear Depth and
Energy
- URL: http://arxiv.org/abs/2107.00223v2
- Date: Wed, 28 Jun 2023 03:49:37 GMT
- Title: Exponential Lower Bounds for Threshold Circuits of Sub-Linear Depth and
Energy
- Authors: Kei Uchizawa and Haruki Abe
- Abstract summary: We prove that any threshold circuit $C$ of size $s$, depth $d$, energy $e$ and weight $w$ satisfies $\log (rk(M_C)) \le ed (\log s + \log w + \log n)$.
For other models of neural networks, such as discretized ReLU circuits and discretized sigmoid circuits, we prove that a similar inequality also holds for a discretized circuit $C$.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we investigate computational power of threshold circuits and
other theoretical models of neural networks in terms of the following four
complexity measures: size (the number of gates), depth, weight and energy. Here
the energy complexity of a circuit measures the sparsity of its computation, and
is defined as the maximum number of gates outputting non-zero values taken over
all the input assignments. As our main result, we prove that any threshold
circuit $C$ of size $s$, depth $d$, energy $e$ and weight $w$ satisfies $\log
(rk(M_C)) \le ed (\log s + \log w + \log n)$, where $rk(M_C)$ is the rank of
the communication matrix $M_C$ of a $2n$-variable Boolean function that $C$
computes. Thus, such a threshold circuit $C$ can compute only Boolean functions
whose communication matrices have rank bounded by a product of logarithmic
factors of $s, w$ and linear factors of $d, e$. This implies an exponential
lower bound on the size of even sublinear-depth threshold circuits if energy
and weight are sufficiently small. For other models of neural networks, such
as discretized ReLU circuits and discretized sigmoid circuits,
we prove that a similar inequality also holds for a discretized circuit $C$:
$rk(M_C) = O(ed(\log s + \log w + \log n)^3)$.
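A quick way to see what the main inequality says is to test it on a toy instance. The following is a minimal, hypothetical Python sketch (not code from the paper): it builds a small depth-2 threshold circuit on $2n$ variables, measures its energy (the maximum number of gates outputting non-zero values over all input assignments), forms the $2^n \times 2^n$ communication matrix $M_C$, and compares $\log(rk(M_C))$ with $ed(\log s + \log w + \log n)$. The particular circuit, the base-2 logarithms, and the reading of the weight $w$ as the maximum absolute gate weight are assumptions made only for illustration.

```python
# Toy check of the bound log(rk(M_C)) <= e*d*(log s + log w + log n).
# The circuit below is a hypothetical example, not one from the paper.
import itertools
import numpy as np

n = 3                    # Alice holds x in {0,1}^n, Bob holds y in {0,1}^n
s, d, w = n + 1, 2, 1    # size (gate count), depth, max absolute gate weight

def gate(weights, threshold, inputs):
    # A threshold gate outputs 1 iff its weighted input sum reaches the threshold.
    return int(np.dot(weights, inputs) >= threshold)

def run_circuit(x, y):
    # Layer 1: gate i fires iff x_i = y_i = 1 (weights (1, 1), threshold 2).
    layer1 = [gate([1, 1], 2, [x[i], y[i]]) for i in range(n)]
    # Output layer: fires iff some layer-1 gate fired, i.e. x and y intersect.
    out = gate([1] * n, 1, layer1)
    return out, sum(layer1) + out   # (output bit, #gates outputting non-zero)

inputs = list(itertools.product([0, 1], repeat=n))
M = np.zeros((2 ** n, 2 ** n))      # communication matrix M_C
energy = 0                          # e = max #firing gates over all inputs
for a, x in enumerate(inputs):
    for b, y in enumerate(inputs):
        M[a, b], fired = run_circuit(x, y)
        energy = max(energy, fired)

lhs = np.log2(np.linalg.matrix_rank(M))
rhs = energy * d * (np.log2(s) + np.log2(w) + np.log2(n))
print(f"e = {energy}, log rk(M_C) = {lhs:.2f} <= e*d*(log s + log w + log n) = {rhs:.2f}")
```

For this toy circuit the left-hand side is far below the bound, as the theorem requires; the regime of interest in the paper is when $e$ and $d$ are both small relative to $n$, so that computing a high-rank function forces $s$ to be exponentially large.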
Related papers
- Low-degree approximation of QAC$^0$ circuits [0.0]
We show that the parity function cannot be computed in QAC$^0$.
We also show that any QAC circuit of depth $d$ that approximately computes parity on $n$ bits requires size $2^{\widetilde{\Omega}(n^{1/d})}$.
arXiv Detail & Related papers (2024-11-01T19:04:13Z) - The Communication Complexity of Approximating Matrix Rank [50.6867896228563]
We show that this problem has randomized communication complexity $\Omega(\frac{1}{k}\cdot n^2\log|\mathbb{F}|)$.
As an application, we obtain an $\Omega(\frac{1}{k}\cdot n^2\log|\mathbb{F}|)$ space lower bound for any streaming algorithm with $k$ passes.
arXiv Detail & Related papers (2024-10-26T06:21:42Z) - On the Computational Power of QAC0 with Barely Superlinear Ancillae [10.737102385599169]
We show that any depth-$d$ $\mathrm{QAC}^0$ circuit requires $n^{1+3^{-d}}$ ancillae to compute a function with approximate degree $\Theta(n)$.
This is the first superlinear lower bound for super-linearly sized $\mathrm{QAC}^0$ circuits.
arXiv Detail & Related papers (2024-10-09T02:55:57Z) - Unconditionally separating noisy $\mathsf{QNC}^0$ from bounded polynomial threshold circuits of constant depth [8.66267734067296]
We study classes of constant-depth circuits with gates computing restricted threshold functions.
For large enough values of $k$, $\mathsf{bPTFC}^0[k]$ contains $\mathsf{TC}^0[k]$.
arXiv Detail & Related papers (2024-08-29T09:40:55Z) - Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit [75.4661041626338]
We study the problem of gradient descent learning of a single-index target function $f_*(\boldsymbol{x}) = \sigma_*(\langle\boldsymbol{x},\boldsymbol{\theta}\rangle)$ under isotropic Gaussian data.
We prove that a two-layer neural network optimized by an SGD-based algorithm learns $f_*$ of arbitrary link function with a sample and runtime complexity of $n \asymp T \asymp C(q) \cdot d$, up to polylogarithmic factors.
arXiv Detail & Related papers (2024-06-03T17:56:58Z) - Detection-Recovery Gap for Planted Dense Cycles [72.4451045270967]
We consider a model where a dense cycle with expected bandwidth $n\tau$ and edge density $p$ is planted in an Erdős–Rényi graph $G(n,q)$.
We characterize the computational thresholds for the associated detection and recovery problems for the class of low-degree algorithms.
arXiv Detail & Related papers (2023-02-13T22:51:07Z) - Learning a Single Neuron with Adversarial Label Noise via Gradient
Descent [50.659479930171585]
We study a function of the form $\mathbf{x} \mapsto \sigma(\mathbf{w}\cdot\mathbf{x})$ for monotone activations.
The goal of the learner is to output a hypothesis vector $\mathbf{w}$ such that $F(\mathbf{w}) = C\cdot\mathrm{OPT} + \epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z) - A lower bound on the space overhead of fault-tolerant quantum computation [51.723084600243716]
The threshold theorem is a fundamental result in the theory of fault-tolerant quantum computation.
We prove an exponential upper bound on the maximal length of fault-tolerant quantum computation with amplitude noise.
arXiv Detail & Related papers (2022-01-31T22:19:49Z) - On the Optimal Memorization Power of ReLU Neural Networks [53.15475693468925]
We show that feedforward ReLU neural networks can memorize any $N$ points that satisfy a mild separability assumption.
We prove that having such a large bit complexity is both necessary and sufficient for memorization with a sub-linear number of parameters.
arXiv Detail & Related papers (2021-10-07T05:25:23Z) - Bounds on the QAC$^0$ Complexity of Approximating Parity [0.0]
We prove that QAC circuits of sublogarithmic depth cannot approximate parity, regardless of size.
QAC circuits require at least $\Omega(n/d)$ multi-qubit gates to achieve a $1/2 + \exp(-o(n/d))$ approximation of parity.
arXiv Detail & Related papers (2020-08-17T16:51:04Z) - Constant-Depth and Subcubic-Size Threshold Circuits for Matrix
Multiplication [1.9518237361775532]
Recent advances in large-scale neural computing hardware have made practical implementations of threshold circuits a near-term possibility.
We describe a theoretical approach for multiplying two $N$ by $N$ matrices that integrates threshold gate logic.
Dense matrix multiplication is a core operation in convolutional neural network training.
arXiv Detail & Related papers (2020-06-25T18:28:10Z)