On the Information Capacity of Nearest Neighbor Representations
- URL: http://arxiv.org/abs/2305.05808v1
- Date: Tue, 9 May 2023 23:45:16 GMT
- Title: On the Information Capacity of Nearest Neighbor Representations
- Authors: Kordag Mehmet Kilic, Jin Sima, Jehoshua Bruck
- Abstract summary: The brain has an integrated architecture where computation and memory are indistinguishable.
Motivated by the architecture of the brain, we propose a model of $\textit{associative computation}$.
- Score: 21.915057426589748
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The $\textit{von Neumann Computer Architecture}$ has a distinction between
computation and memory. In contrast, the brain has an integrated architecture
where computation and memory are indistinguishable. Motivated by the
architecture of the brain, we propose a model of $\textit{associative
computation}$ where memory is defined by a set of vectors in $\mathbb{R}^n$
(that we call $\textit{anchors}$), computation is performed by convergence from
an input vector to a nearest neighbor anchor, and the output is a label
associated with an anchor. Specifically, in this paper, we study the
representation of Boolean functions in the associative computation model, where
the inputs are binary vectors and the corresponding outputs are the labels ($0$
or $1$) of the nearest neighbor anchors. The information capacity of a Boolean
function in this model is associated with two quantities: $\textit{(i)}$ the
number of anchors (called $\textit{Nearest Neighbor (NN) Complexity}$) and
$\textit{(ii)}$ the maximal number of bits representing entries of anchors
(called $\textit{Resolution}$). We study symmetric Boolean functions and
present constructions that have optimal NN complexity and resolution.
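To make the model concrete, here is a minimal sketch (not code from the paper; the anchor values are an illustrative choice) of a nearest-neighbor representation of the 2-input AND function, matching the definitions above: two anchors give NN complexity $2$, and the entries $1$ and $0.5$ need only a few bits of resolution.

```python
import numpy as np

def nn_evaluate(x, anchors, labels):
    """Return the label of the anchor nearest to the input vector x."""
    dists = [np.linalg.norm(x - a) for a in anchors]
    return labels[int(np.argmin(dists))]

# Two anchors suffice for AND (a threshold function): one on each side of
# the separating hyperplane x1 + x2 = 1.5. These values are illustrative.
anchors = [np.array([1.0, 1.0]),   # label 1: nearest to the accepting input (1, 1)
           np.array([0.5, 0.5])]   # label 0: nearest to (0, 0), (0, 1), and (1, 0)
labels = [1, 0]

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, "->", nn_evaluate(np.array(x, dtype=float), anchors, labels))
# Prints 0, 0, 0, 1 -- the truth table of AND(x1, x2).
```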
Related papers
- Sample and Computationally Efficient Robust Learning of Gaussian Single-Index Models [37.42736399673992]
A single-index model (SIM) is a function of the form $\sigma(\mathbf{w}^\ast \cdot \mathbf{x})$, where $\sigma: \mathbb{R} \to \mathbb{R}$ is a known link function and $\mathbf{w}^\ast$ is a hidden unit vector.
We show that a proper learner attains $L_2$-error of $O(\mathrm{OPT})+\epsilon$.
arXiv Detail & Related papers (2024-11-08T17:10:38Z) - On computational complexity of unitary and state design properties [2.06242362470764]
We study unitary and state $t$-designs from a computational complexity theory perspective.
We provide a quantum algorithm for computing the frame potential and show that (1) exact computation can be achieved.
Our results illustrate the computationally hard nature of unitary and state designs.
arXiv Detail & Related papers (2024-10-30T18:00:35Z) - The Communication Complexity of Approximating Matrix Rank [50.6867896228563]
We show that this problem has randomized communication complexity $\Omega(\frac{1}{k}\cdot n^2\log|\mathbb{F}|)$.
As an application, we obtain an $\Omega(\frac{1}{k}\cdot n^2\log|\mathbb{F}|)$ space lower bound for any streaming algorithm with $k$ passes.
arXiv Detail & Related papers (2024-10-26T06:21:42Z) - Approximation of the Proximal Operator of the $\ell_\infty$ Norm Using a Neural Network [1.7265013728931]
We present an approximation of $\textbf{prox}_{\alpha\|\cdot\|_\infty}(\mathbf{x})$ using a neural network.
A novel aspect of the network is that it is able to accept vectors of varying lengths due to a feature selection process.
We show that the network outperforms a "vanilla neural network" that does not use feature selection.
arXiv Detail & Related papers (2024-08-20T22:12:30Z) - Learning sum of diverse features: computational hardness and efficient gradient-based training for ridge combinations [40.77319247558742]
We study the computational complexity of learning a target function $f_*:\mathbb{R}^d\to\mathbb{R}$ with additive structure.
We prove that a large subset of this class of functions can be efficiently learned by gradient-based training of a two-layer neural network.
arXiv Detail & Related papers (2024-06-17T17:59:17Z) - Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit [75.4661041626338]
We study the problem of gradient descent learning of a single-index target function $f_*(\boldsymbol{x}) = \sigma_*\left(\langle\boldsymbol{x},\boldsymbol{\theta}\rangle\right)$ under isotropic Gaussian data.
We prove that a two-layer neural network optimized by an SGD-based algorithm learns $f_*$ with an arbitrary link function, with sample and runtime complexity of $n \asymp T \asymp C(q) \cdot d\cdots$
arXiv Detail & Related papers (2024-06-03T17:56:58Z) - Nearest Neighbor Representations of Neurons [12.221087476416056]
The Nearest Neighbor (NN) Representation is an emerging computational model that is inspired by the brain.
We study the complexity of representing a neuron (threshold function) using the NN representations.
arXiv Detail & Related papers (2024-02-13T19:33:41Z) - Agnostically Learning Multi-index Models with Queries [54.290489524576756]
We study the power of query access for the task of agnostic learning under the Gaussian distribution.
We show that query access gives significant runtime improvements over random examples for agnostically learning MIMs.
arXiv Detail & Related papers (2023-12-27T15:50:47Z) - Why should autoencoders work? [1.6317061277457001]
Deep neural network autoencoders are routinely used for model reduction.
The technique is found to "work" well, which leads one to ask whether there is a way to explain this effectiveness.
arXiv Detail & Related papers (2023-10-03T17:53:43Z) - Matching Map Recovery with an Unknown Number of Outliers [0.0]
We consider the problem of finding the matching map between two sets of $d$-dimensional noisy feature-vectors.
We show that, in the high-dimensional setting, if the signal-to-noise ratio is larger than $5(d\log(4nm/\alpha))^{1/4}$, the true matching map can be recovered with probability $1-\alpha$.
arXiv Detail & Related papers (2022-10-24T15:59:10Z) - Learning a Single Neuron with Adversarial Label Noise via Gradient
Descent [50.659479930171585]
We study a function of the form $\mathbf{x}\mapsto\sigma(\mathbf{w}\cdot\mathbf{x})$ for monotone activations.
The goal of the learner is to output a hypothesis vector $\mathbf{w}$ such that $F(\mathbf{w}) = C\,\epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z) - Random matrices in service of ML footprint: ternary random features with
no performance loss [55.30329197651178]
We show that the eigenspectrum of $\mathbf{K}$ is independent of the distribution of the i.i.d. entries of $\mathbf{w}$.
We propose a novel random features technique called Ternary Random Features (TRF).
The computation of the proposed random features requires no multiplication and a factor of $b$ fewer bits for storage compared to classical random features.
arXiv Detail & Related papers (2021-10-05T09:33:49Z) - RNNs can generate bounded hierarchical languages with optimal memory [113.73133308478612]
We show that RNNs can efficiently generate bounded hierarchical languages that reflect the scaffolding of natural language syntax.
We introduce Dyck-($k$,$m$), the language of well-nested brackets (of $k$ types) and $m$-bounded nesting depth; a small membership-checking sketch appears after this list.
We prove that an RNN with $O(m \log k)$ hidden units suffices, an exponential reduction in memory, by an explicit construction.
arXiv Detail & Related papers (2020-10-15T04:42:29Z) - AlgebraNets [35.311476442807766]
We study alternative algebras as number representations using the enwiki8 and WikiText-103 datasets.
We consider $\mathbb{C}$, $\mathbb{H}$, $M_2(\mathbb{R})$, $M_3(\mathbb{R})$ and $M_4(\mathbb{R})$.
Multiplication in these algebras has higher compute density than real multiplication.
arXiv Detail & Related papers (2020-06-12T17:51:20Z) - On the Modularity of Hypernetworks [103.1147622394852]
We show that for a structured target function, the overall number of trainable parameters in a hypernetwork is smaller by orders of magnitude than the number of trainable parameters of a standard neural network and an embedding method.
arXiv Detail & Related papers (2020-02-23T22:51:52Z) - Agnostic Q-learning with Function Approximation in Deterministic
Systems: Tight Bounds on Approximation Error and Sample Complexity [94.37110094442136]
We study the problem of agnostic $Q$-learning with function approximation in deterministic systems.
We show that if $\delta = O\left(\rho/\sqrt{\mathrm{dim}_E}\right)$, then one can find the optimal policy using $O\left(\mathrm{dim}_E\right)$.
arXiv Detail & Related papers (2020-02-17T18:41:49Z)
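The Dyck-($k$,$m$) language defined in the RNN entry above can be made concrete with a short membership checker; this is a sketch under that definition, not code from the cited paper, and the bracket alphabet is an illustrative choice.

```python
def in_dyck_km(s: str, k: int, m: int) -> bool:
    """Check membership in Dyck-(k, m): well-nested brackets of k types
    with nesting depth at most m (illustrative alphabet, so k <= 4 here)."""
    opens, closes = "([{<", ")]}>"
    pairs = dict(zip(opens[:k], closes[:k]))
    stack = []
    for ch in s:
        if ch in pairs:                        # opening bracket of one of the k types
            stack.append(pairs[ch])
            if len(stack) > m:                 # nesting depth bound exceeded
                return False
        elif ch in closes[:k]:                 # closing bracket
            if not stack or stack.pop() != ch:
                return False
        else:                                  # symbol outside the bracket alphabet
            return False
    return not stack                           # every opened bracket must be closed

# "([])" has nesting depth 2: it is in Dyck-(2, 2) but not in Dyck-(2, 1).
assert in_dyck_km("([])", k=2, m=2) and not in_dyck_km("([])", k=2, m=1)
```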
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.