Efficiently Learning Any One Hidden Layer ReLU Network From Queries
- URL: http://arxiv.org/abs/2111.04727v1
- Date: Mon, 8 Nov 2021 18:59:40 GMT
- Title: Efficiently Learning Any One Hidden Layer ReLU Network From Queries
- Authors: Sitan Chen, Adam R. Klivans, Raghu Meka
- Abstract summary: We give the first polynomial-time algorithm for learning arbitrary one hidden layer neural networks with ReLU activations, provided black-box access to the network.
Ours is the first with fully polynomial-time guarantees of efficiency even for worst-case networks.
- Score: 27.428198343906352
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Model extraction attacks have renewed interest in the classic problem of
learning neural networks from queries. In this work we give the first
polynomial-time algorithm for learning arbitrary one hidden layer neural
networks with ReLU activations, provided black-box access to the network. Formally, we
show that if $F$ is an arbitrary one hidden layer neural network with ReLU
activations, there is an algorithm with query complexity and running time that
is polynomial in all parameters that outputs a network $F'$ achieving low
square loss relative to $F$ with respect to the Gaussian measure. While a
number of works in the security literature have proposed and empirically
demonstrated the effectiveness of certain algorithms for this problem, ours is
the first with fully polynomial-time guarantees of efficiency even for
worst-case networks (in particular our algorithm succeeds in the
overparameterized setting).
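To make the guarantee concrete: the algorithm must output $F'$ with small square loss $\mathbb{E}_{x \sim \mathcal{N}(0, I_d)}[(F(x) - F'(x))^2]$ relative to the black box $F$. Below is a minimal sketch of checking such a bound by Monte Carlo with query access; the stand-in network, the perturbed hypothesis, and all sizes are illustrative assumptions, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 10, 5  # input dimension and hidden width (illustrative choices)
relu = lambda z: np.maximum(z, 0.0)

# Stand-in for the black-box target F: one hidden layer of ReLUs.
W, a = rng.standard_normal((k, d)), rng.standard_normal(k)
F = lambda X: relu(X @ W.T) @ a

# A hypothesis F' (a perturbed copy of F, purely for illustration).
W2 = W + 0.01 * rng.standard_normal((k, d))
Fp = lambda X: relu(X @ W2.T) @ a

# Monte Carlo estimate of E_{x ~ N(0, I_d)}[(F(x) - F'(x))^2],
# i.e. the square loss with respect to the Gaussian measure.
X = rng.standard_normal((100_000, d))
print("estimated square loss:", np.mean((F(X) - Fp(X)) ** 2))
```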
Related papers
- Efficient SGD Neural Network Training via Sublinear Activated Neuron Identification [22.361338848134025]
We study fully connected two-layer neural networks with shifted ReLU activations and show that the activated neurons can be identified in sublinear time via geometric search.
We also prove that our algorithm can converge in $O(M^2/\epsilon^2)$ time with network size quadratic in the coefficient norm upper bound $M$ and error tolerance $\epsilon$.
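For intuition: a shifted ReLU neuron fires only when its pre-activation clears the shift, so on a typical input only a subset of neurons is active. The sketch below is the naive linear-time baseline for identifying that subset; the paper's geometric search replaces this scan with a sublinear data structure. The shift convention and sizes here are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
d, m, b = 8, 1000, 0.5  # input dim, width, shift (illustrative)
W = rng.standard_normal((m, d)) / np.sqrt(d)

def activated_neurons(x, W, b):
    # Naive O(m) scan for the neurons whose shifted ReLU fires,
    # i.e. {i : <w_i, x> > b}; the paper's contribution is doing
    # this identification in sublinear time.
    return np.flatnonzero(W @ x > b)

x = rng.standard_normal(d)
print(f"{activated_neurons(x, W, b).size} of {m} neurons active")
```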
arXiv Detail & Related papers (2023-07-13T05:33:44Z)
- Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
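Threshold activations make the network output piecewise constant in its input, which is the structure such convex reformulations lean on. A minimal forward-pass sketch, with shapes and weights that are illustrative assumptions rather than the paper's formulation:

```python
import numpy as np

rng = np.random.default_rng(2)
step = lambda z: (z > 0).astype(float)  # threshold (Heaviside) activation

d, m = 4, 16  # illustrative sizes
W1, b1 = rng.standard_normal((m, d)), rng.standard_normal(m)
w2 = rng.standard_normal(m)

def f(x):
    # Two-layer threshold network: the hidden layer emits a binary
    # activation pattern, so f is piecewise constant in x.
    return w2 @ step(W1 @ x + b1)

print(f(rng.standard_normal(d)))
```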
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
- Robust Training and Verification of Implicit Neural Networks: A Non-Euclidean Contractive Approach [64.23331120621118]
This paper proposes a theoretical and computational framework for training and robustness verification of implicit neural networks.
We introduce a related embedded network and show that the embedded network can be used to provide an $\ell_\infty$-norm box over-approximation of the reachable sets of the original network.
We apply our algorithms to train implicit neural networks on the MNIST dataset and compare the robustness of our models with the models trained via existing approaches in the literature.
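As a rough illustration of box over-approximation for an implicit layer $z = \mathrm{ReLU}(Az + Bx + b)$: the sketch below propagates an $\ell_\infty$ interval through the fixed-point iteration with naive interval arithmetic. This is only a stand-in for the paper's embedded-network construction, and the model and sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n, d = 6, 3
relu = lambda z: np.maximum(z, 0.0)
# Keep A small so z = relu(A z + B x + b) has a well-posed fixed point.
A = 0.1 * rng.standard_normal((n, n))
B, b = rng.standard_normal((n, d)), rng.standard_normal(n)

def implicit_forward(x, iters=100):
    z = np.zeros(n)
    for _ in range(iters):
        z = relu(A @ z + B @ x + b)
    return z

def box_forward(x0, eps, iters=100):
    # Interval arithmetic giving an l_inf box [lo, hi] that encloses z
    # for every x in [x0 - eps, x0 + eps]; the paper's embedded network
    # is a principled, contraction-based version of this idea.
    lo = hi = np.zeros(n)
    Ap, An = np.maximum(A, 0), np.minimum(A, 0)
    Bp, Bn = np.maximum(B, 0), np.minimum(B, 0)
    for _ in range(iters):
        lo, hi = (relu(Ap @ lo + An @ hi + Bp @ (x0 - eps) + Bn @ (x0 + eps) + b),
                  relu(Ap @ hi + An @ lo + Bp @ (x0 + eps) + Bn @ (x0 - eps) + b))
    return lo, hi

x0 = rng.standard_normal(d)
lo, hi = box_forward(x0, eps=0.1)
z = implicit_forward(x0)
assert np.all(lo <= z + 1e-9) and np.all(z <= hi + 1e-9)
print("box widths:", hi - lo)
```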
arXiv Detail & Related papers (2022-08-08T03:13:24Z)
- Algorithms for Efficiently Learning Low-Rank Neural Networks [12.916132936159713]
We study algorithms for learning low-rank neural networks.
We present a provably efficient algorithm which learns an optimal low-rank approximation to a single-hidden-layer ReLU network.
We propose a novel low-rank framework for training low-rank $\textit{deep}$ networks.
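The starting point for such results is the classic rank-$r$ approximation of a weight matrix by truncated SVD; a minimal sketch follows (the paper's actual contribution, learning the factorization from data for a ReLU network, is not shown, and all sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
m, d, r = 64, 32, 4  # layer shape and target rank (illustrative)
W = rng.standard_normal((m, r)) @ rng.standard_normal((r, d))  # truly rank r
W_noisy = W + 0.01 * rng.standard_normal((m, d))

# Rank-r approximation by truncated SVD: W ~ U V^T with U (m x r), V (d x r).
U, s, Vt = np.linalg.svd(W_noisy, full_matrices=False)
W_r = (U[:, :r] * s[:r]) @ Vt[:r]
print("relative error:", np.linalg.norm(W_r - W) / np.linalg.norm(W))
```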
arXiv Detail & Related papers (2022-02-02T01:08:29Z)
- Learning Contextual Bandits Through Perturbed Rewards [107.6210145983805]
We show that a $\tilde{O}(\tilde{d}\sqrt{T})$ regret upper bound is still achievable under standard regularity conditions.
We perturb the rewards when updating the neural network to eliminate the need for explicit exploration.
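A linear-bandit caricature of the mechanism, with all constants illustrative: perturb each observed reward before the least-squares update, then act greedily, so the noise injected into the estimates substitutes for an explicit exploration bonus. This is not the paper's neural algorithm, just the idea in its simplest form.

```python
import numpy as np

rng = np.random.default_rng(5)
d, K, T, sigma = 5, 10, 2000, 0.1
theta_star = rng.standard_normal(d) / np.sqrt(d)

A = np.eye(d)       # regularized design matrix
bvec = np.zeros(d)  # running sum of (perturbed) reward * context
for t in range(T):
    arms = rng.standard_normal((K, d)) / np.sqrt(d)
    theta_hat = np.linalg.solve(A, bvec)
    a = arms[np.argmax(arms @ theta_hat)]  # purely greedy choice
    r = a @ theta_star + 0.1 * rng.standard_normal()
    # Perturb the observed reward before the update, so the greedy
    # policy still explores.
    r_tilde = r + sigma * rng.standard_normal()
    A += np.outer(a, a)
    bvec += r_tilde * a

print("estimation error:", np.linalg.norm(np.linalg.solve(A, bvec) - theta_star))
```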
arXiv Detail & Related papers (2022-01-24T19:10:22Z)
- Neural networks with linear threshold activations: structure and algorithms [1.795561427808824]
We show that 2 hidden layers are necessary and sufficient to represent any function representable in the class.
We also give precise bounds on the sizes of the neural networks required to represent any function in the class.
We propose a new class of neural networks that we call shortcut linear threshold networks.
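As a small example of what two hidden threshold layers can express: the indicator of a polytope, with the first hidden layer flagging violated halfspaces and the second thresholding the violation count. This is a textbook construction under illustrative parameters, not taken from the paper.

```python
import numpy as np

step = lambda z: (z > 0).astype(float)
rng = np.random.default_rng(6)

# Indicator of the polytope {x : A x <= c} with a 2-hidden-layer
# threshold network.
A = rng.standard_normal((7, 3))
c = np.ones(7)

def polytope_indicator(x):
    violated = step(A @ x - c)               # layer 1: one unit per face
    return 1.0 - step(violated.sum() - 0.5)  # layer 2: any violation?

print(polytope_indicator(np.zeros(3)))      # 1.0: origin satisfies Ax <= c
print(polytope_indicator(10 * np.ones(3)))  # likely 0.0 for a random polytope
```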
arXiv Detail & Related papers (2021-11-15T22:33:52Z)
- Efficient Algorithms for Learning Depth-2 Neural Networks with General ReLU Activations [27.244958998196623]
We present time and sample efficient algorithms for learning an unknown depth-2 feedforward neural network with general ReLU activations.
In particular, we consider learning an unknown network of the form $f(x) = a^{\mathsf{T}}\sigma(W^{\mathsf{T}}x+b)$, where $x$ is drawn from the Gaussian distribution, and $\sigma(t) := \max(t,0)$ is the ReLU activation.
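A minimal sketch of drawing samples from this model class, with illustrative dimensions:

```python
import numpy as np

rng = np.random.default_rng(7)
d, k, n = 10, 4, 5  # input dim, width, sample count (illustrative)

# The model class: f(x) = a^T sigma(W^T x + b), sigma(t) = max(t, 0).
W = rng.standard_normal((d, k))
b = rng.standard_normal(k)
a = rng.standard_normal(k)
f = lambda x: a @ np.maximum(W.T @ x + b, 0.0)

X = rng.standard_normal((n, d))  # Gaussian inputs, x ~ N(0, I_d)
y = np.array([f(x) for x in X])
print(y)
```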
arXiv Detail & Related papers (2021-07-21T17:06:03Z)
- An Empirical Study of Derivative-Free-Optimization Algorithms for Targeted Black-Box Attacks in Deep Neural Networks [8.368543987898732]
This paper considers four pre-existing state-of-the-art DFO-based algorithms along with the introduction of a new algorithm built on BOBYQA.
We compare these algorithms in a variety of settings according to the fraction of images that they successfully misclassify.
The experiments reveal how the likelihood of finding an adversarial example depends on both the algorithm used and the setting of the attack.
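For orientation, a bare-bones derivative-free attack loop on a toy black box: random $\ell_\infty$-bounded proposals, accepted when they reduce the true class's margin. The algorithms the paper studies (including the BOBYQA-based one) are model-based refinements of this query-only template; everything below is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(8)

# Toy black box: a linear classifier we can only query for class scores.
Wc = rng.standard_normal((3, 16))
query = lambda x: Wc @ x

def margin(x, label):
    s = query(x)
    return s[label] - np.max(np.delete(s, label))

def random_search_attack(x, label, eps=0.5, steps=500, lr=0.05):
    # Derivative-free search: propose random l_inf-bounded perturbations
    # and keep any that shrink the true class's margin.
    delta = np.zeros_like(x)
    best = margin(x, label)
    for _ in range(steps):
        cand = np.clip(delta + lr * rng.uniform(-1, 1, x.shape), -eps, eps)
        m = margin(x + cand, label)
        if m < best:
            delta, best = cand, m
    return delta, best

x = rng.standard_normal(16)
label = int(np.argmax(query(x)))
_, m = random_search_attack(x, label)
print("misclassified:", m < 0)
```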
arXiv Detail & Related papers (2020-12-03T13:32:20Z)
- Neural Contextual Bandits with Deep Representation and Shallow Exploration [105.8099566651448]
We propose a novel learning algorithm that transforms the raw feature vector using the last hidden layer of a deep ReLU neural network.
Compared with existing neural contextual bandit algorithms, our approach is computationally much more efficient since it only needs to explore in the last layer of the deep neural network.
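A linear caricature of the scheme: freeze a random feature map as the "deep representation" and run a LinUCB-style bonus over those last-layer features only. All sizes and the exploration constant are assumptions, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(9)
d, m, K, T = 8, 32, 5, 1000
relu = lambda z: np.maximum(z, 0.0)

# Frozen random features standing in for the trained network's last
# hidden layer (the "deep representation").
W1 = rng.standard_normal((m, d)) / np.sqrt(d)
phi = lambda X: relu(X @ W1.T)

theta_star = rng.standard_normal(m) / np.sqrt(m)
A, bvec = np.eye(m), np.zeros(m)
for t in range(T):
    feats = phi(rng.standard_normal((K, d)))
    Ainv = np.linalg.inv(A)
    # Shallow exploration: a LinUCB bonus over last-layer features only,
    # instead of exploring in the full network parameter space.
    ucb = feats @ (Ainv @ bvec) + 0.5 * np.sqrt(
        np.einsum("ki,ij,kj->k", feats, Ainv, feats))
    a = int(np.argmax(ucb))
    r = feats[a] @ theta_star + 0.1 * rng.standard_normal()
    A += np.outer(feats[a], feats[a])
    bvec += r * feats[a]

print("estimate error:", np.linalg.norm(np.linalg.solve(A, bvec) - theta_star))
```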
arXiv Detail & Related papers (2020-12-03T09:17:55Z)
- Neural Thompson Sampling [94.82847209157494]
We propose a new algorithm, called Neural Thompson Sampling, which adapts deep neural networks for both exploration and exploitation.
At the core of our algorithm is a novel posterior distribution of the reward, where its mean is the neural network approximator, and its variance is built upon the neural tangent features of the corresponding neural network.
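A sketch of that sampling rule, with random features standing in for the neural tangent features (the real algorithm computes those from the network's gradients; everything here is an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(10)
p, K, T, lam, nu = 16, 4, 500, 1.0, 0.5
theta_star = rng.standard_normal(p) / np.sqrt(p)

U = lam * np.eye(p)  # precision matrix built from played features
bvec = np.zeros(p)
for t in range(T):
    feats = rng.standard_normal((K, p)) / np.sqrt(p)  # NTK-feature stand-ins
    Uinv = np.linalg.inv(U)
    mean = feats @ (Uinv @ bvec)
    var = nu**2 * np.einsum("ki,ij,kj->k", feats, Uinv, feats)
    # Thompson sampling: draw a reward estimate per arm from a Gaussian
    # whose mean is the model prediction and whose variance comes from
    # the (stand-in) tangent features, then act greedily on the draws.
    sampled = rng.normal(mean, np.sqrt(var))
    a = int(np.argmax(sampled))
    r = feats[a] @ theta_star + 0.1 * rng.standard_normal()
    U += np.outer(feats[a], feats[a])
    bvec += r * feats[a]

print("estimate error:", np.linalg.norm(np.linalg.solve(U, bvec) - theta_star))
```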
arXiv Detail & Related papers (2020-10-02T07:44:09Z)
- ESPN: Extremely Sparse Pruned Networks [50.436905934791035]
We show that a simple iterative mask discovery method can achieve state-of-the-art compression of very deep networks.
Our algorithm represents a hybrid approach between single shot network pruning methods and Lottery-Ticket type approaches.
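In its simplest form, iterative mask discovery alternates pruning the smallest surviving weights with retraining; the sketch below shows only the masking schedule, on a random weight matrix with an illustrative sparsity target (the retraining between rounds, central to ESPN, is omitted).

```python
import numpy as np

rng = np.random.default_rng(11)
W = rng.standard_normal((256, 256))

def iterative_mask(W, target_sparsity=0.99, rounds=10):
    # Repeatedly zero the smallest-magnitude surviving weights until the
    # target sparsity is reached, raising the threshold each round.
    mask = np.ones_like(W, dtype=bool)
    for r in range(1, rounds + 1):
        sparsity = target_sparsity * r / rounds
        keep = int(round(W.size * (1 - sparsity)))
        mag = np.abs(W * mask).ravel()
        thresh = np.partition(mag, -keep)[-keep]
        mask = np.abs(W) >= thresh
    return mask

mask = iterative_mask(W)
print(f"kept {mask.mean():.2%} of weights")
```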
arXiv Detail & Related papers (2020-06-28T23:09:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.