Logits are predictive of network type
- URL: http://arxiv.org/abs/2211.02272v1
- Date: Fri, 4 Nov 2022 05:53:27 GMT
- Title: Logits are predictive of network type
- Authors: Ali Borji
- Abstract summary: It is possible to predict which deep network has generated a given logit vector with accuracy well above chance.
We utilize a number of networks on a dataset, initialized with random or pretrained weights, as well as fine-tuned networks.
- Score: 47.64219291655723
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We show that it is possible to predict which deep network has generated a
given logit vector with accuracy well above chance. We utilize a number of
networks on a dataset, initialized with random weights or pretrained weights,
as well as fine-tuned networks. A classifier is then trained on the logit
vectors computed over the training set of this dataset to map each logit
vector to the index of the network that generated it. The classifier is then
evaluated on the test set of the dataset. Results are strongest with randomly
initialized networks, but the finding also generalizes to pretrained as well
as fine-tuned networks.
Classification accuracy is higher using unnormalized logits than normalized
ones. We find that there is little transfer when applying a classifier to the
same networks but with different sets of weights. In addition to helping
better understand deep networks and the way they encode uncertainty, we
anticipate our finding to be useful in some applications (e.g., tailoring an
adversarial attack
for a certain type of network). Code is available at
https://github.com/aliborji/logits.
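Below is a minimal sketch of the pipeline the abstract describes: collect logit vectors from several networks and train a classifier to recover which network produced each vector. The torchvision models, random stand-in inputs, sample counts, and logistic-regression classifier are illustrative assumptions, not the authors' setup; their actual code is in the repository linked above.

```python
# Minimal sketch (not the authors' code): gather logit vectors from several
# randomly initialized networks and train a classifier to predict which
# network produced each vector. Models, inputs, and classifier are assumptions.
import torch
import torchvision.models as models
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

torch.manual_seed(0)
nets = [models.resnet18(weights=None),         # the paper also uses pretrained
        models.vgg11(weights=None),            # and fine-tuned networks
        models.mobilenet_v3_small(weights=None)]

X, y = [], []
with torch.no_grad():
    for idx, net in enumerate(nets):
        net.eval()
        images = torch.randn(100, 3, 224, 224)   # stand-in for a real dataset
        logits = net(images)                      # unnormalized 1000-way logits
        X.append(logits)
        y += [idx] * logits.shape[0]

X = torch.cat(X).numpy()
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)   # logit vector -> network index
print("network-identification accuracy:", clf.score(X_te, y_te))
```

The logits are left unnormalized, following the abstract's observation that unnormalized logits give higher identification accuracy than normalized ones.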
Related papers
- You Can Have Better Graph Neural Networks by Not Training Weights at All: Finding Untrained GNNs Tickets [105.24703398193843]
Untrained subnetworks in graph neural networks (GNNs) still remain mysterious.
We show that the found untrained subnetworks can substantially mitigate the GNN over-smoothing problem.
We also observe that such sparse untrained subnetworks have appealing performance in out-of-distribution detection and robustness to input perturbations.
arXiv Detail & Related papers (2022-11-28T14:17:36Z)
- The smooth output assumption, and why deep networks are better than wide ones [0.0]
We propose a new measure that predicts how well a model will generalize.
It is based on the fact that, in reality, boundaries between concepts are generally unsharp.
arXiv Detail & Related papers (2022-11-25T19:05:44Z)
- Wide and Deep Neural Networks Achieve Optimality for Classification [23.738242876364865]
We identify and construct an explicit set of neural network classifiers that achieve optimality.
In particular, we provide explicit activation functions that can be used to construct networks that achieve optimality.
Our results highlight the benefit of using deep networks for classification tasks, in contrast to regression tasks, where excessive depth is harmful.
arXiv Detail & Related papers (2022-04-29T14:27:42Z)
- Dual Lottery Ticket Hypothesis [71.95937879869334]
The Lottery Ticket Hypothesis (LTH) provides a novel view for investigating sparse network training while maintaining the network's capacity.
In this work, we regard the winning ticket from LTH as the subnetwork that is in a trainable condition, and take its performance as our benchmark.
We propose a simple sparse network training strategy, Random Sparse Network Transformation (RST), to substantiate our Dual Lottery Ticket Hypothesis (DLTH).
arXiv Detail & Related papers (2022-03-08T18:06:26Z)
- Bit-wise Training of Neural Network Weights [4.56877715768796]
We introduce an algorithm where the individual bits representing the weights of a neural network are learned.
This method allows training weights with integer values on arbitrary bit-depths and naturally uncovers sparse networks.
We show better results than standard training for fully connected networks, and similar performance to standard training for convolutional and residual networks.
arXiv Detail & Related papers (2022-02-19T10:46:54Z)
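As a rough, hedged illustration of bit-wise weight parameterization (not the paper's exact scheme), each weight could be rebuilt from a few learnable "soft bits"; the sigmoid relaxation, power-of-two encoding, and single scale factor below are assumptions, and sign handling is omitted.

```python
# Hedged illustration of bit-wise weight parameterization: each weight is
# reconstructed from k learnable "soft bits". The sigmoid relaxation, the
# power-of-two encoding, and the single scale factor are assumptions made
# for illustration; sign handling is omitted for brevity.
import torch
import torch.nn as nn

class BitLinear(nn.Module):
    def __init__(self, in_features, out_features, n_bits=4):
        super().__init__()
        # One learnable logit per bit position of every weight.
        self.bit_logits = nn.Parameter(torch.randn(out_features, in_features, n_bits))
        self.register_buffer("powers", 2.0 ** torch.arange(n_bits))
        self.scale = nn.Parameter(torch.tensor(0.01))  # maps integer codes to weight magnitudes

    def forward(self, x):
        bits = torch.sigmoid(self.bit_logits)           # soft bits in (0, 1)
        weight = (bits * self.powers).sum(dim=-1) * self.scale
        return x @ weight.t()

layer = BitLinear(16, 8)
print(layer(torch.randn(4, 16)).shape)   # torch.Size([4, 8])
```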
- Probing Predictions on OOD Images via Nearest Categories [97.055916832257]
We study out-of-distribution (OOD) prediction behavior of neural networks when they classify images from unseen classes or corrupted images.
We introduce a new measure, nearest category generalization (NCG), where we compute the fraction of OOD inputs that are classified with the same label as their nearest neighbor in the training set.
We find that robust networks have consistently higher NCG accuracy than natural training, even when the OOD data is much farther away than the robustness radius.
arXiv Detail & Related papers (2020-11-17T07:42:27Z)
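A small sketch of how the NCG measure described above could be computed; using raw features and Euclidean distance for the nearest-neighbor search is an assumption made for illustration.

```python
# Hedged sketch of nearest category generalization (NCG): the fraction of OOD
# inputs whose predicted label matches the label of their nearest training
# example. Euclidean distance in the raw feature space is an assumption here.
import numpy as np

def ncg_accuracy(train_x, train_y, ood_x, ood_pred):
    matches = 0
    for x, pred in zip(ood_x, ood_pred):
        nearest = np.argmin(np.linalg.norm(train_x - x, axis=1))  # nearest training point
        matches += int(train_y[nearest] == pred)
    return matches / len(ood_x)

# Toy usage with random stand-in data.
rng = np.random.default_rng(0)
train_x, train_y = rng.normal(size=(100, 32)), rng.integers(0, 10, 100)
ood_x, ood_pred = rng.normal(size=(20, 32)), rng.integers(0, 10, 20)
print(ncg_accuracy(train_x, train_y, ood_x, ood_pred))
```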
- Is Each Layer Non-trivial in CNN? [11.854634156817642]
Convolutional neural network (CNN) models have achieved great success in many fields.
With the advent of ResNet, networks used in practice are getting deeper and wider.
We train a network on the training set, then replace its convolution kernels with zeros and test the resulting models on the test set.
arXiv Detail & Related papers (2020-09-09T02:17:49Z)
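The ablation described above can be sketched as follows: copy a trained CNN, zero the kernels of one convolution, and re-evaluate on the test set. The model and the chosen layer are stand-ins, not the paper's setup.

```python
# Hedged sketch of the kernel-zeroing ablation: copy a trained CNN, set the
# kernels (and bias, if any) of one convolution to zero, and compare test
# accuracy. The model and the chosen layer are illustrative stand-ins.
import copy
import torch
import torchvision.models as models

def zero_conv(model, layer_name):
    ablated = copy.deepcopy(model)
    conv = dict(ablated.named_modules())[layer_name]
    with torch.no_grad():
        conv.weight.zero_()
        if conv.bias is not None:
            conv.bias.zero_()
    return ablated

model = models.resnet18(weights=None).eval()   # stands in for a trained network
ablated = zero_conv(model, "layer3.0.conv1")   # one convolution replaced by zeros
# Evaluate `model` and `ablated` on the same test loader and compare accuracy.
```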
- Digit Image Recognition Using an Ensemble of One-Versus-All Deep Network Classifiers [2.385916960125935]
We implement a novel technique for digit image recognition and evaluate it on that task.
Every network in the ensemble is trained with a one-versus-all (OVA) training technique using stochastic gradient descent with momentum (SGDMA).
Our proposed technique outperforms the baseline on digit image recognition for all datasets.
arXiv Detail & Related papers (2020-06-28T15:37:39Z)
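A toy, hedged sketch of a one-versus-all ensemble for digit recognition: one binary member per digit, with the most confident member deciding the label at test time. The tiny MLP members and the scoring rule are assumptions, not the paper's architecture.

```python
# Hedged sketch of a one-versus-all (OVA) ensemble: one binary classifier per
# digit, each trained as "class c vs. rest"; at test time the most confident
# member decides the label. The tiny MLP members are illustrative assumptions.
import torch
import torch.nn as nn

class OvaEnsemble(nn.Module):
    def __init__(self, n_classes=10, in_dim=28 * 28):
        super().__init__()
        self.members = nn.ModuleList(
            nn.Sequential(nn.Flatten(), nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, 1))
            for _ in range(n_classes)
        )

    def forward(self, x):
        scores = torch.cat([m(x) for m in self.members], dim=1)  # (batch, n_classes)
        return scores.argmax(dim=1)                              # most confident member wins

ensemble = OvaEnsemble()
digits = torch.randn(5, 1, 28, 28)    # stand-in for MNIST-style digit images
print(ensemble(digits))               # one predicted digit per image
```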
- Text Classification with Few Examples using Controlled Generalization [58.971750512415134]
Current practice relies on pre-trained word embeddings to map words unseen in training to similar seen ones.
Our alternative begins with sparse pre-trained representations derived from unlabeled parsed corpora.
We show that a feed-forward network over these vectors is especially effective in low-data scenarios.
arXiv Detail & Related papers (2020-05-18T06:04:58Z)
- Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.