Tensor Normalization and Full Distribution Training
- URL: http://arxiv.org/abs/2109.02345v1
- Date: Mon, 6 Sep 2021 10:33:17 GMT
- Title: Tensor Normalization and Full Distribution Training
- Authors: Wolfgang Fuhl
- Abstract summary: Pixel-wise tensor normalization, inserted after rectified linear units and used together with batch normalization, provides a significant improvement in the accuracy of modern deep neural networks.
We show that the factorized superposition of images from the training set and the reformulation of the multi-class problem into a multi-label problem yield significantly more robust networks.
- Score: 3.962145079528281
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we introduce pixel-wise tensor normalization, which is inserted
after rectified linear units and, together with batch normalization, provides a
significant improvement in the accuracy of modern deep neural networks. In
addition, this work deals with the robustness of networks. We show that the
factorized superposition of images from the training set and the reformulation
of the multi-class problem into a multi-label problem yield significantly more
robust networks. The reformulation and the adjustment of the multi-class log
loss also improve the results compared to overlaying images with only a single
class as the label.
https://atreus.informatik.uni-tuebingen.de/seafile/d/8e2ab8c3fdd444e1a135/?p=%2FTNandFDT&mode=list
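The abstract describes two techniques: a pixel-wise normalization inserted after the rectifier, and training on superpositions of training images with multi-label targets. The exact formulations are only given in the full paper and the linked code, so the following PyTorch sketch is an illustrative assumption of both ideas (the per-pixel root-mean-square normalization and the 50/50 image blend are guesses), not the authors' reference implementation.
```python
# Minimal sketch, assuming a per-pixel RMS normalization and an equal-weight
# image superposition; both are illustrative, not the paper's exact method.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PixelwiseNorm(nn.Module):
    """Normalize the channel vector at every spatial position.

    Intended to be inserted after a ReLU, alongside batch normalization,
    as suggested by the abstract; the concrete formula is an assumption.
    """

    def __init__(self, eps: float = 1e-6):
        super().__init__()
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, height, width)
        # Divide each pixel's channel vector by its root-mean-square value.
        rms = x.pow(2).mean(dim=1, keepdim=True).sqrt()
        return x / (rms + self.eps)


def superpose_multilabel(img_a, img_b, label_a, label_b, num_classes, alpha=0.5):
    """Blend two training images and return a multi-label target.

    Turns the multi-class problem into a multi-label one: both source
    classes are marked as present in the mixed image.
    """
    mixed = alpha * img_a + (1.0 - alpha) * img_b
    target = torch.zeros(num_classes)
    target[label_a] = 1.0
    target[label_b] = 1.0
    return mixed, target


if __name__ == "__main__":
    block = nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3, padding=1),
        nn.BatchNorm2d(16),
        nn.ReLU(),
        PixelwiseNorm(),  # pixel-wise normalization after the rectifier
    )
    x = torch.randn(4, 3, 32, 32)
    print(block(x).shape)  # torch.Size([4, 16, 32, 32])

    a, b = torch.rand(3, 32, 32), torch.rand(3, 32, 32)
    mixed, target = superpose_multilabel(a, b, label_a=1, label_b=7, num_classes=10)
    # Multi-label loss on the superposed image (binary cross-entropy per class).
    logits = torch.randn(10)
    loss = F.binary_cross_entropy_with_logits(logits, target)
    print(mixed.shape, loss.item())
```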
Related papers
- Revisiting Data Augmentation for Rotational Invariance in Convolutional Neural Networks [0.29127054707887967]
We investigate how best to include rotational invariance in a CNN for image classification.
Our experiments show that networks trained with data augmentation alone can classify rotated images nearly as well as in the normal unrotated case.
arXiv Detail & Related papers (2023-10-12T15:53:24Z)
- Centered Self-Attention Layers [89.21791761168032]
The self-attention mechanism in transformers and the message-passing mechanism in graph neural networks are repeatedly applied.
We show that this application inevitably leads to oversmoothing, i.e., to similar representations at the deeper layers.
We present a correction term to the aggregating operator of these mechanisms.
arXiv Detail & Related papers (2023-06-02T15:19:08Z)
- Slimmable Networks for Contrastive Self-supervised Learning [69.9454691873866]
Self-supervised learning has made significant progress in pre-training large models, but it struggles with small models.
We introduce another one-stage solution to obtain pre-trained small models without the need for extra teachers.
A slimmable network consists of a full network and several weight-sharing sub-networks, which can be pre-trained once to obtain various networks.
arXiv Detail & Related papers (2022-09-30T15:15:05Z)
- Improvements to Gradient Descent Methods for Quantum Tensor Network Machine Learning [0.0]
We introduce a 'copy node' method that successfully initializes arbitrary tensor networks.
We present numerical results showing that the combination of techniques presented here produces quantum-inspired tensor network models.
arXiv Detail & Related papers (2022-03-03T19:00:40Z)
- Enhanced Performance of Pre-Trained Networks by Matched Augmentation Distributions [10.74023489125222]
We propose a simple solution to address the train-test distributional shift.
We combine the results of multiple random crops of a test image.
This not only matches the train-time augmentation but also provides full coverage of the input image (a minimal sketch of this multi-crop evaluation follows the related-papers list).
arXiv Detail & Related papers (2022-01-19T22:33:00Z)
- ZerO Initialization: Initializing Residual Networks with only Zeros and Ones [44.66636787050788]
Deep neural networks are usually initialized with random weights, with an adequately selected initial variance to ensure stable signal propagation during training.
There is no consensus on how to select the variance, and this becomes challenging as the number of layers grows.
In this work, we replace the widely used random weight initialization with a fully deterministic initialization scheme ZerO, which initializes residual networks with only zeros and ones.
Surprisingly, we find that ZerO achieves state-of-the-art performance over various image classification datasets, including ImageNet.
arXiv Detail & Related papers (2021-10-25T06:17:33Z)
- Distribution Mismatch Correction for Improved Robustness in Deep Neural Networks [86.42889611784855]
Normalization methods increase a network's vulnerability to noise and input corruptions.
We propose an unsupervised non-parametric distribution correction method that adapts the activation distribution of each layer.
In our experiments, we empirically show that the proposed method effectively reduces the impact of intense image corruptions.
arXiv Detail & Related papers (2021-10-05T11:36:25Z)
- ResMLP: Feedforward networks for image classification with data-efficient training [73.26364887378597]
We present ResMLP, an architecture built entirely upon multi-layer perceptrons for image classification.
We will share our code based on the Timm library and pre-trained models.
arXiv Detail & Related papers (2021-05-07T17:31:44Z)
- Embedding Propagation: Smoother Manifold for Few-Shot Classification [131.81692677836202]
We propose to use embedding propagation as an unsupervised non-parametric regularizer for manifold smoothing in few-shot classification.
We empirically show that embedding propagation yields a smoother embedding manifold.
We show that embedding propagation consistently improves the accuracy of the models in multiple semi-supervised learning scenarios by up to 16 percentage points.
arXiv Detail & Related papers (2020-03-09T13:51:09Z)
- Molecule Property Prediction and Classification with Graph Hypernetworks [113.38181979662288]
We show that the replacement of the underlying networks with hypernetworks leads to a boost in performance.
A major difficulty in the application of hypernetworks is their lack of stability.
A recent work has tackled the training instability of hypernetworks in the context of error correcting codes.
arXiv Detail & Related papers (2020-02-01T16:44:34Z)
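The multi-crop test-time evaluation mentioned in the "Enhanced Performance of Pre-Trained Networks by Matched Augmentation Distributions" entry above can be sketched as follows; the number of crops, the crop size, and averaging in probability space are illustrative assumptions, not that paper's exact protocol.
```python
# Minimal sketch of multi-crop test-time evaluation: average predictions
# over several random crops of one test image so that the test-time input
# distribution resembles the train-time augmentation.
import torch
import torchvision.transforms as T


def predict_with_random_crops(model, image, num_crops=10, crop_size=224):
    """Average softmax outputs over random crops of a single PIL test image."""
    crop = T.Compose([
        T.RandomResizedCrop(crop_size),  # mirrors the train-time augmentation
        T.ToTensor(),
    ])
    model.eval()
    with torch.no_grad():
        probs = [torch.softmax(model(crop(image).unsqueeze(0)), dim=1)
                 for _ in range(num_crops)]
    return torch.stack(probs).mean(dim=0)  # shape: (1, num_classes)
```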
This list is automatically generated from the titles and abstracts of the papers on this site.