Towards a Phenomenological Understanding of Neural Networks: Data
- URL: http://arxiv.org/abs/2305.00995v1
- Date: Mon, 1 May 2023 18:00:01 GMT
- Title: Towards a Phenomenological Understanding of Neural Networks: Data
- Authors: Samuel Tovey, Sven Krippendorf, Konstantin Nikolaou, Christian Holm
- Abstract summary: A theory of neural networks (NNs) built upon collective variables would provide scientists with the tools to better understand the learning process at every stage.
In this work, we introduce two such variables, the entropy and the trace of the empirical neural tangent kernel (NTK) built on the training data passed to the model.
We find a correlation between the starting entropy, the trace of the NTK, and the generalization of the model computed after training is complete.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A theory of neural networks (NNs) built upon collective variables would
provide scientists with the tools to better understand the learning process at
every stage. In this work, we introduce two such variables, the entropy and the
trace of the empirical neural tangent kernel (NTK) built on the training data
passed to the model. We empirically analyze the NN performance in the context
of these variables and find that there exists a correlation between the starting
entropy, the trace of the NTK, and the generalization of the model computed
after training is complete. This framework is then applied to the problem of
optimal data selection for the training of NNs. To this end, random network
distillation (RND) is used as a means of selecting training data which is then
compared with random selection of data. It is shown that not only does RND
select datasets capable of outperforming random selection, but also that the
collective variables associated with the RND datasets are larger than those of
the randomly selected sets. The results of this investigation provide a stable
foundation from which the selection of data for NN training can be driven by
this phenomenological framework.
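A minimal JAX sketch of the two collective variables is given below: the empirical NTK of a small fully connected network is assembled from per-sample parameter Jacobians, and its trace and the entropy of its normalized eigenvalue spectrum are read off. The network architecture, the entropy definition over the eigenvalue spectrum, and all helper names and hyperparameters are illustrative assumptions, not the paper's exact setup.

```python
import jax
import jax.numpy as jnp

def init_mlp(key, sizes):
    """Gaussian-initialized fully connected network; params is a list of (W, b)."""
    params = []
    for m, n in zip(sizes[:-1], sizes[1:]):
        key, sub = jax.random.split(key)
        params.append((jax.random.normal(sub, (m, n)) / jnp.sqrt(m), jnp.zeros(n)))
    return params

def mlp(params, x):
    """Forward pass of a tanh MLP; returns an array of shape (n_samples, out_dim)."""
    for W, b in params[:-1]:
        x = jnp.tanh(x @ W + b)
    W, b = params[-1]
    return x @ W + b

def empirical_ntk(params, X):
    """K_ij = <df(x_i)/dtheta, df(x_j)/dtheta> for a scalar-output network."""
    f = lambda p: mlp(p, X)[:, 0]                      # outputs, shape (n_samples,)
    jac = jax.jacrev(f)(params)                        # pytree of per-sample Jacobians
    flat = jnp.concatenate(
        [j.reshape(X.shape[0], -1) for j in jax.tree_util.tree_leaves(jac)], axis=1)
    return flat @ flat.T                               # (n_samples, n_samples) Gram matrix

def ntk_trace_and_entropy(K):
    """Trace of K and the entropy of its normalized eigenvalue distribution."""
    eigs = jnp.maximum(jnp.linalg.eigvalsh(K), 1e-12)  # clip tiny negative round-off
    p = eigs / eigs.sum()
    return jnp.trace(K), -jnp.sum(p * jnp.log(p))

# Starting-state collective variables on a toy dataset
key = jax.random.PRNGKey(0)
params = init_mlp(key, [4, 32, 32, 1])
X = jax.random.normal(jax.random.PRNGKey(1), (64, 4))
trace, entropy = ntk_trace_and_entropy(empirical_ntk(params, X))
```

For the data-selection experiment described in the abstract, the sketch below reuses the helpers above to illustrate one plausible greedy form of RND-based selection: a frozen, randomly initialized target network defines an embedding, a predictor network is fitted to that embedding on the points chosen so far, and the next point picked from the pool is the one the predictor reproduces worst. The greedy loop, the plain gradient-descent fit, and all hyperparameters are assumptions; the paper's exact RND procedure may differ.

```python
def rnd_scores(pred_params, target_params, X):
    """RND novelty: squared distance between predictor and frozen target embeddings."""
    return jnp.sum((mlp(pred_params, X) - mlp(target_params, X)) ** 2, axis=1)

def fit_predictor(pred_params, target_params, X, steps=200, lr=1e-2):
    """Fit the predictor to the frozen target on the selected points (plain gradient descent)."""
    loss = lambda p: jnp.mean(rnd_scores(p, target_params, X))
    grad_fn = jax.jit(jax.grad(loss))
    for _ in range(steps):
        pred_params = jax.tree_util.tree_map(
            lambda p, g: p - lr * g, pred_params, grad_fn(pred_params))
    return pred_params

def rnd_select(key, pool, n_select, embed_dim=8):
    """Greedily select the most novel points from the candidate pool."""
    k_t, k_p = jax.random.split(key)
    sizes = [pool.shape[1], 32, embed_dim]
    target_params = init_mlp(k_t, sizes)   # fixed random target, never trained
    pred_params = init_mlp(k_p, sizes)
    selected = []
    for _ in range(n_select):
        scores = rnd_scores(pred_params, target_params, pool)
        if selected:                       # do not pick the same point twice
            scores = scores.at[jnp.array(selected)].set(-jnp.inf)
        selected.append(int(jnp.argmax(scores)))
        pred_params = fit_predictor(pred_params, target_params, pool[jnp.array(selected)])
    return pool[jnp.array(selected)], selected
```

A selected set produced this way can then be compared against a randomly drawn subset of the same size by training identical networks on both and comparing their starting NTK trace, entropy, and final test performance, which is the comparison the abstract reports.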
Related papers
- Unveiling the Power of Sparse Neural Networks for Feature Selection [60.50319755984697]
Sparse Neural Networks (SNNs) have emerged as powerful tools for efficient feature selection.
Our findings show that feature selection with SNNs trained with dynamic sparse training (DST) algorithms can achieve, on average, more than 50% memory and 55% FLOPs reduction.
arXiv Detail & Related papers (2024-08-08T16:48:33Z) - Deep Neural Networks Tend To Extrapolate Predictably [51.303814412294514]
Neural network predictions tend to be unpredictable and overconfident when faced with out-of-distribution (OOD) inputs.
We observe that neural network predictions often tend towards a constant value as input data becomes increasingly OOD.
We show how one can leverage our insights in practice to enable risk-sensitive decision-making in the presence of OOD inputs.
arXiv Detail & Related papers (2023-10-02T03:25:32Z) - Iterative self-transfer learning: A general methodology for response
time-history prediction based on small dataset [0.0]
An iterative self-transfer learning method for training neural networks based on small datasets is proposed in this study.
The results show that the proposed method can improve model performance by nearly an order of magnitude on small datasets.
arXiv Detail & Related papers (2023-06-14T18:48:04Z) - How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of large-kernel convolutional neural network (LKCNN) models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z) - Neural ODEs with Irregular and Noisy Data [8.349349605334316]
We discuss a methodology to learn differential equation(s) using noisy and irregularly sampled measurements.
In our methodology, the main innovation can be seen in the integration of deep neural networks with the neural ordinary differential equations (ODEs) approach.
The proposed framework to learn a model describing the vector field is highly effective under noisy measurements.
arXiv Detail & Related papers (2022-05-19T11:24:41Z) - MLDS: A Dataset for Weight-Space Analysis of Neural Networks [0.0]
We present MLDS, a new dataset consisting of thousands of trained neural networks with carefully controlled parameters.
This dataset enables new insights into both model-to-model and model-to-training-data relationships.
arXiv Detail & Related papers (2021-04-21T14:24:26Z) - Statistical model-based evaluation of neural networks [74.10854783437351]
We develop an experimental setup for the evaluation of neural networks (NNs).
The setup helps to benchmark a set of NNs vis-a-vis minimum-mean-square-error (MMSE) performance bounds.
This allows us to test the effects of training data size, data dimension, data geometry, noise, and mismatch between training and testing conditions.
arXiv Detail & Related papers (2020-11-18T00:33:24Z) - Modeling from Features: a Mean-field Framework for Over-parameterized
Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z) - Bayesian Graph Neural Networks with Adaptive Connection Sampling [62.51689735630133]
We propose a unified framework for adaptive connection sampling in graph neural networks (GNNs).
The proposed framework not only alleviates over-smoothing and over-fitting tendencies of deep GNNs, but also enables learning with uncertainty in graph analytic tasks with GNNs.
arXiv Detail & Related papers (2020-06-07T07:06:35Z) - A study of local optima for learning feature interactions using neural
networks [4.94950858749529]
We study the data-starved regime where an NN is trained on a relatively small amount of training data.
We experimentally observe that the cross-entropy loss function on XOR-like data has many non-equivalent local optima.
We show that the performance of a NN on real datasets can be improved using pruning.
arXiv Detail & Related papers (2020-02-11T11:38:45Z)