Self-Supervised Representation Learning on Neural Network Weights for
Model Characteristic Prediction
- URL: http://arxiv.org/abs/2110.15288v1
- Date: Thu, 28 Oct 2021 16:48:15 GMT
- Title: Self-Supervised Representation Learning on Neural Network Weights for
Model Characteristic Prediction
- Authors: Konstantin Schürholt, Dimche Kostadinov, Damian Borth
- Abstract summary: Self-Supervised Learning (SSL) has been shown to learn useful and information-preserving representations.
We propose to use SSL to learn neural representations of the weights of populations of Neural Networks (NNs).
Our empirical evaluation demonstrates that self-supervised representation learning in this domain is able to recover diverse NN model characteristics.
- Score: 1.9659095632676094
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-Supervised Learning (SSL) has been shown to learn useful and
information-preserving representations. Neural Networks (NNs) are widely
applied, yet their weight space is still not fully understood. Therefore, we
propose to use SSL to learn neural representations of the weights of
populations of NNs. To that end, we introduce domain specific data
augmentations and an adapted attention architecture. Our empirical evaluation
demonstrates that self-supervised representation learning in this domain is
able to recover diverse NN model characteristics. Further, we show that the
proposed learned representations outperform prior work for predicting
hyper-parameters, test accuracy, and generalization gap as well as transfer to
out-of-distribution settings.
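To make the described pipeline concrete, here is a minimal, hedged sketch: flatten each trained model's weights into fixed-size tokens, create two augmented views per model, encode them with a small attention encoder, and pull the views together with a contrastive objective. The neuron-permutation augmentation, token dimensions, encoder configuration, and NT-Xent loss below are illustrative assumptions, not the authors' exact implementation.

```python
# Illustrative sketch of SSL on NN weights (assumptions noted; not the
# authors' code). Weight-space symmetries such as hidden-neuron permutation
# leave a network's function unchanged, making them natural augmentations.
import torch
import torch.nn as nn
import torch.nn.functional as F

def permute_hidden_neurons(w1: torch.Tensor, w2: torch.Tensor):
    """Permute hidden units of a 2-layer MLP (biases omitted for brevity):
    reorder rows of w1 and the matching columns of w2."""
    perm = torch.randperm(w1.shape[0])
    return w1[perm], w2[:, perm]

class WeightEncoder(nn.Module):
    """Attention encoder over flattened weight tokens (assumed architecture)."""
    def __init__(self, token_dim=65, d_model=128, n_heads=4, n_layers=2):
        super().__init__()
        self.proj = nn.Linear(token_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)

    def forward(self, tokens):            # (batch, n_tokens, token_dim)
        return self.encoder(self.proj(tokens)).mean(dim=1)

def nt_xent(z1, z2, tau=0.1):
    """Contrastive loss pulling two views of the same model together."""
    z = F.normalize(torch.cat([z1, z2]), dim=1)
    sim = (z @ z.t()) / tau
    sim.fill_diagonal_(float("-inf"))     # exclude self-similarity
    n = z1.shape[0]
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(n)])
    return F.cross_entropy(sim, targets)

# Usage: 16 models, each flattened into 10 tokens of dimension 65; the
# second view uses additive noise as another plausible augmentation.
enc = WeightEncoder()
tokens = torch.randn(16, 10, 65)
loss = nt_xent(enc(tokens), enc(tokens + 0.01 * torch.randn_like(tokens)))
```

Downstream, a simple probe on the pooled representation would then predict model characteristics such as hyper-parameters, test accuracy, or generalization gap.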
Related papers
- Hyper-Representations: Learning from Populations of Neural Networks [3.8979646385036175]
This thesis addresses the challenge of understanding Neural Networks through the lens of their most fundamental component: the weights.
Work in this thesis finds that trained NN models indeed occupy meaningful structures in the weight space that can be learned and used.
arXiv Detail & Related papers (2024-10-07T15:03:00Z)
- Characterizing out-of-distribution generalization of neural networks: application to the disordered Su-Schrieffer-Heeger model [38.79241114146971]
We show how interpretability methods can increase trust in predictions of a neural network trained to classify quantum phases.
In particular, we show that we can ensure better out-of-distribution generalization in this complex classification problem.
This work is an example of how the systematic use of interpretability methods can improve the performance of NNs in scientific problems.
arXiv Detail & Related papers (2024-06-14T13:24:32Z)
- Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning.
Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z)
- ConCerNet: A Contrastive Learning Based Framework for Automated Conservation Law Discovery and Trustworthy Dynamical System Prediction [82.81767856234956]
This paper proposes a new learning framework named ConCerNet to improve the trustworthiness of DNN-based dynamics modeling.
We show that our method consistently outperforms the baseline neural networks in both coordinate error and conservation metrics.
arXiv Detail & Related papers (2023-02-11T21:07:30Z)
- Learning Low Dimensional State Spaces with Overparameterized Recurrent Neural Nets [57.06026574261203]
We provide theoretical evidence for learning low-dimensional state spaces, which can also model long-term memory.
Experiments corroborate our theory, demonstrating extrapolation via learning low-dimensional state spaces with both linear and non-linear RNNs.
arXiv Detail & Related papers (2022-10-25T14:45:15Z)
- FF-NSL: Feed-Forward Neural-Symbolic Learner [70.978007919101]
This paper introduces a neural-symbolic learning framework called the Feed-Forward Neural-Symbolic Learner (FF-NSL).
FF-NSL integrates state-of-the-art ILP systems based on Answer Set semantics with neural networks to learn interpretable hypotheses from labelled unstructured data.
arXiv Detail & Related papers (2021-06-24T15:38:34Z)
- Locally Sparse Networks for Interpretable Predictions [7.362415721170984]
We propose a framework for training locally sparse neural networks where the local sparsity is learned via a sample-specific gating mechanism.
The sample-specific sparsity is predicted via a gating network, which is trained in tandem with the prediction network (a minimal sketch of this gating mechanism appears after this list).
We demonstrate that our method outperforms state-of-the-art models when predicting the target function with far fewer features per instance.
arXiv Detail & Related papers (2021-06-11T15:46:50Z)
- PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning [109.84770951839289]
We present PredRNN, a new recurrent network for learning visual dynamics from historical context.
We show that our approach obtains highly competitive results on three standard datasets.
arXiv Detail & Related papers (2021-03-17T08:28:30Z)
- Learning Semantically Meaningful Features for Interpretable Classifications [17.88784870849724]
SemCNN learns associations between visual features and word phrases.
Experiment results on multiple benchmark datasets demonstrate that SemCNN can learn features with clear semantic meaning.
arXiv Detail & Related papers (2021-01-11T14:35:16Z)
- Neural Networks Enhancement with Logical Knowledge [83.9217787335878]
We propose an extension of KENN for relational data.
The results show that KENN is capable of increasing the performance of the underlying neural network even in the presence of relational data.
arXiv Detail & Related papers (2020-09-13T21:12:20Z)
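As referenced in the Locally Sparse Networks entry above, the following is a minimal sketch of a sample-specific gating mechanism, assuming a sigmoid gate per input feature and an L1 sparsity penalty; the paper's exact stochastic-gate formulation is not detailed in this summary.

```python
# Hedged sketch of sample-specific feature gating (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class LocallySparseNet(nn.Module):
    def __init__(self, in_dim: int, hidden: int = 64, out_dim: int = 1):
        super().__init__()
        # Gating network: maps each sample to per-feature gates in [0, 1].
        self.gating = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, in_dim), nn.Sigmoid(),
        )
        # Prediction network: consumes the gated (sparsified) features.
        self.predictor = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, x):
        gates = self.gating(x)        # sample-specific, one gate per feature
        return self.predictor(x * gates), gates

# Both networks train in tandem; the L1 term on the gates is an assumed
# regularizer encouraging per-sample sparsity.
model = LocallySparseNet(in_dim=20)
x, y = torch.randn(8, 20), torch.randn(8, 1)
y_hat, gates = model(x)
loss = F.mse_loss(y_hat, y) + 1e-3 * gates.abs().mean()
```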
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences arising from its use.