Geometry and Generalization: Eigenvalues as predictors of where a
network will fail to generalize
- URL: http://arxiv.org/abs/2107.06386v1
- Date: Tue, 13 Jul 2021 21:03:42 GMT
- Title: Geometry and Generalization: Eigenvalues as predictors of where a
network will fail to generalize
- Authors: Susama Agarwala, Benjamin Dees, Andrew Gearhart, Corey Lowman
- Abstract summary: We study the deformation of the input space by a trained autoencoder via the Jacobians of the trained weight matrices.
This is a dataset-independent means of testing an autoencoder's ability to generalize on new input.
- Score: 0.30586855806896046
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the deformation of the input space by a trained autoencoder via the
Jacobians of the trained weight matrices. In doing so, we prove bounds for the
mean squared errors for points in the input space, under assumptions regarding
the orthogonality of the eigenvectors. We also show that the trace and the
product of the eigenvalues of the Jacobian matrices are good predictors of the
MSE on test points. This is a dataset-independent means of testing an
autoencoder's ability to generalize on new input. Namely, no knowledge of the
dataset on which the network was trained is needed, only the parameters of the
trained model.
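The diagnostic described in the abstract is simple to compute. The sketch below assumes a small PyTorch autoencoder; the architecture, dimensions, and names are illustrative stand-ins rather than the paper's setup, and the weights are left untrained purely for brevity. It evaluates the Jacobian of the reconstruction map at one input point and reads off the trace and the product of its eigenvalues.

```python
import torch
import torch.nn as nn

# Hypothetical autoencoder; the method needs only the trained model's parameters.
class Autoencoder(nn.Module):
    def __init__(self, dim=8, latent=3):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(dim, latent), nn.Tanh())
        self.decoder = nn.Linear(latent, dim)

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = Autoencoder()   # in practice, load trained weights here
x = torch.randn(8)      # one point of the input space

# Jacobian of the reconstruction map at x: an 8 x 8 matrix.
J = torch.autograd.functional.jacobian(model, x)

eigvals = torch.linalg.eigvals(J)   # complex in general
trace = eigvals.sum().real.item()   # equals torch.trace(J)

# The bottleneck forces rank(J) <= 3, so the trailing eigenvalues are
# numerically zero; this sketch therefore takes the product over the
# 3 largest in magnitude (an assumption of the sketch, not the paper).
top = eigvals[eigvals.abs().argsort(descending=True)[:3]]
product = top.prod().abs().item()

print(f"trace = {trace:.4f}, |product of leading eigenvalues| = {product:.4e}")
```

Per the abstract, these two scalars track the MSE the network incurs on test points, so evaluating them over a sweep of inputs flags regions where the autoencoder is likely to fail to generalize, without consulting the training data.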
Related papers
- Locating Information in Large Language Models via Random Matrix Theory [0.0]
We analyze the weight matrices of pretrained transformer models BERT and Llama.
Deviations from random matrix predictions emerge after training, allowing us to locate learned structures within the models.
Our findings reveal that, after fine-tuning, small singular values play a crucial role in the models' capabilities.
arXiv Detail & Related papers (2024-10-23T11:19:08Z)
- Symmetry Discovery for Different Data Types [52.2614860099811]
Equivariant neural networks incorporate symmetries into their architecture, achieving higher generalization performance.
We propose LieSD, a method for discovering symmetries via trained neural networks which approximate the input-output mappings of the tasks.
We validate the performance of LieSD on tasks with symmetries such as the two-body problem, the moment of inertia matrix prediction, and top quark tagging.
arXiv Detail & Related papers (2024-10-13T13:39:39Z)
- Support matrix machine: A review [0.0]
The support matrix machine (SMM) is an emerging methodology tailored to matrix-valued input data.
This article provides the first in-depth analysis of the development of the SMM model.
We discuss numerous SMM variants, such as robust, sparse, class imbalance, and multi-class classification models.
arXiv Detail & Related papers (2023-10-30T16:46:23Z)
- Grokking in Linear Estimators -- A Solvable Model that Groks without Understanding [1.1510009152620668]
Grokking is the phenomenon where a model learns to generalize long after it has fit the training data.
We show analytically and numerically that grokking can surprisingly occur in linear networks performing linear tasks.
arXiv Detail & Related papers (2023-10-25T08:08:44Z)
- Householder Projector for Unsupervised Latent Semantics Discovery [58.92485745195358]
Householder Projector helps StyleGANs to discover more disentangled and precise semantic attributes without sacrificing image fidelity.
We integrate our projector into pre-trained StyleGAN2/StyleGAN3 and evaluate the models on several benchmarks.
arXiv Detail & Related papers (2023-07-16T11:43:04Z)
- Disentanglement via Latent Quantization [60.37109712033694]
In this work, we construct an inductive bias towards encoding to and decoding from an organized latent space.
We demonstrate the broad applicability of this approach by adding it to both basic data-reconstructing (vanilla autoencoder) and latent-reconstructing (InfoGAN) generative models.
arXiv Detail & Related papers (2023-05-28T06:30:29Z)
- Random matrix analysis of deep neural network weight matrices [0.0]
We study the weight matrices of trained deep neural networks using methods from random matrix theory (RMT).
We show that the statistics of most of the singular values follow universal RMT predictions.
This suggests that they are random and do not contain system-specific information; a minimal version of this check is sketched after this list.
arXiv Detail & Related papers (2022-03-28T11:22:12Z)
- Eigenvalues of Autoencoders in Training and at Initialization [0.2578242050187029]
We study the distribution of eigenvalues of Jacobian matrices of autoencoders early in the training process.
We find that untrained autoencoders have eigenvalue distributions that are qualitatively different from those of autoencoders trained for a long time.
arXiv Detail & Related papers (2022-01-27T21:34:49Z)
- Test Set Sizing Via Random Matrix Theory [91.3755431537592]
This paper uses techniques from Random Matrix Theory to find the ideal training-testing data split for a simple linear regression.
It defines "ideal" as satisfying the integrity metric, i.e. the empirical model error is the actual measurement noise.
This paper is the first to solve for the training and test size for any model in a way that is truly optimal.
arXiv Detail & Related papers (2021-12-11T13:18:33Z)
- Autoencoding Variational Autoencoder [56.05008520271406]
We study the implications of this behaviour for the learned representations, as well as the consequences of fixing it by introducing a notion of self-consistency.
We show that encoders trained with our self-consistency approach lead to representations that are robust (insensitive) to perturbations in the input introduced by adversarial attacks.
arXiv Detail & Related papers (2020-12-07T14:16:14Z)
- Eigendecomposition-Free Training of Deep Networks for Linear Least-Square Problems [107.3868459697569]
We introduce an eigendecomposition-free approach to training a deep network.
We show that our approach is much more robust than explicit differentiation of the eigendecomposition.
Our method has better convergence properties and yields state-of-the-art results.
arXiv Detail & Related papers (2020-04-15T04:29:34Z)
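The check promised in the random matrix entry above lends itself to an equally small sketch. It compares the empirical spectrum of a synthetic weight matrix to the Marchenko-Pastur law that pure noise obeys: a matrix of i.i.d. Gaussian weights should match the prediction, while a trained weight matrix that deviates, especially with outlier singular values beyond the support edge, carries learned, system-specific structure. NumPy is assumed, and all sizes and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a weight matrix: pure Gaussian noise with variance 1/n,
# whose spectrum should follow the Marchenko-Pastur (MP) law.
n, m = 1000, 400
W = rng.normal(0.0, 1.0 / np.sqrt(n), size=(n, m))

# Empirical eigenvalues of W^T W, i.e. the squared singular values of W.
evals = np.linalg.eigvalsh(W.T @ W)

# MP support for aspect ratio q = m/n (unit variance):
# [(1 - sqrt(q))^2, (1 + sqrt(q))^2].
q = m / n
lo, hi = (1 - np.sqrt(q)) ** 2, (1 + np.sqrt(q)) ** 2

def mp_density(x):
    """Marchenko-Pastur density on its support, zero elsewhere."""
    d = np.zeros_like(x)
    inside = (x > lo) & (x < hi)
    d[inside] = np.sqrt((hi - x[inside]) * (x[inside] - lo)) / (2 * np.pi * q * x[inside])
    return d

# Crude goodness-of-fit: histogram mass vs. the MP density at bin centers.
hist, edges = np.histogram(evals, bins=np.linspace(lo, hi, 30), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
print("max |empirical - MP| per bin :", np.abs(hist - mp_density(centers)).max())
print("eigenvalues outside MP support:", int(((evals < lo) | (evals > hi)).sum()))
```

Running the same comparison on actual trained weight matrices, as the papers above do, is where deviations from the MP prediction become informative.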