DeepNNK: Explaining deep models and their generalization using polytope
interpolation
- URL: http://arxiv.org/abs/2007.10505v1
- Date: Mon, 20 Jul 2020 22:05:24 GMT
- Title: DeepNNK: Explaining deep models and their generalization using polytope
interpolation
- Authors: Sarath Shekkizhar, Antonio Ortega
- Abstract summary: We take a step towards better understanding of neural networks by introducing a local polytope interpolation method.
The proposed Deep Non-Negative Kernel regression (NNK) framework is non-parametric, theoretically simple and geometrically intuitive.
- Score: 42.16401154367232
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern machine learning systems based on neural networks have shown great
success in learning complex data patterns while being able to make good
predictions on unseen data points. However, the limited interpretability of
these systems hinders further progress and application to several domains in
the real world. This predicament is exemplified by time-consuming model
selection and the difficulties faced in predictive explainability, especially
in the presence of adversarial examples. In this paper, we take a step towards
better understanding of neural networks by introducing a local polytope
interpolation method. The proposed Deep Non-Negative Kernel regression (NNK)
interpolation framework is non-parametric, theoretically simple and
geometrically intuitive. We demonstrate instance-based explainability for deep
learning models and develop a method to identify models with good
generalization properties using leave-one-out estimation. Finally, we offer a
rationalization of adversarial and generative examples, which are inevitable
from an interpolation view of machine learning.
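As a rough illustration (not the authors' released code), the NNK interpolation step described in the abstract can be sketched as a non-negative least-squares problem over a query point's nearest training neighbors: the kernel choice, neighborhood size, and fallback behavior below are assumptions made for the sketch.

```python
import numpy as np
from scipy.optimize import nnls


def gaussian_kernel(a, b, sigma=1.0):
    # Pairwise Gaussian kernel between rows of a (n, d) and b (m, d).
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return np.exp(-d ** 2 / (2 * sigma ** 2))


def nnk_interpolate(x_query, X_train, y_train, k=10, sigma=1.0):
    # Select k nearest candidates, then solve the non-negative
    # least-squares problem K_S @ theta ~ k_S; the non-negativity
    # constraint sparsifies the weights to a local polytope of neighbors.
    dists = np.linalg.norm(X_train - x_query, axis=1)
    idx = np.argsort(dists)[:k]
    K_S = gaussian_kernel(X_train[idx], X_train[idx], sigma)
    k_S = gaussian_kernel(x_query[None, :], X_train[idx], sigma).ravel()
    theta, _ = nnls(K_S, k_S)
    if theta.sum() == 0:
        return y_train[idx[0]]  # fall back to the nearest neighbor
    w = theta / theta.sum()  # normalize weights to interpolate labels
    return w @ y_train[idx]
```

A leave-one-out estimate of the kind the abstract mentions could then be obtained by interpolating each training point from the remaining ones and comparing against its true label; that loop is omitted here.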
Related papers
- Nonlinear classification of neural manifolds with contextual information [6.292933471495322]
Manifold capacity has emerged as a promising framework linking population geometry to the separability of neural manifolds.
We propose a theoretical framework that overcomes this limitation by leveraging contextual input information.
Our framework's increased expressivity captures representation untanglement in deep networks at early stages of the layer hierarchy, previously inaccessible to analysis.
arXiv Detail & Related papers (2024-05-10T23:37:31Z) - Deep networks for system identification: a Survey [56.34005280792013]
System identification learns mathematical descriptions of dynamic systems from input-output data.
The main aim of the identified model is to predict new data from previous observations.
We discuss architectures commonly adopted in the literature, like feedforward, convolutional, and recurrent networks.
arXiv Detail & Related papers (2023-01-30T12:38:31Z) - Reliable extrapolation of deep neural operators informed by physics or
sparse observations [2.887258133992338]
Deep neural operators can learn nonlinear mappings between infinite-dimensional function spaces via deep neural networks.
DeepONets provide a new simulation paradigm in science and engineering.
We propose five reliable learning methods that guarantee a safe prediction under extrapolation.
arXiv Detail & Related papers (2022-12-13T03:02:46Z) - Learning Low Dimensional State Spaces with Overparameterized Recurrent
Neural Nets [57.06026574261203]
We provide theoretical evidence for learning low-dimensional state spaces, which can also model long-term memory.
Experiments corroborate our theory, demonstrating extrapolation via learning low-dimensional state spaces with both linear and non-linear RNNs.
arXiv Detail & Related papers (2022-10-25T14:45:15Z) - Gaussian Process Surrogate Models for Neural Networks [6.8304779077042515]
In science and engineering, modeling is a methodology used to understand complex systems whose internal processes are opaque.
We construct a class of surrogate models for neural networks using Gaussian processes.
We demonstrate our approach captures existing phenomena related to the spectral bias of neural networks, and then show that our surrogate models can be used to solve practical problems.
arXiv Detail & Related papers (2022-08-11T20:17:02Z) - An Information-Theoretic Framework for Supervised Learning [22.280001450122175]
We propose a novel information-theoretic framework with its own notions of regret and sample complexity.
We study the sample complexity of learning from data generated by deep neural networks with ReLU activation units.
We conclude by corroborating our theoretical results with experimental analysis of random single-hidden-layer neural networks.
arXiv Detail & Related papers (2022-03-01T05:58:28Z) - A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z) - Anomaly Detection on Attributed Networks via Contrastive Self-Supervised
Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks.
Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair.
A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
arXiv Detail & Related papers (2021-02-27T03:17:20Z) - Model-Based Deep Learning [155.063817656602]
Signal processing, communications, and control have traditionally relied on classical statistical modeling techniques.
Deep neural networks (DNNs) use generic architectures which learn to operate from data, and demonstrate excellent performance.
We are interested in hybrid techniques that combine principled mathematical models with data-driven systems to benefit from the advantages of both approaches.
arXiv Detail & Related papers (2020-12-15T16:29:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.