DeepNNK: Explaining deep models and their generalization using polytope
interpolation
- URL: http://arxiv.org/abs/2007.10505v1
- Date: Mon, 20 Jul 2020 22:05:24 GMT
- Title: DeepNNK: Explaining deep models and their generalization using polytope
interpolation
- Authors: Sarath Shekkizhar, Antonio Ortega
- Abstract summary: We take a step towards better understanding of neural networks by introducing a local polytope interpolation method.
The proposed Deep Non-Negative Kernel regression (NNK) framework is non-parametric, theoretically simple and geometrically intuitive.
- Score: 42.16401154367232
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern machine learning systems based on neural networks have shown great
success in learning complex data patterns while being able to make good
predictions on unseen data points. However, the limited interpretability of
these systems hinders further progress and application to several domains in
the real world. This predicament is exemplified by time-consuming model
selection and the difficulties faced in predictive explainability, especially
in the presence of adversarial examples. In this paper, we take a step towards
better understanding of neural networks by introducing a local polytope
interpolation method. The proposed Deep Non-Negative Kernel regression (NNK)
interpolation framework is non-parametric, theoretically simple and
geometrically intuitive. We demonstrate instance-based explainability for deep
learning models and develop a method to identify models with good
generalization properties using leave-one-out estimation. Finally, we offer a
rationalization of adversarial and generative examples, which are inevitable
from an interpolation view of machine learning.
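As a rough illustration (not the authors' released code), the NNK interpolation step described in the abstract can be sketched as a non-negative least-squares problem over a query point's nearest training neighbors: the kernel choice, neighborhood size, and fallback behavior below are assumptions made for the sketch.

```python
import numpy as np
from scipy.optimize import nnls


def gaussian_kernel(a, b, sigma=1.0):
    # Pairwise Gaussian kernel between rows of a (n, d) and b (m, d).
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return np.exp(-d ** 2 / (2 * sigma ** 2))


def nnk_interpolate(x_query, X_train, y_train, k=10, sigma=1.0):
    # Select k nearest candidates, then solve the non-negative
    # least-squares problem K_S @ theta ~ k_S; the non-negativity
    # constraint sparsifies the weights to a local polytope of neighbors.
    dists = np.linalg.norm(X_train - x_query, axis=1)
    idx = np.argsort(dists)[:k]
    K_S = gaussian_kernel(X_train[idx], X_train[idx], sigma)
    k_S = gaussian_kernel(x_query[None, :], X_train[idx], sigma).ravel()
    theta, _ = nnls(K_S, k_S)
    if theta.sum() == 0:
        return y_train[idx[0]]  # fall back to the nearest neighbor
    w = theta / theta.sum()  # normalize weights to interpolate labels
    return w @ y_train[idx]
```

A leave-one-out estimate of the kind the abstract mentions could then be obtained by interpolating each training point from the remaining ones and comparing against its true label; that loop is omitted here.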
Related papers
- Nonlinear classification of neural manifolds with contextual information [6.292933471495322]
Manifold capacity has emerged as a promising framework linking population geometry to the separability of neural manifolds.
We propose a theoretical framework that overcomes this limitation by leveraging contextual input information.
Our framework's increased expressivity captures representation untanglement in deep networks at early stages of the layer hierarchy, previously inaccessible to analysis.
arXiv Detail & Related papers (2024-05-10T23:37:31Z) - Deep networks for system identification: a Survey [56.34005280792013]
System identification learns mathematical descriptions of dynamic systems from input-output data.
The main aim of the identified model is to predict new data from previous observations.
We discuss architectures commonly adopted in the literature, like feedforward, convolutional, and recurrent networks.
arXiv Detail & Related papers (2023-01-30T12:38:31Z) - Reliable extrapolation of deep neural operators informed by physics or
sparse observations [2.887258133992338]
Deep neural operators can learn nonlinear mappings between infinite-dimensional function spaces via deep neural networks.
DeepONets provide a new simulation paradigm in science and engineering.
We propose five reliable learning methods that guarantee a safe prediction under extrapolation.
arXiv Detail & Related papers (2022-12-13T03:02:46Z) - Learning Low Dimensional State Spaces with Overparameterized Recurrent
Neural Nets [57.06026574261203]
We provide theoretical evidence for learning low-dimensional state spaces, which can also model long-term memory.
Experiments corroborate our theory, demonstrating extrapolation via learning low-dimensional state spaces with both linear and non-linear RNNs.
arXiv Detail & Related papers (2022-10-25T14:45:15Z) - Gaussian Process Surrogate Models for Neural Networks [6.8304779077042515]
In science and engineering, modeling is a methodology used to understand complex systems whose internal processes are opaque.
We construct a class of surrogate models for neural networks using Gaussian processes.
We demonstrate our approach captures existing phenomena related to the spectral bias of neural networks, and then show that our surrogate models can be used to solve practical problems.
arXiv Detail & Related papers (2022-08-11T20:17:02Z) - An Information-Theoretic Framework for Supervised Learning [22.280001450122175]
We propose a novel information-theoretic framework with its own notions of regret and sample complexity.
We study the sample complexity of learning from data generated by deep neural networks with ReLU activation units.
We conclude by corroborating our theoretical results with experimental analysis of random single-hidden-layer neural networks.
arXiv Detail & Related papers (2022-03-01T05:58:28Z) - A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z) - Anomaly Detection on Attributed Networks via Contrastive Self-Supervised
Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks.
Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair.
A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
arXiv Detail & Related papers (2021-02-27T03:17:20Z) - Model-Based Deep Learning [155.063817656602]
Signal processing, communications, and control have traditionally relied on classical statistical modeling techniques.
Deep neural networks (DNNs) use generic architectures which learn to operate from data, and demonstrate excellent performance.
We are interested in hybrid techniques that combine principled mathematical models with data-driven systems to benefit from the advantages of both approaches.
arXiv Detail & Related papers (2020-12-15T16:29:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.