SPINE: Soft Piecewise Interpretable Neural Equations
- URL: http://arxiv.org/abs/2111.10622v1
- Date: Sat, 20 Nov 2021 16:18:00 GMT
- Title: SPINE: Soft Piecewise Interpretable Neural Equations
- Authors: Jasdeep Singh Grover, Harsh Minesh Domadia, Raj Anant Tapase, and Grishma Sharma
- Abstract summary: Fully connected networks are ubiquitous but uninterpretable.
This paper takes a novel approach to piecewise fits by using set operations on individual pieces (parts).
It can find a variety of applications where fully connected layers must be replaced by interpretable layers.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: ReLU fully connected networks are ubiquitous but uninterpretable because they
fit piecewise linear functions emerging from multi-layered structures and
complex interactions of model weights. This paper takes a novel approach to
piecewise fits by using set operations on individual pieces (parts). This is
done by approximating canonical normal forms and using the resultant as a
model. This gives special advantages: (a) strong correspondence of
parameters to pieces of the fitted function (High Interpretability); (b) the ability to
fit any combination of continuous functions as pieces of the piecewise
function (Ease of Design); (c) the ability to add new non-linearities in a targeted
region of the domain (Targeted Learning); (d) the simplicity of an equation that
avoids layering. It can also be expressed in the general max-min representation
of piecewise linear functions, which lends it theoretical ease and credibility.
This architecture is tested on simulated regression and classification tasks
and on benchmark datasets including UCI datasets, MNIST, FMNIST, and CIFAR-10.
Its performance is on par with that of fully connected architectures. It can find a
variety of applications where fully connected layers must be replaced by
interpretable layers.
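The max-min representation the abstract refers to can be made concrete with a small sketch. The code below is not the paper's SPINE layer; it is a minimal PyTorch illustration of the standard result that any continuous piecewise linear function can be written as f(x) = max_i min_{j in S_i} (a_j . x + b_j), with logsumexp-based soft max/min standing in for the paper's soft set operations. The class name SoftMaxMinPLF, the grouping scheme, and the temperature tau are illustrative assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

class SoftMaxMinPLF(nn.Module):
    """Soft max-min representation of a piecewise linear function:
        f(x) ~ softmax_i softmin_{j in group i} (a_j . x + b_j)
    Smooth max/min are implemented with logsumexp; as tau -> 0 this
    approaches the hard max-min lattice representation.
    (Hypothetical sketch; names and grouping are not from the paper.)"""
    def __init__(self, in_dim, n_groups=4, n_pieces=4, tau=0.1):
        super().__init__()
        # One affine piece per (group, piece) slot; each parameter row
        # corresponds directly to one piece of the fit, which is the
        # source of the interpretability claim.
        self.affine = nn.Linear(in_dim, n_groups * n_pieces)
        self.n_groups, self.n_pieces, self.tau = n_groups, n_pieces, tau

    def forward(self, x):
        z = self.affine(x).view(-1, self.n_groups, self.n_pieces)
        # Smooth min over the pieces within each group.
        soft_min = -self.tau * torch.logsumexp(-z / self.tau, dim=-1)
        # Smooth max over the groups.
        return self.tau * torch.logsumexp(soft_min / self.tau, dim=-1)

# Usage: fit |x| on [-1, 1]; |x| = max(x, -x) needs only two affine pieces.
model = SoftMaxMinPLF(in_dim=1, n_groups=2, n_pieces=1)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
x = torch.linspace(-1, 1, 256).unsqueeze(1)
y = x.abs().squeeze(1)
for _ in range(500):
    opt.zero_grad()
    loss = ((model(x) - y) ** 2).mean()
    loss.backward()
    opt.step()
```

As tau shrinks, the smooth operations approach the hard max-min lattice, recovering an exact piecewise linear fit while keeping the model differentiable during training.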
Related papers
- ReLU Neural Networks with Linear Layers are Biased Towards Single- and Multi-Index Models [9.96121040675476]
This manuscript explores how the properties of functions learned by neural networks more than two layers deep affect their predictions.
Our framework considers a family of networks of varying depths that all have the same capacity but different representation costs.
arXiv Detail & Related papers (2023-05-24T22:10:12Z) - Equivariant Architectures for Learning in Deep Weight Spaces [54.61765488960555]
We present a novel network architecture for learning in deep weight spaces.
It takes as input a concatenation of the weights and biases of a pre-trained MLP.
We show how these layers can be implemented using three basic operations.
arXiv Detail & Related papers (2023-01-30T10:50:33Z) - Equivariance with Learned Canonicalization Functions [77.32483958400282]
We show that learning a small neural network to perform canonicalization is better than using predefined heuristics.
Our experiments show that learning the canonicalization function is competitive with existing techniques for learning equivariant functions across many tasks.
arXiv Detail & Related papers (2022-11-11T21:58:15Z) - Functional Indirection Neural Estimator for Better Out-of-distribution Generalization [27.291114360472243]
FINE (Functional Indirection Neural Estimator) learns to compose functions that map data input to output on the fly.
We train FINE and competing models on IQ tasks using images from the MNIST, Omniglot and CIFAR100 datasets.
FINE not only achieves the best performance on all tasks but also adapts to small-scale data scenarios.
arXiv Detail & Related papers (2022-10-23T14:43:02Z) - Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer while using fewer parameters, and transfer to new tasks in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z) - Understanding Dynamics of Nonlinear Representation Learning and Its Application [12.697842097171119]
We study the dynamics of implicit nonlinear representation learning.
We show that the data-architecture alignment condition is sufficient for global convergence.
We derive a new training framework, which satisfies the data-architecture alignment condition without assuming it.
arXiv Detail & Related papers (2021-06-28T16:31:30Z) - Compressing Deep ODE-Nets using Basis Function Expansions [105.05435207079759]
We consider formulations of the weights as continuous-depth functions using linear combinations of basis functions.
This perspective allows us to compress the weights through a change of basis, without retraining, while maintaining near state-of-the-art performance.
In turn, both inference time and the memory footprint are reduced, enabling quick and rigorous adaptation between computational environments.
arXiv Detail & Related papers (2021-06-21T03:04:51Z) - Linear Iterative Feature Embedding: An Ensemble Framework for Interpretable Model [6.383006473302968]
A new ensemble framework for interpretable model called Linear Iterative Feature Embedding (LIFE) has been developed.
LIFE is able to fit a wide single-hidden-layer neural network (NN) accurately in three steps.
LIFE consistently outperforms directly trained single-hidden-layer NNs and also outperforms many other benchmark models.
arXiv Detail & Related papers (2021-03-18T02:01:17Z) - Learning Aggregation Functions [78.47770735205134]
We introduce LAF (Learning Aggregation Functions), a learnable aggregator for sets of arbitrary cardinality.
We report experiments on semi-synthetic and real data showing that LAF outperforms state-of-the-art sum- (max-) decomposition architectures.
arXiv Detail & Related papers (2020-12-15T18:28:53Z) - Non-Euclidean Universal Approximation [4.18804572788063]
Modifications to a neural network's input and output layers are often required to accommodate the specifics of most practical learning tasks.
We present general conditions describing feature and readout maps that preserve an architecture's ability to uniformly approximate any continuous function on compact sets.
arXiv Detail & Related papers (2020-06-03T15:38:57Z) - Evolving Normalization-Activation Layers [100.82879448303805]
We develop efficient rejection protocols to quickly filter out candidate layers that do not work well.
Our method leads to the discovery of EvoNorms, a set of new normalization-activation layers with novel, and sometimes surprising structures.
Our experiments show that EvoNorms work well on image classification models including ResNets, MobileNets and EfficientNets.
arXiv Detail & Related papers (2020-04-06T19:52:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.