Exploring explicit coarse-grained structure in artificial neural
networks
- URL: http://arxiv.org/abs/2211.01779v2
- Date: Fri, 4 Nov 2022 07:33:57 GMT
- Title: Exploring explicit coarse-grained structure in artificial neural
networks
- Authors: Xi-Ci Yang, Z. Y. Xie, Xiao-Tao Yang
- Abstract summary: We propose to explicitly employ hierarchical coarse-grained structure in artificial neural networks to improve interpretability without degrading performance.
One is a neural network called TaylorNet, which aims to approximate the general mapping from input data to output directly in terms of a Taylor series.
The other is a new setup for data distillation, which can perform multi-level abstraction of the input dataset and generate new data.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose to explicitly employ hierarchical coarse-grained
structure in artificial neural networks to improve interpretability without
degrading performance. The idea has been applied in two situations. One is a
neural network called TaylorNet, which aims to approximate the general mapping
from input data to output directly in terms of a Taylor series, without
resorting to any magic nonlinear activations. The other is a new setup for data
distillation, which can perform multi-level abstraction of the input dataset
and generate new data that possesses the relevant features of the original
dataset and can be used as references for classification. In both cases, the
coarse-grained structure plays an important role in simplifying the network and
improving both interpretability and efficiency. The validity of the approach
has been demonstrated on the MNIST and CIFAR-10 datasets. Further improvements
and some related open questions are also discussed.
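The abstract does not spell out the architecture, so the following is only a minimal sketch of the TaylorNet idea: fit the input-to-output mapping with learnable polynomial terms and no nonlinear activations. The class name, the elementwise-power truncation of the full symmetric tensor terms, and all sizes are assumptions.

```python
import torch
import torch.nn as nn

class TaylorNet(nn.Module):
    """Sketch: approximate y = f(x) by a truncated Taylor series
    y ~ W0 + W1*x + W2*x^2 + ... with no nonlinear activations.
    Elementwise powers stand in for the full k-th order tensors."""
    def __init__(self, in_dim, out_dim, order=2):
        super().__init__()
        self.bias = nn.Parameter(torch.zeros(out_dim))  # zeroth-order term
        # one linear map per expansion order, applied to x**k
        self.terms = nn.ModuleList(
            nn.Linear(in_dim, out_dim, bias=False) for _ in range(order)
        )

    def forward(self, x):
        y = self.bias
        for k, lin in enumerate(self.terms, start=1):
            y = y + lin(x ** k)  # k-th order contribution
        return y

net = TaylorNet(in_dim=784, out_dim=10, order=3)
out = net(torch.randn(32, 784))  # e.g. a batch of flattened MNIST digits
print(out.shape)                 # torch.Size([32, 10])
```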
Related papers
- NIDS Neural Networks Using Sliding Time Window Data Processing with Trainable Activations and its Generalization Capability
This paper presents neural networks for network intrusion detection systems (NIDS) that operate on flow data preprocessed with a time window.
It requires only eleven features which do not rely on deep packet inspection and can be found in most NIDS datasets and easily obtained from conventional flow collectors.
The reported training accuracy exceeds 99% for the proposed method with as few as twenty input features.
arXiv Detail & Related papers (2024-10-24T11:36:19Z)
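The flow preprocessing described above can be pictured with a short sketch. The window length and the mean-only aggregation are illustrative assumptions, not the paper's exact pipeline:

```python
import numpy as np

def sliding_window_features(flows, window=10):
    """Aggregate per-flow feature vectors over a sliding time window.

    flows: (T, F) array, one F-dimensional feature vector per time
    step. Returns (T - window + 1, F) windowed means; a real NIDS
    pipeline might also stack min/max/std statistics per window.
    """
    T, F = flows.shape
    out = np.empty((T - window + 1, F))
    for t in range(T - window + 1):
        out[t] = flows[t:t + window].mean(axis=0)
    return out

flows = np.random.rand(100, 11)  # eleven flow features, as in the paper
X = sliding_window_features(flows, window=10)
print(X.shape)                   # (91, 11)
```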
- Linear Mode Connectivity in Sparse Neural Networks
We study how neural network pruning with synthetic data leads to sparse networks with unique training properties.
We find that these properties lead to synthetic networks matching the performance of traditional IMP (iterative magnitude pruning) with up to 150x fewer training points in settings where distilled data applies.
arXiv Detail & Related papers (2023-10-28T17:51:39Z)
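For context on the IMP baseline mentioned above, here is a simplified sketch of iterative magnitude pruning: train, prune the smallest surviving weights, repeat. The per-round sparsity and the absence of weight rewinding are simplifications:

```python
import torch

def iterative_magnitude_pruning(model, train_fn, prune_frac=0.2, rounds=5):
    """Simplified IMP: after each training round, zero out the
    smallest-magnitude fraction of the surviving weights.
    (Full IMP also rewinds weights to an early checkpoint.)"""
    masks = {n: torch.ones_like(p) for n, p in model.named_parameters()}
    for _ in range(rounds):
        train_fn(model, masks)  # caller-supplied training loop
        for name, p in model.named_parameters():
            alive = p[masks[name].bool()].abs()
            if alive.numel() == 0:
                continue
            threshold = alive.quantile(prune_frac)
            masks[name] *= (p.abs() > threshold).float()
            p.data *= masks[name]  # silence the pruned weights
        # here one could test linear mode connectivity between solutions
    return masks
```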
- Seeking Interpretability and Explainability in Binary Activated Neural Networks
We study the use of binary activated neural networks as interpretable and explainable predictors in the context of regression tasks.
We present an approach based on the efficient computation of SHAP values for quantifying the relative importance of the features, hidden neurons, and even weights.
arXiv Detail & Related papers (2022-09-07T20:11:17Z)
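The paper derives an efficient SHAP computation specific to binary activated networks; as a generic point of comparison only, a (much slower) Monte Carlo Shapley estimate for a single feature looks like this:

```python
import numpy as np

def shapley_estimate(predict, x, baseline, feature, n_samples=200, seed=0):
    """Monte Carlo Shapley value of one input feature.

    predict: callable mapping a (d,) vector to a scalar output;
    baseline: reference values used for "absent" features.
    Generic estimator, not the paper's efficient scheme.
    """
    rng = np.random.default_rng(seed)
    d = x.shape[0]
    total = 0.0
    for _ in range(n_samples):
        order = rng.permutation(d)
        pos = int(np.where(order == feature)[0][0])
        present = order[:pos]            # features added before ours
        z_without = baseline.copy()
        z_without[present] = x[present]
        z_with = z_without.copy()
        z_with[feature] = x[feature]     # now add the feature of interest
        total += predict(z_with) - predict(z_without)
    return total / n_samples
```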
- Origami in N dimensions: How feed-forward networks manufacture linear separability
We show that a feed-forward architecture has one primary tool at hand to achieve separability: progressive folding of the data manifold in unoccupied higher dimensions.
We argue that an alternative method based on shear, requiring very deep architectures, plays only a small role in real-world networks.
Based on this mechanistic insight, we predict that the progressive generation of separability is necessarily accompanied by neurons showing mixed selectivity and bimodal tuning curves.
arXiv Detail & Related papers (2022-03-21T21:33:55Z)
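The folding mechanism has a one-dimensional toy version that may help: two interleaved classes on a line are not linearly separable, but one layer of ReLUs folds the line and makes them so. All numbers below are illustrative:

```python
import numpy as np

# Class A at {-2, +2}, class B at {0}: not linearly separable in 1-D.
x = np.array([-2.0, 0.0, 2.0])
labels = np.array([1, 0, 1])

# One hidden layer "fold": h(x) = relu(x) + relu(-x) = |x|
# maps the negative half-line onto the positive one.
h = np.maximum(x, 0) + np.maximum(-x, 0)
print(h)                              # [2. 0. 2.]
print((h > 1).astype(int) == labels)  # [ True  True  True ] -- separable now
```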
- Data-driven emergence of convolutional structure in neural networks
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Optimization-Based Separations for Neural Networks
We show that gradient descent can efficiently learn ball indicator functions using a depth-2 neural network with two layers of sigmoidal activations.
This is the first optimization-based separation result where the approximation benefits of the stronger architecture provably manifest in practice.
arXiv Detail & Related papers (2021-12-04T18:07:47Z)
- ReduNet: A White-box Deep Network from the Principle of Maximizing Rate Reduction
This work attempts to provide a plausible theoretical framework that aims to interpret modern deep (convolutional) networks from the principles of data compression and discriminative representation.
We show that for high-dimensional multi-class data, the optimal linear discriminative representation maximizes the coding rate difference between the whole dataset and the average of all the subsets.
We show that the basic iterative gradient ascent scheme for optimizing the rate reduction objective naturally leads to a multi-layer deep network, named ReduNet, that shares common characteristics of modern deep networks.
arXiv Detail & Related papers (2021-05-21T16:29:57Z)
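The rate reduction objective has a closed form in the MCR^2 framework that ReduNet builds on. A small numpy sketch, where epsilon and the dimensions are illustrative choices:

```python
import numpy as np

def coding_rate(Z, eps=0.5):
    """R(Z) = 1/2 * logdet(I + d/(n*eps^2) * Z @ Z.T), Z of shape (d, n)."""
    d, n = Z.shape
    return 0.5 * np.linalg.slogdet(np.eye(d) + d / (n * eps**2) * Z @ Z.T)[1]

def rate_reduction(Z, labels, eps=0.5):
    """Delta R: rate of the whole dataset minus the size-weighted
    average rate of the per-class subsets."""
    n = Z.shape[1]
    r_subsets = sum(
        (np.sum(labels == c) / n) * coding_rate(Z[:, labels == c], eps)
        for c in np.unique(labels)
    )
    return coding_rate(Z, eps) - r_subsets

Z = np.random.randn(8, 100)             # 8-dim features, 100 samples
y = np.random.randint(0, 10, size=100)  # 10 classes
print(rate_reduction(Z, y))             # larger when classes are spread apart
```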
- Rank-R FNN: A Tensor-Based Learning Model for High-Order Data Classification
Rank-R Feedforward Neural Network (FNN) is a tensor-based nonlinear learning model that imposes Canonical/Polyadic decomposition on its parameters.
First, it handles inputs as multilinear arrays, bypassing the need for vectorization, and can thus fully exploit the structural information along every data dimension.
We establish the universal approximation and learnability properties of Rank-R FNN, and we validate its performance on real-world hyperspectral datasets.
arXiv Detail & Related papers (2021-04-11T16:37:32Z)
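A generic CP-decomposed layer for a 3-way input can be sketched as follows. The mode sizes, the rank, and the missing classifier head are assumptions; this is the general construction, not the paper's exact model:

```python
import torch
import torch.nn as nn

class CPLayer(nn.Module):
    """Project a 3-way input onto R rank-1 components a_r (x) b_r (x) c_r
    instead of vectorizing it, so each data dimension keeps its own factor."""
    def __init__(self, dims, rank):
        super().__init__()
        I, J, K = dims
        self.A = nn.Parameter(torch.randn(I, rank) * 0.1)
        self.B = nn.Parameter(torch.randn(J, rank) * 0.1)
        self.C = nn.Parameter(torch.randn(K, rank) * 0.1)

    def forward(self, x):  # x: (batch, I, J, K)
        t = torch.einsum('bijk,ir->brjk', x, self.A)  # contract mode 1
        t = torch.einsum('brjk,jr->brk', t, self.B)   # contract mode 2
        return torch.einsum('brk,kr->br', t, self.C)  # contract mode 3

layer = CPLayer(dims=(9, 9, 16), rank=8)  # e.g. a 9x9x16 hyperspectral patch
scores = layer(torch.randn(4, 9, 9, 16))
print(scores.shape)                       # torch.Size([4, 8])
```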
- Category-Learning with Context-Augmented Autoencoder
Finding an interpretable non-redundant representation of real-world data is one of the key problems in Machine Learning.
We propose a novel method of using data augmentations when training autoencoders.
We train a Variational Autoencoder in such a way that the outcome of a transformation is predictable by an auxiliary network.
arXiv Detail & Related papers (2020-10-10T14:04:44Z)
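One way to read that setup, purely as a guess at the architecture (the paper's exact networks and losses may differ, and a plain autoencoder stands in for the VAE here): train reconstruction jointly with an auxiliary head that must recover which augmentation links two latent codes:

```python
import torch
import torch.nn as nn

enc = nn.Sequential(nn.Flatten(), nn.Linear(784, 32))  # toy encoder
dec = nn.Linear(32, 784)                               # toy decoder
aux = nn.Linear(64, 4)  # predicts which of 4 augmentations was applied

def augment(img, k):
    # stand-in "context": rotate the 28x28 plane by k * 90 degrees
    return torch.rot90(img, k=int(k), dims=(1, 2))

x = torch.randn(16, 1, 28, 28)
aug_id = torch.randint(0, 4, (16,))
x_aug = torch.stack([augment(xi, k) for xi, k in zip(x, aug_id)])

z, z_aug = enc(x), enc(x_aug)
recon_loss = nn.functional.mse_loss(dec(z), x.flatten(1))
# the auxiliary head sees both codes and must recover the transformation,
# pushing the latent space to represent augmentation outcomes predictably
aux_loss = nn.functional.cross_entropy(aux(torch.cat([z, z_aug], dim=1)), aug_id)
(recon_loss + aux_loss).backward()
```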
- Dual-constrained Deep Semi-Supervised Coupled Factorization Network with Enriched Prior
We propose a new enriched-prior-based Dual-constrained Deep Semi-Supervised Coupled Factorization Network, called DS2CF-Net.
To extract hidden deep features, DS2CF-Net is modeled as a deep-structure and geometrical structure-constrained neural network.
Our network can obtain state-of-the-art performance for representation learning and clustering.
arXiv Detail & Related papers (2020-09-08T13:10:21Z)
- Self-Challenging Improves Cross-Domain Generalization
Convolutional Neural Networks (CNNs) conduct image classification by activating dominant features that correlate with labels.
We introduce a simple training heuristic, Representation Self-Challenging (RSC), that significantly improves the generalization of CNNs to out-of-domain data.
RSC iteratively challenges the dominant features activated on the training data and forces the network to activate the remaining features that correlate with labels.
arXiv Detail & Related papers (2020-07-05T21:42:26Z)
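The challenging step itself is simple to sketch. Below, the dropped percentile and the elementwise (rather than spatial- or channel-wise) muting are choices made for illustration, not necessarily the paper's configuration:

```python
import torch

def rsc_challenge(features, logits, labels, drop_pct=33.0):
    """Zero out the feature entries with the largest gradient of the
    true-class score, so the remaining features must carry the label.
    `features` must be part of the graph that produced `logits`."""
    score = logits.gather(1, labels[:, None]).sum()
    grads = torch.autograd.grad(score, features, retain_graph=True)[0]
    thresh = torch.quantile(grads.flatten(1), 1 - drop_pct / 100, dim=1)
    keep = grads <= thresh.view(-1, *[1] * (grads.dim() - 1))
    return features * keep.float()  # re-run the classifier head on this
```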
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.