How and what to learn: The modes of machine learning
- URL: http://arxiv.org/abs/2202.13829v1
- Date: Mon, 28 Feb 2022 14:39:06 GMT
- Title: How and what to learn: The modes of machine learning
- Authors: Sihan Feng, Yong Zhang, Fuming Wang, Hong Zhao
- Abstract summary: We propose a new approach, namely the weight pathway analysis (WPA), to study the mechanism of multilayer neural networks.
WPA shows that a neural network stores and utilizes information in a "holographic" way, that is, the network encodes all training samples in a coherent structure.
It is found that hidden-layer neurons self-organize into different classes in the later stages of the learning process.
- Score: 7.085027463060304
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a new approach, namely the weight pathway analysis (WPA), to
study the mechanism of multilayer neural networks. The weight pathways linking
neurons longitudinally from input neurons to output neurons are considered as
the basic units of a neural network. We decompose a neural network into a
series of subnetworks of weight pathways, and establish characteristic maps for
these subnetworks. The parameters of a characteristic map can be visualized,
providing a longitudinal perspective of the network and making the neural
network explainable. Using WPA, we discover that a neural network stores and
utilizes information in a "holographic" way, that is, the network encodes all
training samples in a coherent structure. An input vector interacts with this
"holographic" structure to enhance or suppress each subnetwork which working
together to produce the correct activities in the output neurons to recognize
the input sample. Furthermore, with WPA, we reveal fundamental learning modes
of a neural network: the linear learning mode and the nonlinear learning mode.
The former extracts linearly separable features while the latter extracts
linearly inseparable features. It is found that hidden-layer neurons
self-organize into different classes in the later stages of the learning
process. It is further discovered that the key strategy to improve the
performance of a neural network is to control the ratio of the two learning
modes to match that of the linear and the nonlinear features, and that
increasing the width or the depth of a neural network helps control this
ratio. This provides theoretical ground for the practice of
optimizing a neural network via increasing its width or its depth. The
knowledge gained with WPA enables us to understand the fundamental questions
such as what to learn, how to learn, and how to learn well.
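To make the pathway decomposition concrete, here is a minimal sketch of one plausible reading of WPA for a one-hidden-layer MLP: the characteristic map of an output neuron is taken to be the summed weight products along every input-to-output pathway, and each hidden neuron defines one pathway subnetwork. The network sizes, variable names, and the simplification of ignoring the hidden-layer nonlinearity are assumptions for illustration, not the paper's exact definitions.

```python
import numpy as np

# Illustrative weight pathway analysis (WPA) for a one-hidden-layer MLP.
# Assumed sizes and random weights stand in for a trained network.
rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 784, 128, 10
W1 = rng.normal(scale=0.05, size=(n_hidden, n_in))   # input -> hidden
W2 = rng.normal(scale=0.05, size=(n_out, n_hidden))  # hidden -> output

def characteristic_maps(W1, W2):
    """One map per output neuron: the summed products of weights along every
    pathway from each input neuron, through the hidden layer, to that output."""
    return W2 @ W1                                    # shape (n_out, n_in)

def pathway_subnetwork(W1, W2, h):
    """Subnetwork of all pathways routed through a single hidden neuron h."""
    return np.outer(W2[:, h], W1[h, :])               # shape (n_out, n_in)

maps = characteristic_maps(W1, W2)
print(maps.shape)   # (10, 784); each row can be reshaped (e.g. 28x28) and visualized

# How an input interacts with the pathway structure (pre-activation view,
# ignoring the hidden nonlinearity): each subnetwork's contribution is
# enhanced or suppressed depending on the input's alignment with its weights.
x = rng.random(n_in)
contrib = np.stack([pathway_subnetwork(W1, W2, h) @ x for h in range(n_hidden)])
print(np.allclose(contrib.sum(axis=0), maps @ x))     # True: contributions add up
```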
Related papers
- Unsupervised representation learning with Hebbian synaptic and structural plasticity in brain-like feedforward neural networks [0.0]
We introduce and evaluate a brain-like neural network model capable of unsupervised representation learning.
The model was tested on a diverse set of popular machine learning benchmarks.
arXiv Detail & Related papers (2024-06-07T08:32:30Z) - Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z) - Hebbian Learning based Orthogonal Projection for Continual Learning of
Spiking Neural Networks [74.3099028063756]
We develop a new method with neuronal operations based on lateral connections and Hebbian learning.
We show that Hebbian and anti-Hebbian learning on recurrent lateral connections can effectively extract the principal subspace of neural activities.
Our method consistently solves continual learning for spiking neural networks with nearly zero forgetting.
arXiv Detail & Related papers (2024-02-19T09:29:37Z) - A Sparse Quantized Hopfield Network for Online-Continual Memory [0.0]
Nervous systems learn online, where a stream of noisy data points is presented in a non-independent, identically distributed (non-i.i.d.) way.
Deep networks, on the other hand, typically use non-local learning algorithms and are trained in an offline, non-noisy, i.i.d. setting.
We implement this kind of model in a novel neural network called the Sparse Quantized Hopfield Network (SQHN).
arXiv Detail & Related papers (2023-07-27T17:46:17Z) - When Deep Learning Meets Polyhedral Theory: A Survey [6.899761345257773]
In the past decade, deep learning became the prevalent methodology for predictive modeling thanks to the remarkable accuracy of deep neural networks.
Meanwhile, the structure of neural networks converged back to simpler piecewise linear functions.
arXiv Detail & Related papers (2023-04-29T11:46:53Z) - Spiking neural network for nonlinear regression [68.8204255655161]
Spiking neural networks carry the potential for a massive reduction in memory and energy consumption.
They introduce temporal and neuronal sparsity, which can be exploited by next-generation neuromorphic hardware.
A framework for regression using spiking neural networks is proposed.
arXiv Detail & Related papers (2022-10-06T13:04:45Z) - Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z) - A biologically plausible neural network for multi-channel Canonical
Correlation Analysis [12.940770779756482]
Cortical pyramidal neurons receive inputs from multiple neural populations and integrate these inputs in separate dendritic compartments.
We seek a multi-channel CCA algorithm that can be implemented in a biologically plausible neural network.
For biological plausibility, we require that the network operates in the online setting and its synaptic update rules are local.
arXiv Detail & Related papers (2020-10-01T16:17:53Z) - Towards Understanding Hierarchical Learning: Benefits of Neural
Representations [160.33479656108926]
In this work, we demonstrate that intermediate neural representations add more flexibility to neural networks.
We show that neural representation can achieve improved sample complexities compared with the raw input.
Our results characterize when neural representations are beneficial, and may provide a new perspective on why depth is important in deep learning.
arXiv Detail & Related papers (2020-06-24T02:44:54Z) - Neural Rule Ensembles: Encoding Sparse Feature Interactions into Neural
Networks [3.7277730514654555]
We use decision trees to capture relevant features and their interactions and define a mapping to encode extracted relationships into a neural network.
At the same time, through feature selection, it enables learning of compact representations compared to state-of-the-art tree-based approaches.
arXiv Detail & Related papers (2020-02-11T11:22:20Z) - Non-linear Neurons with Human-like Apical Dendrite Activations [81.18416067005538]
We show that a standard neuron followed by our novel apical dendrite activation (ADA) can learn the XOR logical function with 100% accuracy (see the sketch after this list).
We conduct experiments on six benchmark data sets from computer vision, signal processing and natural language processing.
arXiv Detail & Related papers (2020-02-02T21:09:39Z)
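The XOR claim in the apical dendrite entry above can be made concrete. Below is a minimal sketch of why a single neuron with a non-monotonic activation can represent XOR; the Gaussian-bump activation and hand-chosen weights are assumptions for illustration, not the paper's ADA formula or training procedure.

```python
import numpy as np

# Why a non-monotonic activation lets a *single* neuron solve XOR.
# The Gaussian "bump" below is a generic stand-in, NOT the paper's apical
# dendrite activation (ADA); weights are hand-chosen for the illustration.

def bump(z, center=1.0, width=0.25):
    """Non-monotonic activation: near 1 when z is close to `center`, else near 0."""
    return np.exp(-((z - center) ** 2) / (2 * width ** 2))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])  # XOR inputs
y = np.array([0, 1, 1, 0])                      # XOR targets

w, b = np.array([1.0, 1.0]), 0.0                # one neuron: z = w @ x + b
z = X @ w + b                                   # pre-activations: 0, 1, 1, 2
pred = (bump(z) > 0.5).astype(int)              # fires only when z is near 1

print(pred, (pred == y).all())                  # [0 1 1 0] True
# A monotonic activation of w @ x + b can only realize a linearly separable
# decision rule, which XOR is not; non-monotonicity removes that limitation.
```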