Related papers: Automatic Organization of Neural Modules for Enhanced Collaboration in Neural Networks

Automatic Organization of Neural Modules for Enhanced Collaboration in Neural Networks

URL: http://arxiv.org/abs/2005.04088v4
Date: Wed, 01 Jan 2025 11:47:39 GMT
Title: Automatic Organization of Neural Modules for Enhanced Collaboration in Neural Networks
Authors: Xinshun Liu, Yizhi Fang, Yichao Jiang,
Abstract summary: This work proposes a new perspective on the structure of Neural Networks (NNs)<n>Traditional NNs are typically tree-like structures for convenience.<n>We introduce a synchronous graph-based structure to establish a novel way of organizing the neural units: the Neural Modules.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: This work proposes a new perspective on the structure of Neural Networks (NNs). Traditional Neural Networks are typically tree-like structures for convenience, which can be predefined or learned by NAS methods. However, such a structure can not facilitate communications between nodes at the same level or signal transmissions to previous levels. These defects prevent effective collaboration, restricting the capabilities of neural networks. It is well-acknowledged that the biological neural system contains billions of neural units. Their connections are far more complicated than the current NN structure. To enhance the representational ability of neural networks, existing works try to increase the depth of the neural network and introduce more parameters. However, they all have limitations with constrained parameters. In this work, we introduce a synchronous graph-based structure to establish a novel way of organizing the neural units: the Neural Modules. This framework allows any nodes to communicate with each other and encourages neural units to work collectively, demonstrating a departure from the conventional constrained paradigm. Such a structure also provides more candidates for the NAS methods. Furthermore, we also propose an elegant regularization method to organize neural units into multiple independent, balanced neural modules systematically. This would be convenient for handling these neural modules in parallel. Compared to traditional NNs, our method unlocks the potential of NNs from tree-like structures to general graphs and makes NNs be optimized in an almost complete set. Our approach proves adaptable to diverse tasks, offering compatibility across various scenarios. Quantitative experimental results substantiate the potential of our structure, indicating the improvement of NNs.

Related papers

First-Order Manifold Data Augmentation for Regression Learning [4.910937238451485]
We introduce FOMA: a new data-driven domain-independent data augmentation method. We evaluate FOMA on in-distribution generalization and out-of-distribution benchmarks, and we show that it improves the generalization of several neural architectures.
arXiv Detail & Related papers (2024-06-16T12:35:05Z)
Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters. Our approach enables a single model to encode neural computational graphs with diverse architectures. We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
Transforming to Yoked Neural Networks to Improve ANN Structure [0.0]
Most existing artificial neural networks (ANN) are designed as a tree structure to imitate neural networks. We propose a model YNN to efficiently eliminate such structural bias. In our model, nodes also carry out aggregation and transformation of features, and edges determine the flow of information.
arXiv Detail & Related papers (2023-06-03T16:56:18Z)
Biologically inspired structure learning with reverse knowledge distillation for spiking neural networks [19.33517163587031]
Spiking neural networks (SNNs) have superb characteristics in sensory information recognition tasks due to their biological plausibility. The performance of some current spiking-based models is limited by their structures which means either fully connected or too-deep structures bring too much redundancy. This paper proposes an evolutionary-based structure construction method for constructing more reasonable SNNs.
arXiv Detail & Related papers (2023-04-19T08:41:17Z)
SALUDA: Surface-based Automotive Lidar Unsupervised Domain Adaptation [62.889835139583965]
We introduce an unsupervised auxiliary task of learning an implicit underlying surface representation simultaneously on source and target data. As both domains share the same latent representation, the model is forced to accommodate discrepancies between the two sources of data. Our experiments demonstrate that our method achieves a better performance than the current state of the art, both in real-to-real and synthetic-to-real scenarios.
arXiv Detail & Related papers (2023-04-06T17:36:23Z)
Permutation Equivariant Neural Functionals [92.0667671999604]
This work studies the design of neural networks that can process the weights or gradients of other neural networks. We focus on the permutation symmetries that arise in the weights of deep feedforward networks because hidden layer neurons have no inherent order. In our experiments, we find that permutation equivariant neural functionals are effective on a diverse set of tasks.
arXiv Detail & Related papers (2023-02-27T18:52:38Z)
Improving Domain Generalization with Domain Relations [77.63345406973097]
This paper focuses on domain shifts, which occur when the model is applied to new domains that are different from the ones it was trained on. We propose a new approach called D$3$G to learn domain-specific models. Our results show that D$3$G consistently outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-02-06T08:11:16Z)
Multi-Domain Long-Tailed Learning by Augmenting Disentangled Representations [80.76164484820818]
There is an inescapable long-tailed class-imbalance issue in many real-world classification problems. We study this multi-domain long-tailed learning problem and aim to produce a model that generalizes well across all classes and domains. Built upon a proposed selective balanced sampling strategy, TALLY achieves this by mixing the semantic representation of one example with the domain-associated nuisances of another.
arXiv Detail & Related papers (2022-10-25T21:54:26Z)
Extrapolation and Spectral Bias of Neural Nets with Hadamard Product: a Polynomial Net Study [55.12108376616355]
The study on NTK has been devoted to typical neural network architectures, but is incomplete for neural networks with Hadamard products (NNs-Hp) In this work, we derive the finite-width-K formulation for a special class of NNs-Hp, i.e., neural networks. We prove their equivalence to the kernel regression predictor with the associated NTK, which expands the application scope of NTK.
arXiv Detail & Related papers (2022-09-16T06:36:06Z)
The Power and Limitation of Pretraining-Finetuning for Linear Regression under Covariate Shift [127.21287240963859]
We investigate a transfer learning approach with pretraining on the source data and finetuning based on the target data. For a large class of linear regression instances, transfer learning with $O(N2)$ source data is as effective as supervised learning with $N$ target data.
arXiv Detail & Related papers (2022-08-03T05:59:49Z)
Random Graph-Based Neuromorphic Learning with a Layer-Weaken Structure [4.477401614534202]
We transform the random graph theory into an NN model with practical meaning and based on clarifying the input-output relationship of each neuron. Under the usage of this low-operation cost approach, neurons are assigned to several groups of which connection relationships can be regarded as uniform representations of random graphs they belong to. We develop a joint classification mechanism involving information interaction between multiple RGNNs and realize significant performance improvements in supervised learning for three benchmark tasks.
arXiv Detail & Related papers (2021-11-17T03:37:06Z)
Domain-Class Correlation Decomposition for Generalizable Person Re-Identification [34.813965300584776]
In person re-identification, the domain and class are correlated. We show that domain adversarial learning will lose certain information about class due to this domain-class correlation. Our model outperforms the state-of-the-art methods on the large-scale domain generalization Re-ID benchmark.
arXiv Detail & Related papers (2021-06-29T09:45:03Z)
Quantifying and Improving Transferability in Domain Generalization [53.16289325326505]
Out-of-distribution generalization is one of the key challenges when transferring a model from the lab to the real world. We formally define transferability that one can quantify and compute in domain generalization. We propose a new algorithm for learning transferable features and test it over various benchmark datasets.
arXiv Detail & Related papers (2021-06-07T14:04:32Z)
An Improved Transfer Model: Randomized Transferable Machine [32.50263074872975]
We propose a new transfer model called Randomized Transferable Machine (RTM) to handle small divergence of domains. Specifically, we work on the new source and target data learned from existing feature-based transfer methods. In principle, the more corruptions are made, the higher the probability of the new target data can be covered by the constructed source data populations.
arXiv Detail & Related papers (2020-11-27T09:37:01Z)
Exploiting Heterogeneity in Operational Neural Networks by Synaptic Plasticity [87.32169414230822]
Recently proposed network model, Operational Neural Networks (ONNs), can generalize the conventional Convolutional Neural Networks (CNNs) In this study the focus is drawn on searching the best-possible operator set(s) for the hidden neurons of the network based on the Synaptic Plasticity paradigm that poses the essential learning theory in biological neurons. Experimental results over highly challenging problems demonstrate that the elite ONNs even with few neurons and layers can achieve a superior learning performance than GIS-based ONNs.
arXiv Detail & Related papers (2020-08-21T19:03:23Z)
Locality Guided Neural Networks for Explainable Artificial Intelligence [12.435539489388708]
We propose a novel algorithm for back propagation, called Locality Guided Neural Network(LGNN) LGNN preserves locality between neighbouring neurons within each layer of a deep network. In our experiments, we train various VGG and Wide ResNet (WRN) networks for image classification on CIFAR100.
arXiv Detail & Related papers (2020-07-12T23:45:51Z)
Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over- parameterized deep neural networks (DNNs) In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit. We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z)
Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency. We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
Do We Really Need to Access the Source Data? Source Hypothesis Transfer for Unsupervised Domain Adaptation [102.67010690592011]
Unsupervised adaptationUDA (UDA) aims to leverage the knowledge learned from a labeled source dataset to solve similar tasks in a new unlabeled domain. Prior UDA methods typically require to access the source data when learning to adapt the model. This work tackles a practical setting where only a trained source model is available and how we can effectively utilize such a model without source data to solve UDA problems.
arXiv Detail & Related papers (2020-02-20T03:13:58Z)
Neural Rule Ensembles: Encoding Sparse Feature Interactions into Neural Networks [3.7277730514654555]
We use decision trees to capture relevant features and their interactions and define a mapping to encode extracted relationships into a neural network. At the same time through feature selection it enables learning of compact representations compared to state of the art tree-based approaches.
arXiv Detail & Related papers (2020-02-11T11:22:20Z)

This list is automatically generated from the titles and abstracts of the papers in this site.