Investigating Neuron Disturbing in Fusing Heterogeneous Neural Networks
- URL: http://arxiv.org/abs/2210.12974v2
- Date: Sun, 29 Oct 2023 02:56:46 GMT
- Title: Investigating Neuron Disturbing in Fusing Heterogeneous Neural Networks
- Authors: Biao Zhang and Shuqin Zhang
- Abstract summary: In this paper, we reveal the phenomenon of neuron disturbing, where neurons from heterogeneous local models interfere with one another.
We propose an experimental method, called AMS, that excludes neuron disturbing and fuses neural networks by adaptively selecting a local model to execute the prediction.
- Score: 6.389882065284252
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Fusing deep learning models trained on separately located clients into a
global model in a one-shot communication round is a straightforward
implementation of Federated Learning. Although current model fusion methods have
been shown experimentally to be valid for fusing neural networks with almost
identical architectures, they are rarely analyzed theoretically. In this paper,
we reveal the phenomenon of neuron disturbing, in which neurons from
heterogeneous local models interfere with one another. We give detailed
explanations from a Bayesian viewpoint, combining the data heterogeneity among
clients with properties of neural networks. Furthermore, to validate our
findings, we propose an experimental method, called AMS, that excludes neuron
disturbing and fuses neural networks by adaptively selecting a local model to
execute the prediction according to the input. The experiments demonstrate that
AMS is more robust to data heterogeneity than general model fusion and ensemble
methods, which implies the necessity of considering neuron disturbing in model
fusion. In addition, AMS can serve as an experimental algorithm for fusing
models with varying architectures, and we list several possible extensions of
AMS for future work.
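The abstract specifies only that AMS routes each input to a single local model instead of averaging parameters; the selection criterion is not described here. Below is a minimal, hypothetical PyTorch sketch of that routing idea. The gating rule shown (picking the local model with the highest softmax confidence) is an illustrative assumption, not the authors' actual method.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def ams_predict(local_models, x):
    """Hypothetical AMS-style inference: route each input to one local model.

    Instead of fusing parameters (where heterogeneous neurons could disturb
    each other), pick a single local model per input. The gating rule here,
    maximum softmax confidence, is an illustrative assumption. Assumes all
    local models share the same label space.
    """
    # Per-model class probabilities: (n_models, batch, n_classes)
    probs = torch.stack([F.softmax(m(x), dim=-1) for m in local_models])
    # Confidence of each model on each input: (n_models, batch)
    confidence = probs.max(dim=-1).values
    # Index of the selected local model for every input: (batch,)
    selected = confidence.argmax(dim=0)
    batch_idx = torch.arange(x.shape[0], device=x.device)
    return probs[selected, batch_idx]  # (batch, n_classes)
```

Because no parameters are mixed, the models in local_models may have differing architectures, which is consistent with the abstract's claim that AMS can fuse models with varying architectures.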
Related papers
- Diffusion models as probabilistic neural operators for recovering unobserved states of dynamical systems [49.2319247825857]
We show that diffusion-based generative models exhibit many properties favourable for neural operators.
We propose to train a single model adaptable to multiple tasks, by alternating between the tasks during training.
arXiv Detail & Related papers (2024-05-11T21:23:55Z) - Functional Neural Networks: Shift invariant models for functional data
with applications to EEG classification [0.0]
We introduce a new class of neural networks that are shift invariant and preserve the smoothness of the data: functional neural networks (FNNs).
For this, we use methods from functional data analysis (FDA) to extend multi-layer perceptrons and convolutional neural networks to functional data.
We show that the models outperform a benchmark model from FDA in terms of accuracy and successfully use FNNs to classify electroencephalography (EEG) data.
arXiv Detail & Related papers (2023-01-14T09:41:21Z) - Understanding Neural Coding on Latent Manifolds by Sharing Features and
Dividing Ensembles [3.625425081454343]
Systems neuroscience relies on two complementary views of neural data, characterized by single neuron tuning curves and analysis of population activity.
These two perspectives combine elegantly in neural latent variable models that constrain the relationship between latent variables and neural activity.
We propose feature sharing across neural tuning curves, which significantly improves performance and leads to better-behaved optimization.
arXiv Detail & Related papers (2022-10-06T18:37:49Z) - EINNs: Epidemiologically-Informed Neural Networks [75.34199997857341]
We introduce EINNs, a new class of physics-informed neural networks crafted for epidemic forecasting.
We investigate how to leverage both the theoretical flexibility provided by mechanistic models and the data-driven expressibility afforded by AI models.
arXiv Detail & Related papers (2022-02-21T18:59:03Z) - Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z) - Fully differentiable model discovery [0.0]
We propose an approach that combines neural-network-based surrogates with Sparse Bayesian Learning.
Our work extends PINNs to various types of neural network architectures and connects neural-network-based surrogates to the rich field of Bayesian parameter inference.
arXiv Detail & Related papers (2021-06-09T08:11:23Z) - The Neural Coding Framework for Learning Generative Models [91.0357317238509]
We propose a novel neural generative model inspired by the theory of predictive processing in the brain.
In a similar way, artificial neurons in our generative model predict what neighboring neurons will do, and adjust their parameters based on how well the predictions match reality (a toy predictive-coding update appears after this list).
arXiv Detail & Related papers (2020-12-07T01:20:38Z) - Probabilistic Federated Learning of Neural Networks Incorporated with
Global Posterior Information [4.067903810030317]
In federated learning, models trained on local clients are distilled into a global model.
We propose a new method which extends the Probabilistic Federated Neural Matching.
Our new method outperforms popular state-of-the-art federated learning methods in both single-round and multi-round communication settings.
arXiv Detail & Related papers (2020-12-06T03:54:58Z) - Provably Efficient Neural Estimation of Structural Equation Model: An
Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these networks using gradient descent (an illustrative min-max loop appears after this list).
For the first time, we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z) - Modeling Shared Responses in Neuroimaging Studies through MultiView ICA [94.31804763196116]
Group studies involving large cohorts of subjects are important to draw general conclusions about brain functional organization.
We propose a novel MultiView Independent Component Analysis model for group studies, where data from each subject are modeled as a linear combination of shared independent sources plus noise (a sketch of this generative model appears after this list).
We demonstrate the usefulness of our approach first on fMRI data, where our model demonstrates improved sensitivity in identifying common sources among subjects.
arXiv Detail & Related papers (2020-06-11T17:29:53Z)
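The Neural Coding Framework entry above describes neurons that predict neighboring activity and update their parameters from prediction error. As a toy illustration only (the paper's actual update rules are not reproduced here), a generic predictive-coding step might look like the following; the linear prediction W @ z and the learning rates are assumptions.

```python
import numpy as np

def predictive_coding_step(x, z, W, lr_z=0.1, lr_w=0.01):
    """One illustrative predictive-coding update, for 1-D vectors x and z.

    The latent units z predict the activity of the layer below as W @ z;
    the prediction error drives updates to both the latents and the weights.
    This is a generic sketch, not the paper's exact rule.
    """
    pred = W @ z                      # what the latent neurons predict
    err = x - pred                    # mismatch between prediction and reality
    z = z + lr_z * (W.T @ err)        # move latents to reduce the error
    W = W + lr_w * np.outer(err, z)   # Hebbian-like weight update on the error
    return z, W
```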
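For the adversarial SEM entry, the summary states that the linear operator equation is cast as a min-max game between two NN-parameterized players trained with gradient descent. Below is a stylized descent-ascent loop on a placeholder objective (which reduces to conditional-mean regression, not the paper's actual operator equation); the architectures and optimizer choices are illustrative assumptions.

```python
import torch
import torch.nn as nn

def minimax_fit(x, y, steps=2000, lr=1e-3):
    """Illustrative gradient descent-ascent for an adversarial estimator.

    Solves min_f max_u E[(y - f(x)) u(x)] - 0.5 E[u(x)^2]; at the optimum
    f(x) = E[y | x]. Expects x of shape (n, d) and y of shape (n, 1).
    This only sketches NN-parameterized min-max training, not the paper's
    estimator.
    """
    f = nn.Sequential(nn.Linear(x.shape[1], 32), nn.Tanh(), nn.Linear(32, 1))
    u = nn.Sequential(nn.Linear(x.shape[1], 32), nn.Tanh(), nn.Linear(32, 1))
    opt_f = torch.optim.Adam(f.parameters(), lr=lr)
    opt_u = torch.optim.Adam(u.parameters(), lr=lr)

    def objective():
        return ((y - f(x)) * u(x)).mean() - 0.5 * (u(x) ** 2).mean()

    for _ in range(steps):
        opt_u.zero_grad()
        (-objective()).backward()  # ascent step for the critic u
        opt_u.step()
        opt_f.zero_grad()
        objective().backward()     # descent step for the estimator f
        opt_f.step()
    return f
```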
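Finally, the MultiView ICA summary states that each subject's data are a linear combination of shared independent sources plus noise, i.e. x_i = A_i s + n_i with per-subject mixing A_i and shared sources s. The following is a minimal NumPy sketch of sampling from such a model; the dimensions, the Laplace source distribution, and the noise scale are illustrative assumptions, not values from the paper.

```python
import numpy as np

def sample_multiview_ica(n_subjects=4, n_sources=5, n_features=20,
                         n_samples=1000, noise_std=0.1, seed=0):
    """Sample from a MultiView-ICA-style generative model (illustrative).

    Each subject i observes x_i = A_i @ s + n_i, where the independent
    sources s are shared across subjects and A_i, n_i are subject-specific.
    """
    rng = np.random.default_rng(seed)
    # Shared non-Gaussian (Laplace) independent sources: (n_sources, n_samples)
    s = rng.laplace(size=(n_sources, n_samples))
    views = []
    for _ in range(n_subjects):
        A_i = rng.normal(size=(n_features, n_sources))   # subject-specific mixing
        n_i = noise_std * rng.normal(size=(n_features, n_samples))
        views.append(A_i @ s + n_i)                      # observed data, subject i
    return s, views
```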
This list is automatically generated from the titles and abstracts of the papers on this site.