Multi Part Deployment of Neural Network
- URL: http://arxiv.org/abs/2506.01387v1
- Date: Mon, 02 Jun 2025 07:24:29 GMT
- Title: Multi Part Deployment of Neural Network
- Authors: Paritosh Ranjan, Surajit Majumder, Prodip Roy
- Abstract summary: This paper proposes a distributed system architecture that partitions a neural network across multiple servers. A Multi-Part Neural Network Execution Engine facilitates seamless execution and training across distributed partitions. A Neuron Distributor module enables flexible partitioning strategies based on neuron count, percentage, identifiers, or network layers.
- Score: 0.17205106391379024
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The increasing scale of modern neural networks, exemplified by architectures from IBM (530 billion neurons) and Google (500 billion parameters), presents significant challenges in terms of computational cost and infrastructure requirements. As deep neural networks continue to grow, traditional training paradigms relying on monolithic GPU clusters become increasingly unsustainable. This paper proposes a distributed system architecture that partitions a neural network across multiple servers, each responsible for a subset of neurons. Neurons are classified as local or remote, with inter-server connections managed via a metadata-driven lookup mechanism. A Multi-Part Neural Network Execution Engine facilitates seamless execution and training across distributed partitions by dynamically resolving and invoking remote neurons using stored metadata. All servers share a unified model through a network file system (NFS), ensuring consistency during parallel updates. A Neuron Distributor module enables flexible partitioning strategies based on neuron count, percentage, identifiers, or network layers. This architecture enables cost-effective, scalable deployment of deep learning models on cloud infrastructure, reducing dependency on high-performance centralized compute resources.
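The abstract describes the architecture at a high level only. The sketch below is a minimal illustration of the local/remote neuron dispatch it implies; all class and method names (NeuronMeta, NeuronDistributor, ExecutionEngine, activate) are invented for the example and are not the paper's actual API.

```python
# Minimal sketch of the metadata-driven neuron lookup described in the
# abstract. The count-based partitioning is one of the strategies the
# abstract names (count, percentage, identifier, or layer).
from dataclasses import dataclass

@dataclass
class NeuronMeta:
    neuron_id: int
    server: str                      # which server hosts this neuron

class NeuronDistributor:
    """Round-robin partitioning of neurons across servers by count."""
    def __init__(self, neuron_ids, servers):
        self.table = {
            nid: NeuronMeta(nid, servers[i % len(servers)])
            for i, nid in enumerate(neuron_ids)
        }

class ExecutionEngine:
    def __init__(self, local_server, table):
        self.local_server = local_server
        self.table = table

    def activate(self, neuron_id, x):
        meta = self.table[neuron_id]
        if meta.server == self.local_server:
            return self._local_activate(neuron_id, x)
        # Remote neuron: in a real system this would be an RPC to
        # meta.server; here we only show the dispatch decision.
        return self._remote_call(meta.server, neuron_id, x)

    def _local_activate(self, neuron_id, x):
        return max(0.0, x)           # placeholder ReLU

    def _remote_call(self, server, neuron_id, x):
        raise NotImplementedError(f"RPC to {server} for neuron {neuron_id}")

table = NeuronDistributor(range(8), ["s1", "s2"]).table
engine = ExecutionEngine("s1", table)
print(engine.activate(0, -0.5))      # neuron 0 lives on s1 -> local, prints 0.0
```

In this layout the metadata table plays the role of the abstract's lookup mechanism: only the table must be shared (e.g., over the NFS-backed unified model), while activations cross servers on demand.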
Related papers
- NeuroCoreX: An Open-Source FPGA-Based Spiking Neural Network Emulator with On-Chip Learning [0.0]
Spiking Neural Networks (SNNs) are computational models inspired by the structure and dynamics of biological neuronal networks. NeuroCoreX is an FPGA-based emulator designed for the flexible co-design and testing of SNNs.
arXiv Detail & Related papers (2025-06-17T03:02:04Z)
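As a point of reference for what an SNN emulator like NeuroCoreX evaluates on each clock tick, here is a generic discrete-time leaky integrate-and-fire update; the neuron model, leak factor, and threshold are assumptions for illustration, not details taken from the NeuroCoreX paper.

```python
# One timestep of a layer of leaky integrate-and-fire (LIF) neurons.
import numpy as np

def lif_step(v, spikes_in, w, leak=0.9, v_th=1.0):
    """v: (n,) membrane potentials; spikes_in: (m,) 0/1 input spikes;
    w: (n, m) synaptic weights."""
    v = leak * v + w @ spikes_in          # leak, then integrate inputs
    fired = v >= v_th                     # threshold crossing
    v = np.where(fired, 0.0, v)           # reset neurons that fired
    return v, fired.astype(np.int8)

v = np.zeros(4)
w = np.random.default_rng(0).uniform(0.0, 0.6, size=(4, 3))
for t in range(5):
    v, out = lif_step(v, np.array([1, 0, 1]), w)
    print(t, out)
```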
- Neuromorphic Wireless Split Computing with Multi-Level Spikes [69.73249913506042]
Neuromorphic computing uses spiking neural networks (SNNs) to perform inference tasks. Embedding a small payload within each spike exchanged between spiking neurons can enhance inference accuracy without increasing energy consumption. Split computing, where an SNN is partitioned across two devices, is a promising solution. This paper presents the first comprehensive study of a neuromorphic wireless split computing architecture that employs multi-level SNNs.
arXiv Detail & Related papers (2024-11-07T14:08:35Z)
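The summary's key idea, a small payload embedded in each spike, can be illustrated with a toy encoder that grades a spike by how far the membrane potential overshoots threshold. The quantization scheme below is an assumption, not the paper's encoding.

```python
# Multi-level spikes: each event carries a small integer payload instead
# of being binary, so the same number of spikes conveys more information.
import numpy as np

def multilevel_encode(v, v_th=1.0, bits=2):
    """Emit a spike when v crosses threshold; grade its payload by the
    overshoot, quantized to 2**bits - 1 levels."""
    fired = v >= v_th
    overshoot = np.clip((v - v_th) / v_th, 0.0, 1.0)
    levels = np.round(overshoot * (2**bits - 1)).astype(int)
    return np.where(fired, levels + 1, 0)   # 0 = no spike, 1.. = payload

v = np.array([0.4, 1.0, 1.6, 2.3])
print(multilevel_encode(v))                 # -> [0 1 3 4]
```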
- Simultaneous Weight and Architecture Optimization for Neural Networks [6.2241272327831485]
We introduce a neural network training framework that learns architecture and parameters simultaneously with gradient descent.
Central to our approach is a multi-scale encoder-decoder, in which the encoder embeds neural networks with similar functionality close to each other.
Experiments demonstrate that our framework can discover sparse, compact neural networks that maintain high performance.
arXiv Detail & Related papers (2024-10-10T19:57:36Z)
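The paper's encoder-decoder machinery is too involved for a short sketch. As a generic stand-in, the toy below optimizes architecture and weights in the same gradient descent loop using a softmax relaxation over two candidate operations, in the style of DARTS rather than the paper's method; all values are illustrative.

```python
# Joint architecture + weight optimization on a toy 1-D problem.
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 64)
y = 2.0 * x**2                       # ground truth prefers the quadratic op

w = np.array([0.1, 0.1])             # weights of op0: w0*x and op1: w1*x**2
alpha = np.zeros(2)                  # architecture logits
lr = 0.1
for step in range(500):
    a = np.exp(alpha) / np.exp(alpha).sum()        # softmax mixture
    ops = np.stack([w[0] * x, w[1] * x**2])        # candidate ops
    err = (a @ ops) - y
    # analytic gradients of the mean squared error
    g_w = np.array([np.mean(2 * err * a[0] * x),
                    np.mean(2 * err * a[1] * x**2)])
    g_ops = np.array([np.mean(2 * err * ops[0]),
                      np.mean(2 * err * ops[1])])
    g_alpha = a * (g_ops - a @ g_ops)              # softmax backprop
    w -= lr * g_w
    alpha -= lr * g_alpha
print(np.round(np.exp(alpha) / np.exp(alpha).sum(), 2), np.round(w, 2))
# the quadratic op wins the mixture: w[0] ~ 0 and a[1] * w[1] ~ 2
```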
- Message Passing Variational Autoregressive Network for Solving Intractable Ising Models [6.261096199903392]
Many deep neural networks have been used to solve Ising models, including autoregressive neural networks, convolutional neural networks, recurrent neural networks, and graph neural networks.
Here we propose a variational autoregressive architecture with a message passing mechanism, which can effectively utilize the interactions between spin variables.
The new network trained under an annealing framework outperforms existing methods in solving several prototypical Ising spin Hamiltonians, especially for larger spin systems at low temperatures.
arXiv Detail & Related papers (2024-04-09T11:27:07Z)
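Stripped of the neural network, the skeleton such methods build on is autoregressive sampling of spins scored by the Ising energy. The two-parameter logistic conditional below is a crude stand-in for the paper's message-passing network, included only to show the sampling structure.

```python
# Autoregressive spin sampling plus Ising energy evaluation.
import numpy as np

def ising_energy(s, J):
    """E(s) = -sum_{i<j} J_ij s_i s_j for spins s_i in {-1, +1}."""
    return -0.5 * s @ J @ s

def sample_autoregressive(theta, rng):
    """Draw spins sequentially; spin i is conditioned on the sum of the
    spins already drawn (a stand-in for learned message passing)."""
    n = len(theta)
    s = np.zeros(n)
    logq = 0.0
    for i in range(n):
        h = theta[i] * s[:i].sum()
        p_up = 1.0 / (1.0 + np.exp(-h))        # P(s_i = +1 | s_<i)
        s[i] = 1.0 if rng.random() < p_up else -1.0
        logq += np.log(p_up if s[i] > 0 else 1 - p_up)
    return s, logq

rng = np.random.default_rng(0)
n = 6
J = np.ones((n, n))                            # ferromagnetic couplings
np.fill_diagonal(J, 0)
s, logq = sample_autoregressive(np.full(n, 0.5), rng)
print(s, ising_energy(s, J), logq)
```

In a variational treatment, logq is exactly what the annealed free-energy objective needs alongside the energy.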
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
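The core encoding step, turning a network's parameters into a graph, can be sketched directly: each neuron becomes a node and each weight an attributed edge, so one graph model can ingest MLPs of any shape. The edge-list format here is an assumed simplification, not the paper's equivariant representation.

```python
# Convert an MLP's weights into a (nodes, edges) computational graph.
import numpy as np

def mlp_to_graph(weights, biases):
    """weights: list of (out, in) matrices; biases: list of (out,) vectors.
    Returns node list and (src, dst, weight) edges."""
    sizes = [weights[0].shape[1]] + [w.shape[0] for w in weights]
    offsets = np.cumsum([0] + sizes)           # node id ranges per layer
    nodes = [(int(offsets[l] + i),
              {"layer": l,
               "bias": float(biases[l - 1][i]) if l > 0 else 0.0})
             for l in range(len(sizes)) for i in range(sizes[l])]
    edges = [(int(offsets[l] + j), int(offsets[l + 1] + i), float(W[i, j]))
             for l, W in enumerate(weights)
             for i in range(W.shape[0]) for j in range(W.shape[1])]
    return nodes, edges

rng = np.random.default_rng(0)
Ws = [rng.normal(size=(3, 2)), rng.normal(size=(1, 3))]
bs = [rng.normal(size=3), rng.normal(size=1)]
nodes, edges = mlp_to_graph(Ws, bs)
print(len(nodes), "nodes,", len(edges), "edges")   # 6 nodes, 9 edges
```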
- Spiking neural network for nonlinear regression [68.8204255655161]
Spiking neural networks carry the potential for a massive reduction in memory and energy consumption.
They introduce temporal and neuronal sparsity, which can be exploited by next-generation neuromorphic hardware.
A framework for regression using spiking neural networks is proposed.
arXiv Detail & Related papers (2022-10-06T13:04:45Z)
- POPPINS: A Population-Based Digital Spiking Neuromorphic Processor with Integer Quadratic Integrate-and-Fire Neurons [50.591267188664666]
We propose a population-based digital spiking neuromorphic processor in 180nm process technology with two hierarchical populations.
The proposed approach enables the development of biomimetic neuromorphic systems and various low-power, low-latency inference applications.
arXiv Detail & Related papers (2022-01-19T09:26:34Z)
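The named neuron model, an integer quadratic integrate-and-fire (QIF), admits a compact sketch. The fixed-point shift, threshold, and input current below are assumptions; only the v*v-style update is the QIF signature.

```python
# Integer QIF update: all quantities are plain ints, as on digital logic.
def int_qif_step(v, i_in, v_th=64, v_reset=0, shift=6):
    """One step: v += (v*v) >> shift + input current; spike at threshold."""
    v = v + ((v * v) >> shift) + i_in
    if v >= v_th:
        return v_reset, 1            # spike and reset
    return max(v, 0), 0

v, spikes = 1, []
for t in range(20):
    v, s = int_qif_step(v, i_in=3)
    spikes.append(s)
print(spikes)                        # periodic firing under constant input
```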
- CondenseNeXt: An Ultra-Efficient Deep Neural Network for Embedded Systems [0.0]
A Convolutional Neural Network (CNN) is a class of Deep Neural Network (DNN) widely used in the analysis of visual images captured by an image sensor.
In this paper, we propose a new variant of deep convolutional neural network architecture that improves the performance of existing CNN architectures for real-time inference on embedded systems.
arXiv Detail & Related papers (2021-12-01T18:20:52Z)
- Communication-Efficient Separable Neural Network for Distributed Inference on Edge Devices [2.28438857884398]
We propose a novel method of exploiting model parallelism to separate a neural network for distributed inference.
Under proper specifications of devices and configurations of models, our experiments show that the inference of large neural networks on edge clusters can be distributed and accelerated.
arXiv Detail & Related papers (2021-11-03T19:30:28Z)
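A minimal sketch of what separating a layer across devices means, assuming an even two-way neuron split (the split granularity is an assumption): each device holds only its slice of the weights, and the distributed result matches the monolithic layer exactly.

```python
# Split one fully connected layer's output neurons across two "devices".
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))          # full layer: 8 outputs, 4 inputs
x = rng.normal(size=4)

W_dev = np.split(W, 2, axis=0)       # device 0: rows 0-3, device 1: rows 4-7
parts = [np.maximum(w @ x, 0.0) for w in W_dev]   # computed independently
y_dist = np.concatenate(parts)       # gather the activation slices

y_ref = np.maximum(W @ x, 0.0)       # monolithic reference
print(np.allclose(y_dist, y_ref))    # True: the split is exact
```

Only the small activation slices travel over the network, which is the communication-efficiency angle the summary refers to.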
- Max and Coincidence Neurons in Neural Networks [0.07614628596146598]
We optimize networks containing models of the max and coincidence neurons using neural architecture search.
We analyze the structure, operations, and neurons of optimized networks to develop a signal-processing ResNet.
The developed network achieves an average 2% improvement in accuracy and a 25% reduction in network size across a variety of datasets.
arXiv Detail & Related papers (2021-10-04T07:13:50Z)
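The summary names the neuron models without their equations, so the formulations below are one plausible reading, flagged as assumptions: a max neuron replaces the usual weighted sum with a weighted max, and a coincidence neuron responds to how many inputs are active at once.

```python
# Plausible (assumed) max and coincidence neuron operations.
import numpy as np

def max_neuron(x, w):
    return np.max(w * x)             # max replaces the usual dot product

def coincidence_neuron(x, threshold=2):
    return 1.0 if np.count_nonzero(x > 0) >= threshold else 0.0

x = np.array([0.0, 0.8, 0.3])
w = np.array([1.0, 0.5, 2.0])
print(max_neuron(x, w))              # 0.6, from the 0.3 * 2.0 input
print(coincidence_neuron(x))         # 1.0: two inputs are active
```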
- Incremental Training of a Recurrent Neural Network Exploiting a Multi-Scale Dynamic Memory [79.42778415729475]
We propose a novel incrementally trained recurrent architecture explicitly targeting multi-scale learning.
We show how to extend the architecture of a simple RNN by separating its hidden state into different modules.
We discuss a training algorithm where new modules are iteratively added to the model to learn progressively longer dependencies.
arXiv Detail & Related papers (2020-06-29T08:35:49Z)
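A minimal sketch of the stated mechanism: the hidden state is split into modules that update at different rates, so slower modules integrate longer histories, and new modules can be appended incrementally. Module sizes, update rates, and the plain tanh cell are illustrative assumptions.

```python
# RNN whose hidden state is a concatenation of modules on different clocks.
import numpy as np

class MultiScaleRNN:
    def __init__(self, in_dim, rng):
        self.in_dim, self.rng = in_dim, rng
        self.modules = []                      # each: [W_in, W_rec, h, rate]

    def add_module(self, size, rate):
        """Append a module updated every `rate` steps (larger = slower)."""
        self.modules.append([
            self.rng.normal(scale=0.3, size=(size, self.in_dim)),
            self.rng.normal(scale=0.3, size=(size, size)),
            np.zeros(size), rate])

    def step(self, x, t):
        for m in self.modules:
            if t % m[3] == 0:                  # only update on schedule
                m[2] = np.tanh(m[0] @ x + m[1] @ m[2])
        return np.concatenate([m[2] for m in self.modules])

rnn = MultiScaleRNN(in_dim=2, rng=np.random.default_rng(0))
rnn.add_module(size=3, rate=1)                 # fast module
rnn.add_module(size=3, rate=4)                 # slower module, added later
for t in range(8):
    h = rnn.step(np.ones(2), t)
print(h.round(2))
```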
- ResiliNet: Failure-Resilient Inference in Distributed Neural Networks [56.255913459850674]
We introduce ResiliNet, a scheme for making inference in distributed neural networks resilient to physical node failures.
Failout simulates physical node failure conditions during training using dropout, and is specifically designed to improve the resiliency of distributed neural networks.
arXiv Detail & Related papers (2020-02-18T05:58:24Z)
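Failout, as the summary describes it, is dropout at the granularity of a physical node: during training an entire partition's output is occasionally zeroed so downstream layers learn to tolerate a failed node. A minimal sketch, with the failure probability and two-partition layout assumed:

```python
# Failout-style training perturbation: zero whole partition outputs.
import numpy as np

def failout(partition_outputs, p_fail, rng, training=True):
    """Zero each partition's entire output with probability p_fail."""
    if not training:
        return partition_outputs
    return [h * 0.0 if rng.random() < p_fail else h
            for h in partition_outputs]

rng = np.random.default_rng(0)
parts = [np.ones(3), np.ones(3) * 2.0]        # outputs of two nodes
for _ in range(3):
    print([h.tolist() for h in failout(parts, p_fail=0.5, rng=rng)])
```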
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.