Sparse Training Theory for Scalable and Efficient Agents
- URL: http://arxiv.org/abs/2103.01636v1
- Date: Tue, 2 Mar 2021 10:48:29 GMT
- Title: Sparse Training Theory for Scalable and Efficient Agents
- Authors: Decebal Constantin Mocanu, Elena Mocanu, Tiago Pinto, Selima Curci,
Phuong H. Nguyen, Madeleine Gibescu, Damien Ernst, Zita A. Vale
- Abstract summary: Deep Neural Networks have proven effective across all major learning paradigms, i.e., supervised, unsupervised, and reinforcement learning.
Traditional deep learning approaches rely on cloud computing facilities and do not scale well to autonomous agents with low computational resources.
This paper discusses the state of the art in sparse training, its challenges and limitations, and introduces new theoretical research directions.
- Score: 5.71531053864579
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A fundamental task for artificial intelligence is learning. Deep Neural
Networks have proven effective across all major learning paradigms, i.e.,
supervised, unsupervised, and reinforcement learning. Nevertheless, traditional
deep learning approaches rely on cloud computing facilities and do not scale
well to autonomous agents with low computational resources. Even in the cloud,
they suffer from computational and memory limitations and cannot adequately
model large physical worlds for agents that would require networks with
billions of neurons. These issues have been addressed in recent years by the
emerging topic of sparse training, which trains sparse networks from scratch.
This paper discusses the state of the art in sparse training, its challenges
and limitations, and introduces new theoretical research directions that have
the potential to alleviate these limitations and push deep learning scalability
well beyond its current boundaries. Finally, the impact of these theoretical
advancements in complex multi-agent settings is discussed from a real-world
perspective, using the smart grid as a case study.
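
To make the sparse-training idea concrete: one well-known scheme keeps a fixed low density throughout training by periodically dropping the weakest connections and regrowing new ones at random positions (prune-and-regrow, in the spirit of Sparse Evolutionary Training). The NumPy sketch below shows a single topology update on one weight matrix; the function names, the density, the fraction `zeta`, and the random-regrowth rule are illustrative assumptions, not this paper's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_sparse(shape, density=0.05):
    """Initialize a sparse layer: dense weight array plus binary mask."""
    mask = (rng.random(shape) < density).astype(np.float32)
    weights = (rng.standard_normal(shape) * mask).astype(np.float32)
    return weights, mask

def prune_and_regrow(weights, mask, zeta=0.3):
    """One prune-and-regrow step (illustrative): drop the fraction `zeta`
    of active connections with the smallest magnitude, then regrow the
    same number of connections at random currently-inactive positions."""
    w, m = weights.ravel(), mask.ravel()   # views into the same storage
    active = np.flatnonzero(m)
    n_drop = int(zeta * active.size)
    # Prune: zero out the weakest active connections.
    weakest = active[np.argsort(np.abs(w[active]))[:n_drop]]
    m[weakest] = 0.0
    w[weakest] = 0.0
    # Regrow: activate an equal number of random inactive positions.
    inactive = np.flatnonzero(m == 0)
    grown = rng.choice(inactive, size=n_drop, replace=False)
    m[grown] = 1.0
    w[grown] = (rng.standard_normal(n_drop) * 0.01).astype(np.float32)
    return weights, mask

weights, mask = init_sparse((784, 256))
weights, mask = prune_and_regrow(weights, mask)
print(f"density after update: {mask.mean():.3f}")  # stays ~0.05
```

In a full training loop this update would typically run between epochs, with gradient updates applied only to the active (mask == 1) weights, so both memory and compute stay proportional to the density rather than to the dense layer size.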
Related papers
- Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning.
Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z)
- Brain-Inspired Computational Intelligence via Predictive Coding [89.6335791546526]
Predictive coding (PC) has shown promising performance in machine intelligence tasks.
PC can model information processing in different brain areas and can be used in cognitive control and robotics.
arXiv Detail & Related papers (2023-08-15T16:37:16Z)
- Solving Large-scale Spatial Problems with Convolutional Neural Networks [88.31876586547848]
We employ transfer learning to improve training efficiency for large-scale spatial problems.
We propose that a convolutional neural network (CNN) can be trained on small windows of signals but evaluated on arbitrarily large signals with little to no performance degradation; a minimal sketch of this property appears after this list.
arXiv Detail & Related papers (2023-06-14T01:24:42Z)
- Generalized Uncertainty of Deep Neural Networks: Taxonomy and Applications [1.9671123873378717]
We show that the uncertainty of deep neural networks is not only important for interpretability and transparency, but also crucial for further advancing their performance.
We generalize the definition of the uncertainty of deep neural networks to any number or vector associated with an input or an input-label pair, and catalog existing methods for mining such uncertainty from a deep model.
arXiv Detail & Related papers (2023-02-02T22:02:33Z)
- Predictive Coding: Towards a Future of Deep Learning beyond Backpropagation? [41.58529335439799]
The backpropagation of error algorithm used to train deep neural networks has been fundamental to the successes of deep learning.
Recent work has developed the idea into a general-purpose algorithm able to train neural networks using only local computations.
We show the substantially greater flexibility of predictive coding networks compared with equivalent deep neural networks.
arXiv Detail & Related papers (2022-02-18T22:57:03Z)
- Edge-Cloud Polarization and Collaboration: A Comprehensive Survey [61.05059817550049]
We conduct a systematic review for both cloud and edge AI.
We are the first to set up a collaborative learning mechanism between cloud and edge modeling.
We discuss potentials and practical experiences of some on-going advanced edge AI topics.
arXiv Detail & Related papers (2021-11-11T05:58:23Z)
- Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks [78.47459801017959]
Sparsity can reduce the memory footprint of regular networks to fit on mobile devices.
We describe approaches to remove and add elements of neural networks, different training strategies to achieve model sparsity, and mechanisms to exploit sparsity in practice.
arXiv Detail & Related papers (2021-01-31T22:48:50Z)
- Memristors -- from In-memory computing, Deep Learning Acceleration, Spiking Neural Networks, to the Future of Neuromorphic and Bio-inspired Computing [25.16076541420544]
Machine learning, particularly in the form of deep learning, has driven most of the recent fundamental developments in artificial intelligence.
Deep learning has been successfully applied in areas such as object/pattern recognition, speech and natural language processing, self-driving vehicles, intelligent self-diagnostics tools, autonomous robots, knowledgeable personal assistants, and monitoring.
This paper reviews the case for a novel beyond CMOS hardware technology, memristors, as a potential solution for the implementation of power-efficient in-memory computing, deep learning accelerators, and spiking neural networks.
arXiv Detail & Related papers (2020-04-30T16:49:03Z)
- Deep Learning for Ultra-Reliable and Low-Latency Communications in 6G Networks [84.2155885234293]
We first summarize how to apply data-driven supervised deep learning and deep reinforcement learning in URLLC.
To address these open problems, we develop a multi-level architecture that enables device intelligence, edge intelligence, and cloud intelligence for URLLC.
arXiv Detail & Related papers (2020-02-22T14:38:11Z)
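
Regarding the "Solving Large-scale Spatial Problems" entry above: the train-on-small-windows, evaluate-on-large-signals property holds for any fully convolutional network, because convolutional layers carry no fixed spatial size. Below is a minimal PyTorch sketch of that property; the layer sizes and channel counts are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

# Fully convolutional: no flatten / fixed-size dense head, so the same
# weights apply to inputs of any spatial size.
model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, kernel_size=1),  # per-position prediction head
)

windows = torch.randn(8, 1, 32, 32)    # small training windows
full = torch.randn(1, 1, 1024, 1024)   # arbitrarily large signal

assert model(windows).shape == (8, 1, 32, 32)
assert model(full).shape == (1, 1, 1024, 1024)
```

Because every layer is a convolution with "same" padding, the weights learned on 32x32 windows produce a per-position output map at any input resolution, which is what makes the train-small/evaluate-large strategy possible.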