GreenLightningAI: An Efficient AI System with Decoupled Structural and
Quantitative Knowledge
- URL: http://arxiv.org/abs/2312.09971v1
- Date: Fri, 15 Dec 2023 17:34:11 GMT
- Title: GreenLightningAI: An Efficient AI System with Decoupled Structural and
Quantitative Knowledge
- Authors: Jose Duato, Jose I. Mestre, Manuel F. Dolz and Enrique S.
Quintana-Ortí
- Abstract summary: Training powerful and popular deep neural networks comes at very high economic and environmental costs.
This work takes a radically different approach by proposing GreenLightningAI.
The new AI system stores the information required to select the system subset for a given sample.
We show experimentally that the structural information can be kept unmodified when re-training the AI system with new samples.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The number and complexity of artificial intelligence (AI) applications are
growing relentlessly. As a result, even with the many algorithmic and
mathematical advances experienced over past decades as well as the impressive
energy efficiency and computational capacity of current hardware accelerators,
training the most powerful and popular deep neural networks comes at very high
economic and environmental costs. Recognising that additional optimisations of
conventional neural network training are very difficult, this work takes a
radically different approach by proposing GreenLightningAI, a new AI system
design consisting of a linear model that is capable of emulating the behaviour
of deep neural networks by subsetting the model for each particular sample. The
new AI system stores the information required to select the system subset for a
given sample (referred to as structural information) separately from the linear
model parameters (referred to as quantitative knowledge). In this paper we
present a proof of concept, showing that the structural information stabilises
far earlier than the quantitative knowledge. Additionally, we show
experimentally that the structural information can be kept unmodified when
re-training the AI system with new samples while still achieving a validation
accuracy similar to that obtained when re-training a neural network of
similar size. Since the proposed AI system is based on a linear model, multiple
copies of the model, trained with different datasets, can be easily combined.
This enables faster and greener (re)-training algorithms, including incremental
re-training and federated incremental re-training.
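The abstract describes the design only at a high level, so the following is a minimal, hypothetical NumPy sketch of the decoupling it outlines: a per-sample binary mask stands in for the structural information, plain linear weights stand in for the quantitative knowledge, and a weighted average of independently trained copies stands in for the incremental/federated combination step. All names (structural_mask, QuantitativeModel, merge_models) and the frozen random-projection selector are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def structural_mask(x, projection):
    """Structural knowledge (assumed form): a binary pattern recording which
    units are active for each sample. A frozen random projection stands in for
    the selector that, per the paper, stabilises early and can then be kept
    unmodified during re-training."""
    return (x @ projection > 0.0).astype(x.dtype)

class QuantitativeModel:
    """Quantitative knowledge (assumed form): plain linear weights applied to
    the features selected by the structural mask."""
    def __init__(self, hidden_dim, out_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((hidden_dim, out_dim)) * 0.01

    def predict(self, x, projection):
        z = x @ projection                      # candidate features
        m = structural_mask(x, projection)      # per-sample subset selection
        return (z * m) @ self.W                 # linear in W once m is fixed

    def fit_step(self, x, y, projection, lr=1e-2):
        """One least-squares gradient step; with the mask fixed, the model is
        linear in W, so (re)-training is a convex problem."""
        h = (x @ projection) * structural_mask(x, projection)
        err = h @ self.W - y
        self.W -= lr * h.T @ err / len(x)

def merge_models(models, weights=None):
    """Because each copy is linear in its parameters for a fixed structural
    selector, copies trained on different data shards can be combined by a
    simple weighted average, which is the property behind the incremental and
    federated re-training claims."""
    weights = weights or [1.0 / len(models)] * len(models)
    merged = QuantitativeModel(*models[0].W.shape)
    merged.W = sum(w * m.W for w, m in zip(weights, models))
    return merged

# Example: two copies trained on disjoint synthetic shards, then merged.
rng = np.random.default_rng(42)
P = rng.standard_normal((8, 32))                # shared structural projection
copies = []
for _ in range(2):
    xs, ys = rng.standard_normal((64, 8)), rng.standard_normal((64, 1))
    model = QuantitativeModel(32, 1)
    for _ in range(100):
        model.fit_step(xs, ys, P)
    copies.append(model)
global_model = merge_models(copies)
```

In this reading, re-training with new samples only updates W (or adds another copy to be merged), while the projection P encoding the structural selection is left untouched, which is the behaviour the proof of concept reports.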
Related papers
- Mechanistic Neural Networks for Scientific Machine Learning [58.99592521721158]
We present Mechanistic Neural Networks, a neural network design for machine learning applications in the sciences.
It incorporates a new Mechanistic Block in standard architectures to explicitly learn governing differential equations as representations.
Central to our approach is a novel Relaxed Linear Programming solver (NeuRLP) inspired by a technique that reduces solving linear ODEs to solving linear programs.
arXiv Detail & Related papers (2024-02-20T15:23:24Z) - Epistemic Modeling Uncertainty of Rapid Neural Network Ensembles for
Adaptive Learning [0.0]
A new type of neural network is presented using the rapid neural network paradigm.
It is found that the proposed emulator-embedded neural network trains near-instantaneously, typically without loss of prediction accuracy.
arXiv Detail & Related papers (2023-09-12T22:34:34Z) - Training Deep Surrogate Models with Large Scale Online Learning [48.7576911714538]
Deep learning algorithms have emerged as a viable alternative for obtaining fast solutions for PDEs.
Models are usually trained on synthetic data generated by solvers, stored on disk and read back for training.
It proposes an open source online training framework for deep surrogate models.
arXiv Detail & Related papers (2023-06-28T12:02:27Z) - Iterative self-transfer learning: A general methodology for response
time-history prediction based on small dataset [0.0]
An iterative self-transfer learning method for training neural networks on small datasets is proposed in this study.
The results show that the proposed method can improve model performance by nearly an order of magnitude on small datasets.
arXiv Detail & Related papers (2023-06-14T18:48:04Z) - Emulation Learning for Neuromimetic Systems [0.0]
Building on our recent research on neural quantization systems, results on learning quantized motions and resilience to channel dropouts are reported.
We propose a general Deep Q Network (DQN) algorithm that not only learns the trajectory but also exhibits resilience to channel dropout.
arXiv Detail & Related papers (2023-05-04T22:47:39Z) - Interpretability of an Interaction Network for identifying $H
\rightarrow b\bar{b}$ jets [4.553120911976256]
In recent times, AI models based on deep neural networks are becoming increasingly popular for many of these applications.
We explore interpretability of AI models by examining an Interaction Network (IN) model designed to identify boosted $H \rightarrow b\bar{b}$ jets.
We additionally illustrate the activity of hidden layers within the IN model as Neural Activation Pattern (NAP) diagrams.
arXiv Detail & Related papers (2022-11-23T08:38:52Z) - Deep transfer learning for system identification using long short-term
memory neural networks [0.0]
This paper proposes using two types of deep transfer learning, namely parameter fine-tuning and freezing, to reduce the data and computation requirements for system identification.
Results show that, compared with direct learning, our method accelerates learning by 10% to 50% while also saving data and computing resources.
arXiv Detail & Related papers (2022-04-06T23:39:06Z) - Pretraining Graph Neural Networks for few-shot Analog Circuit Modeling
and Design [68.1682448368636]
We present a supervised pretraining approach to learn circuit representations that can be adapted to new unseen topologies or unseen prediction tasks.
To cope with the variable topological structure of different circuits we describe each circuit as a graph and use graph neural networks (GNNs) to learn node embeddings.
We show that pretraining GNNs on prediction of output node voltages can encourage learning representations that can be adapted to new unseen topologies or prediction of new circuit level properties.
arXiv Detail & Related papers (2022-03-29T21:18:47Z) - Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z) - A novel Deep Neural Network architecture for non-linear system
identification [78.69776924618505]
We present a novel Deep Neural Network (DNN) architecture for non-linear system identification.
Inspired by fading memory systems, we introduce inductive bias (on the architecture) and regularization (on the loss function).
This architecture allows for automatic complexity selection based solely on available data.
arXiv Detail & Related papers (2021-06-06T10:06:07Z) - Incremental Training of a Recurrent Neural Network Exploiting a
Multi-Scale Dynamic Memory [79.42778415729475]
We propose a novel incrementally trained recurrent architecture targeting explicitly multi-scale learning.
We show how to extend the architecture of a simple RNN by separating its hidden state into different modules.
We discuss a training algorithm where new modules are iteratively added to the model to learn progressively longer dependencies.
arXiv Detail & Related papers (2020-06-29T08:35:49Z)