An Analytic End-to-End Deep Learning Algorithm based on Collaborative Learning
- URL: http://arxiv.org/abs/2305.18594v2
- Date: Wed, 31 May 2023 02:53:32 GMT
- Title: An Analytic End-to-End Deep Learning Algorithm based on Collaborative Learning
- Authors: Sitan Li and Chien Chern Cheah
- Abstract summary: This paper presents a convergence analysis for end-to-end deep learning of fully connected neural networks (FNN) with smooth activation functions.
The proposed method avoids potential chattering problems and is less prone to vanishing gradients.
- Score: 5.710971447109949
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In most control applications, theoretical analysis of the systems is crucial
in ensuring stability or convergence, so as to ensure safe and reliable
operations and also to gain a better understanding of the systems for further
developments. However, most current deep learning methods are black-box
approaches that are more focused on empirical studies. Recently, some results
have been obtained for convergence analysis of end-to-end deep learning based
on non-smooth ReLU activation functions, which may result in chattering for
control tasks. This paper presents a convergence analysis for end-to-end deep
learning of fully connected neural networks (FNN) with smooth activation
functions. The proposed method therefore avoids any potential chattering
problem and is also less prone to vanishing gradients. The proposed
End-to-End algorithm trains multiple two-layer fully connected networks
concurrently, and collaborative learning is then used to combine their
strengths and improve accuracy. A classification case study using fully
connected networks on the MNIST dataset was conducted to demonstrate the
performance of the proposed approach, and an online kinematics control task
on a UR5e robot arm was then performed to illustrate the
regression-approximation and online-updating abilities of the algorithm.
Related papers
- A Unified Framework for Neural Computation and Learning Over Time [56.44910327178975]
Hamiltonian Learning is a novel unified framework for learning with neural networks "over time".
It is based on differential equations that (i) can be integrated without the need for external software solvers, (ii) generalize the well-established notion of gradient-based learning in feed-forward and recurrent networks, and (iii) open up novel perspectives.
arXiv Detail & Related papers (2024-09-18T14:57:13Z)
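The Hamiltonian dynamics themselves are not given in this summary; as a hedged illustration of the general idea of casting learning as a differential equation that can be integrated without an external solver, the toy sketch below integrates the gradient-flow ODE d(theta)/dt = -grad L(theta) with forward Euler, which recovers plain gradient descent with the step size as learning rate.

    import numpy as np

    def loss_grad(theta):
        # Gradient of a toy quadratic loss L(theta) = 0.5 * ||theta - target||^2.
        target = np.array([1.0, -2.0])
        return theta - target

    # Gradient flow: d(theta)/dt = -grad L(theta).
    # Forward-Euler integration with step dt reproduces gradient descent
    # with learning rate dt -- no external ODE solver required.
    theta = np.zeros(2)
    dt = 0.1
    for _ in range(100):
        theta = theta - dt * loss_grad(theta)

    print(theta)  # converges toward the minimizer [1.0, -2.0]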
- Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce UnRolled Federated learning (SURF), a method that extends algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this extension, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z)
- Batch Active Learning from the Perspective of Sparse Approximation [12.51958241746014]
Active learning enables efficient model training by leveraging interactions between machine learning agents and human annotators.
We propose a novel framework that formulates batch active learning from the perspective of sparse approximation.
Our active learning method aims to find an informative subset from the unlabeled data pool such that the corresponding training loss function approximates its full data pool counterpart.
arXiv Detail & Related papers (2022-11-01T03:20:28Z)
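The paper's sparse-approximation formulation is not detailed in this summary; the toy sketch below only illustrates the stated goal in a hedged way: greedily grow a small subset whose mean (gradient-like) embedding tracks the full pool's mean, mimicking "training loss on the subset approximates the full-pool loss". The embeddings and the greedy rule are illustrative assumptions, not the paper's algorithm.

    import numpy as np

    rng = np.random.default_rng(1)

    # Toy stand-in: each pool point is represented by an embedding; we greedily
    # pick points so the subset's mean embedding stays close to the pool mean.
    pool = rng.normal(size=(500, 8))
    full_mean = pool.mean(axis=0)

    budget, chosen = 20, []
    for _ in range(budget):
        best_i, best_err = -1, np.inf
        for i in range(len(pool)):
            if i in chosen:
                continue
            err = np.linalg.norm(pool[chosen + [i]].mean(axis=0) - full_mean)
            if err < best_err:
                best_i, best_err = i, err
        chosen.append(best_i)

    print(best_err)  # distance between subset mean and full-pool mean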
- Sparse Interaction Additive Networks via Feature Interaction Detection and Sparse Selection [10.191597755296163]
We develop a tractable selection algorithm to efficiently identify the necessary feature combinations.
Our proposed Sparse Interaction Additive Networks (SIAN) construct a bridge from simple and interpretable models to fully connected neural networks.
arXiv Detail & Related papers (2022-09-19T19:57:17Z)
- On the Convergence of Distributed Stochastic Bilevel Optimization Algorithms over a Network [55.56019538079826]
Bilevel optimization has been applied to a wide variety of machine learning models.
Most existing algorithms are restricted to the single-machine setting and thus cannot handle distributed data.
We develop novel decentralized bilevel optimization algorithms based on a gradient tracking communication mechanism and two different gradient estimators.
arXiv Detail & Related papers (2022-06-30T05:29:52Z)
- Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning [53.17258888552998]
This work proposes an exploration variant of the basic $Q$-learning protocol with linear function approximation.
We show that the performance of the algorithm degrades very gracefully under a novel and more permissive notion of approximation error.
arXiv Detail & Related papers (2022-06-01T23:26:51Z)
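The paper's specific exploration mechanism and error analysis are not described in this summary; the sketch below shows only the generic underlying object, Q-learning with a linear function approximator Q(s, a) = phi(s, a)^T w, on a toy two-state chain. Epsilon-greedy exploration is an assumed stand-in for the paper's exploration variant.

    import numpy as np

    rng = np.random.default_rng(2)
    n_states, n_actions, d = 2, 2, 4

    def phi(s, a):
        # One-hot feature map: with these features, linear Q-learning is
        # exactly tabular Q-learning, which keeps the toy example simple.
        f = np.zeros(d)
        f[s * n_actions + a] = 1.0
        return f

    def step(s, a):
        # Toy 2-state chain: action 1 moves right; reward 1 for (s=1, a=1).
        s2 = min(s + a, n_states - 1)
        r = 1.0 if (s == 1 and a == 1) else 0.0
        return s2, r

    w, gamma, alpha, eps = np.zeros(d), 0.9, 0.1, 0.1
    s = 0
    for _ in range(5000):
        if rng.random() < eps:  # epsilon-greedy exploration (assumption)
            a = int(rng.integers(n_actions))
        else:
            a = int(np.argmax([phi(s, b) @ w for b in range(n_actions)]))
        s2, r = step(s, a)
        td_target = r + gamma * max(phi(s2, b) @ w for b in range(n_actions))
        w += alpha * (td_target - phi(s, a) @ w) * phi(s, a)
        s = s2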
- A Heuristically Assisted Deep Reinforcement Learning Approach for Network Slice Placement [0.7885276250519428]
We introduce a hybrid placement solution based on Deep Reinforcement Learning (DRL) and a dedicated optimization based on the Power of Two Choices principle.
The proposed Heuristically-Assisted DRL (HA-DRL) accelerates the learning process and improves resource usage compared with other state-of-the-art approaches.
arXiv Detail & Related papers (2021-05-14T10:04:17Z)
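The DRL component is beyond a short snippet, but the named heuristic, the Power of Two Choices principle, is easy to illustrate on its own: for each arriving request, sample two candidate resources at random and place the request on the less loaded one. The load-balancing setting below is an illustrative assumption, not the paper's placement model.

    import numpy as np

    rng = np.random.default_rng(3)

    # Power of Two Choices: picking the less loaded of two random servers
    # keeps the maximum load dramatically closer to the mean than a single
    # uniformly random choice would.
    n_servers, n_requests = 100, 10000
    loads = np.zeros(n_servers, dtype=int)
    for _ in range(n_requests):
        i, j = rng.choice(n_servers, size=2, replace=False)
        loads[i if loads[i] <= loads[j] else j] += 1

    print(loads.max(), loads.mean())  # max load stays close to the mean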
- Deep Multi-Task Learning for Cooperative NOMA: System Design and Principles [52.79089414630366]
We develop a novel deep cooperative NOMA scheme, drawing upon recent advances in deep learning (DL).
We develop a novel hybrid-cascaded deep neural network (DNN) architecture such that the entire system can be optimized in a holistic manner.
arXiv Detail & Related papers (2020-07-27T12:38:37Z)
- Learning the Travelling Salesperson Problem Requires Rethinking Generalization [9.176056742068813]
End-to-end training of neural network solvers for graph optimization problems such as the Travelling Salesperson Problem (TSP) has seen a surge of interest recently.
While state-of-the-art learning-driven approaches perform closely to classical solvers when trained on trivially small sizes, they are unable to generalize the learnt policy to larger instances at practical scales.
This work presents an end-to-end neural optimization pipeline that unifies several recent papers in order to identify the principled biases, model architectures and learning algorithms that promote generalization to instances larger than those seen in training.
arXiv Detail & Related papers (2020-06-12T10:14:15Z)
- Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks [50.42141893913188]
We study distributed algorithms for large-scale AUC maximization with a deep neural network as the predictive model.
Our method requires a much smaller number of communication rounds in theory.
Experiments on several datasets demonstrate the effectiveness of the method and confirm the theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)
- Convergence of End-to-End Training in Deep Unsupervised Contrastive Learning [3.8073142980733]
Unsupervised contrastive learning has proven to be a powerful method for learning representations from unlabeled data.
This study provides theoretical insights into the practical success of these unsupervised methods.
arXiv Detail & Related papers (2020-02-17T14:35:21Z)
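The summary above names no specific objective; as context only, the sketch below computes the standard InfoNCE loss commonly used in unsupervised contrastive learning, for one batch of paired embeddings. All shapes and values are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(4)

    def info_nce(z1, z2, tau=0.1):
        # Standard InfoNCE: each row of z1 should match the same row of z2
        # (its positive) against all other rows in the batch (negatives).
        z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
        z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
        logits = (z1 @ z2.T) / tau  # temperature-scaled cosine similarities
        log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))  # cross-entropy on the diagonal

    z1 = rng.normal(size=(8, 16))              # embeddings of one view
    z2 = z1 + 0.1 * rng.normal(size=(8, 16))   # slightly perturbed positives
    print(info_nce(z1, z2))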
This list is automatically generated from the titles and abstracts of the papers in this site.