Backpropagation-Free Learning Method for Correlated Fuzzy Neural
Networks
- URL: http://arxiv.org/abs/2012.01935v2
- Date: Sat, 20 Nov 2021 17:24:36 GMT
- Title: Backpropagation-Free Learning Method for Correlated Fuzzy Neural
Networks
- Authors: Armin Salimi-Badr and Mohammad Mehdi Ebadzadeh
- Abstract summary: This paper proposes a novel stepwise learning approach based on estimating desired premise parts' outputs.
It does not require backpropagating the output error to learn the premise parts' parameters.
It is successfully applied to real-world time-series prediction and regression problems.
- Score: 2.1320960069210475
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In this paper, a novel stepwise learning approach based on estimating desired
premise parts' outputs by solving a constrained optimization problem is
proposed. This learning approach does not require backpropagating the output
error to learn the premise parts' parameters. Instead, the near-best output
values of the rules' premise parts are estimated, and the premise parameters
are adjusted to reduce the error between the current premise-part outputs and
the estimated desired ones. The proposed learning method therefore avoids
error backpropagation, which leads to vanishing gradients and, consequently,
to getting stuck in local optima. The proposed method also does not require
any initialization procedure. This learning method is used to train a new
Takagi-Sugeno-Kang (TSK) Fuzzy Neural Network whose correlated fuzzy rules
contain many parameters in both the premise and consequent parts, while
avoiding the local optima caused by vanishing gradients. To learn the network
parameters, a constrained optimization problem is first introduced and solved
to estimate the desired values of the premise parts' outputs. Next, the error
between these desired values and the current premise-part outputs is used to
adapt the premise parameters with the gradient-descent (GD) approach.
Afterward, the error between the desired and actual network outputs is used to
learn the consequent parameters, again by GD. The proposed paradigm is
successfully applied to real-world time-series prediction and regression
problems. According to the experimental results, it outperforms other methods
while using a more parsimonious structure.
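The stepwise procedure can be pictured with a small, self-contained sketch. The code below is only an illustration under simplifying assumptions: Gaussian premise memberships, an unnormalized TSK output, and a non-negative least-squares (NNLS) problem standing in for the paper's constrained optimization. The class TinyTSK, the function train_step, and all hyperparameters are hypothetical names chosen for this example, not the authors' implementation.

```python
# Hedged sketch of a backpropagation-free, stepwise TSK training loop.
# The NNLS step below is a generic surrogate for the paper's constrained
# optimization problem; shapes and hyperparameters are illustrative.
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)


class TinyTSK:
    """Toy TSK fuzzy neural network: Gaussian premises, affine consequents."""

    def __init__(self, n_rules, n_inputs):
        self.c = rng.normal(size=(n_rules, n_inputs))      # premise centres
        self.s = np.ones((n_rules, n_inputs))               # premise widths
        self.w = rng.normal(size=(n_rules, n_inputs + 1))   # consequent weights

    def firing(self, x):
        # rule firing strengths f_i(x) = exp(-0.5 * sum_j ((x_j - c_ij) / s_ij)^2)
        d = (x - self.c) / self.s
        return np.exp(-0.5 * np.sum(d * d, axis=1))

    def consequents(self, x):
        # affine TSK consequents g_i(x) = w_i . [x, 1]
        return self.w @ np.append(x, 1.0)

    def predict(self, x):
        # unnormalized TSK output (a simplification of the usual weighted mean)
        return self.firing(x) @ self.consequents(x)


def train_step(net, x, t, lr_premise=0.05, lr_conseq=0.01, keep=1.0):
    """One stepwise update for a single sample (x, t)."""
    f = net.firing(x)
    g = net.consequents(x)
    phi = np.append(x, 1.0)

    # Step 1: estimate desired firing strengths f* >= 0 by constrained
    # least squares:  min (g . f* - t)^2 + keep * ||f* - f||^2,  f* >= 0,
    # stacked into a single NNLS system (stand-in for the paper's problem).
    A = np.vstack([g[None, :], np.sqrt(keep) * np.eye(len(f))])
    b = np.concatenate([[t], np.sqrt(keep) * f])
    f_star, _ = nnls(A, b)

    # Step 2: gradient descent of 0.5 * ||f - f*||^2 w.r.t. centres and
    # widths, so the firing strengths move toward the estimated targets
    # (no backpropagation of the output error through the network).
    err_f = (f - f_star) * f
    dc = err_f[:, None] * (x - net.c) / net.s**2
    ds = err_f[:, None] * (x - net.c)**2 / net.s**3
    net.c -= lr_premise * dc
    net.s -= lr_premise * ds
    net.s = np.clip(net.s, 0.1, None)   # keep widths positive

    # Step 3: gradient descent on the consequent weights from the output error.
    y = net.firing(x) @ net.consequents(x)
    net.w -= lr_conseq * (y - t) * np.outer(net.firing(x), phi)
    return y


# Toy usage: fit y = sin(x1) + 0.5 * x2 on random data.
net = TinyTSK(n_rules=6, n_inputs=2)
X = rng.uniform(-2.0, 2.0, size=(500, 2))
T = np.sin(X[:, 0]) + 0.5 * X[:, 1]
for _ in range(20):
    for x, t in zip(X, T):
        train_step(net, x, t)
```

In this sketch, the weight keep pulls the estimated targets toward the current firing strengths, which keeps the premise updates small and stable; the exact constraints and trade-offs used in the paper may differ.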
Related papers
- Sparse is Enough in Fine-tuning Pre-trained Large Language Models [98.46493578509039]
We propose a gradient-based sparse fine-tuning algorithm, named Sparse Increment Fine-Tuning (SIFT)
We validate its effectiveness on a range of tasks including the GLUE Benchmark and Instruction-tuning.
arXiv Detail & Related papers (2023-12-19T06:06:30Z)
- Adaptive operator learning for infinite-dimensional Bayesian inverse problems [7.716833952167609]
We develop an adaptive operator learning framework that can reduce modeling error gradually by forcing the surrogate to be accurate in local areas.
We present a rigorous convergence guarantee in the linear case using the UKI framework.
The numerical results show that our method can significantly reduce computational costs while maintaining inversion accuracy.
arXiv Detail & Related papers (2023-10-27T01:50:33Z)
- DF2: Distribution-Free Decision-Focused Learning [53.2476224456902]
Decision-focused learning (DFL) has recently emerged as a powerful approach for predict-then-optimize problems.
Existing end-to-end DFL methods are hindered by three significant bottlenecks: model error, sample average approximation error, and distribution-based parameterization of the expected objective.
We present DF2 -- the first distribution-free decision-focused learning method explicitly designed to address these three bottlenecks.
arXiv Detail & Related papers (2023-08-11T00:44:46Z)
- Variational Linearized Laplace Approximation for Bayesian Deep Learning [11.22428369342346]
We propose a new method for approximating Linearized Laplace Approximation (LLA) using a variational sparse Gaussian Process (GP)
Our method is based on the dual RKHS formulation of GPs and retains, as the predictive mean, the output of the original DNN.
It allows for efficient optimization, which results in sub-linear training time in the size of the training dataset.
arXiv Detail & Related papers (2023-02-24T10:32:30Z)
- Learning k-Level Structured Sparse Neural Networks Using Group Envelope Regularization [4.0554893636822]
We introduce a novel approach to deploy large-scale Deep Neural Networks on constrained resources.
The method speeds up inference time and aims to reduce memory demand and power consumption.
arXiv Detail & Related papers (2022-12-25T15:40:05Z)
- Scaling Forward Gradient With Local Losses [117.22685584919756]
Forward learning is a biologically plausible alternative to backprop for learning deep neural networks.
We show that it is possible to substantially reduce the variance of the forward gradient by applying perturbations to activations rather than weights.
Our approach matches backprop on MNIST and CIFAR-10 and significantly outperforms previously proposed backprop-free algorithms on ImageNet.
arXiv Detail & Related papers (2022-10-07T03:52:27Z)
- Adaptive Self-supervision Algorithms for Physics-informed Neural Networks [59.822151945132525]
Physics-informed neural networks (PINNs) incorporate physical knowledge from the problem domain as a soft constraint on the loss function.
We study the impact of the location of the collocation points on the trainability of these models.
We propose a novel adaptive collocation scheme which progressively allocates more collocation points to areas where the model is making higher errors.
arXiv Detail & Related papers (2022-07-08T18:17:06Z)
- AdaTerm: Adaptive T-Distribution Estimated Robust Moments for Noise-Robust Stochastic Gradient Optimization [14.531550983885772]
We propose AdaTerm, a novel approach that incorporates the Student's t-distribution to derive not only the first-order moment but also all associated statistics.
This provides a unified treatment of the optimization process, offering a comprehensive framework under the statistical model of the t-distribution for the first time.
arXiv Detail & Related papers (2022-01-18T03:13:19Z)
- De-homogenization using Convolutional Neural Networks [1.0323063834827415]
This paper presents a deep learning-based de-homogenization method for structural compliance minimization.
For an appropriate choice of parameters, the de-homogenized designs perform within 7-25% of the homogenization-based solution.
arXiv Detail & Related papers (2021-05-10T09:50:06Z)
- Deep learning: a statistical viewpoint [120.94133818355645]
Deep learning has revealed some major surprises from a theoretical perspective.
In particular, simple gradient methods easily find near-optimal solutions to non-convex optimization problems.
We conjecture that specific principles underlie these phenomena.
arXiv Detail & Related papers (2021-03-16T16:26:36Z)
- Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)