Backpropagation-Free Learning Method for Correlated Fuzzy Neural
Networks
- URL: http://arxiv.org/abs/2012.01935v2
- Date: Sat, 20 Nov 2021 17:24:36 GMT
- Title: Backpropagation-Free Learning Method for Correlated Fuzzy Neural
Networks
- Authors: Armin Salimi-Badr and Mohammad Mehdi Ebadzadeh
- Abstract summary: This paper proposes a novel stepwise learning approach based on estimating desired premise parts' outputs.
It does not require backpropagating the output error to learn the premise parts' parameters.
It is successfully applied to real-world time-series prediction and regression problems.
- Score: 2.1320960069210475
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In this paper, a novel stepwise learning approach based on estimating desired
premise parts' outputs by solving a constrained optimization problem is
proposed. This learning approach does not require backpropagating the output
error to learn the premise parts' parameters. Instead, the near-best output
values of the rules' premise parts are estimated, and the premise parameters
are adjusted to reduce the error between the current premise-part outputs and
the estimated desired ones. The proposed learning method therefore avoids
error backpropagation, which leads to vanishing gradients and, consequently,
to getting stuck in local optima. The proposed method also does not require
any initialization procedure. This learning method is used to train a new
Takagi-Sugeno-Kang (TSK) Fuzzy Neural Network whose correlated fuzzy rules
contain many parameters in both the premise and consequent parts, while
avoiding the local optima caused by vanishing gradients. To learn the network
parameters, a constrained optimization problem is first introduced and solved
to estimate the desired values of the premise parts' outputs. Next, the error
between these desired values and the current premise-part outputs is used to
adapt the premise parameters with the gradient-descent (GD) approach.
Afterward, the error between the desired and actual network outputs is used to
learn the consequent parameters, again by GD. The proposed paradigm is
successfully applied to real-world time-series prediction and regression
problems. According to the experimental results, it outperforms other methods
while using a more parsimonious structure.
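The stepwise procedure can be pictured with a small, self-contained sketch. The code below is only an illustration under simplifying assumptions: Gaussian premise memberships, an unnormalized TSK output, and a non-negative least-squares (NNLS) problem standing in for the paper's constrained optimization. The class TinyTSK, the function train_step, and all hyperparameters are hypothetical names chosen for this example, not the authors' implementation.

```python
# Hedged sketch of a backpropagation-free, stepwise TSK training loop.
# The NNLS step below is a generic surrogate for the paper's constrained
# optimization problem; shapes and hyperparameters are illustrative.
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)


class TinyTSK:
    """Toy TSK fuzzy neural network: Gaussian premises, affine consequents."""

    def __init__(self, n_rules, n_inputs):
        self.c = rng.normal(size=(n_rules, n_inputs))      # premise centres
        self.s = np.ones((n_rules, n_inputs))               # premise widths
        self.w = rng.normal(size=(n_rules, n_inputs + 1))   # consequent weights

    def firing(self, x):
        # rule firing strengths f_i(x) = exp(-0.5 * sum_j ((x_j - c_ij) / s_ij)^2)
        d = (x - self.c) / self.s
        return np.exp(-0.5 * np.sum(d * d, axis=1))

    def consequents(self, x):
        # affine TSK consequents g_i(x) = w_i . [x, 1]
        return self.w @ np.append(x, 1.0)

    def predict(self, x):
        # unnormalized TSK output (a simplification of the usual weighted mean)
        return self.firing(x) @ self.consequents(x)


def train_step(net, x, t, lr_premise=0.05, lr_conseq=0.01, keep=1.0):
    """One stepwise update for a single sample (x, t)."""
    f = net.firing(x)
    g = net.consequents(x)
    phi = np.append(x, 1.0)

    # Step 1: estimate desired firing strengths f* >= 0 by constrained
    # least squares:  min (g . f* - t)^2 + keep * ||f* - f||^2,  f* >= 0,
    # stacked into a single NNLS system (stand-in for the paper's problem).
    A = np.vstack([g[None, :], np.sqrt(keep) * np.eye(len(f))])
    b = np.concatenate([[t], np.sqrt(keep) * f])
    f_star, _ = nnls(A, b)

    # Step 2: gradient descent of 0.5 * ||f - f*||^2 w.r.t. centres and
    # widths, so the firing strengths move toward the estimated targets
    # (no backpropagation of the output error through the network).
    err_f = (f - f_star) * f
    dc = err_f[:, None] * (x - net.c) / net.s**2
    ds = err_f[:, None] * (x - net.c)**2 / net.s**3
    net.c -= lr_premise * dc
    net.s -= lr_premise * ds
    net.s = np.clip(net.s, 0.1, None)   # keep widths positive

    # Step 3: gradient descent on the consequent weights from the output error.
    y = net.firing(x) @ net.consequents(x)
    net.w -= lr_conseq * (y - t) * np.outer(net.firing(x), phi)
    return y


# Toy usage: fit y = sin(x1) + 0.5 * x2 on random data.
net = TinyTSK(n_rules=6, n_inputs=2)
X = rng.uniform(-2.0, 2.0, size=(500, 2))
T = np.sin(X[:, 0]) + 0.5 * X[:, 1]
for _ in range(20):
    for x, t in zip(X, T):
        train_step(net, x, t)
```

In this sketch, the weight keep pulls the estimated targets toward the current firing strengths, which keeps the premise updates small and stable; the exact constraints and trade-offs used in the paper may differ.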
Related papers
- Sparse is Enough in Fine-tuning Pre-trained Large Language Models [98.46493578509039]
We propose a gradient-based sparse fine-tuning algorithm, named Sparse Increment Fine-Tuning (SIFT)
We validate its effectiveness on a range of tasks including the GLUE Benchmark and Instruction-tuning.
arXiv Detail & Related papers (2023-12-19T06:06:30Z)
- Adaptive operator learning for infinite-dimensional Bayesian inverse problems [7.716833952167609]
We develop an adaptive operator learning framework that can reduce modeling error gradually by forcing the surrogate to be accurate in local areas.
We present a rigorous convergence guarantee in the linear case using the UKI framework.
The numerical results show that our method can significantly reduce computational costs while maintaining inversion accuracy.
arXiv Detail & Related papers (2023-10-27T01:50:33Z)
- DF2: Distribution-Free Decision-Focused Learning [53.2476224456902]
Decision-focused learning (DFL) has recently emerged as a powerful approach for predict-then-optimize problems.
Existing end-to-end DFL methods are hindered by three significant bottlenecks: model error, sample average approximation error, and distribution-based parameterization of the expected objective.
We present DF2 -- the first distribution-free decision-focused learning method explicitly designed to address these three bottlenecks.
arXiv Detail & Related papers (2023-08-11T00:44:46Z)
- Variational Linearized Laplace Approximation for Bayesian Deep Learning [11.22428369342346]
We propose a new method for approximating Linearized Laplace Approximation (LLA) using a variational sparse Gaussian Process (GP)
Our method is based on the dual RKHS formulation of GPs and retains, as the predictive mean, the output of the original DNN.
It allows for efficient optimization, which results in sub-linear training time in the size of the training dataset.
arXiv Detail & Related papers (2023-02-24T10:32:30Z)
- Learning k-Level Structured Sparse Neural Networks Using Group Envelope Regularization [4.0554893636822]
We introduce a novel approach to deploy large-scale Deep Neural Networks on constrained resources.
The method speeds up inference time and aims to reduce memory demand and power consumption.
arXiv Detail & Related papers (2022-12-25T15:40:05Z)
- Scaling Forward Gradient With Local Losses [117.22685584919756]
Forward learning is a biologically plausible alternative to backprop for learning deep neural networks.
We show that it is possible to substantially reduce the variance of the forward gradient by applying perturbations to activations rather than weights.
Our approach matches backprop on MNIST and CIFAR-10 and significantly outperforms previously proposed backprop-free algorithms on ImageNet.
arXiv Detail & Related papers (2022-10-07T03:52:27Z)
- Adaptive Self-supervision Algorithms for Physics-informed Neural Networks [59.822151945132525]
Physics-informed neural networks (PINNs) incorporate physical knowledge from the problem domain as a soft constraint on the loss function.
We study the impact of the location of the collocation points on the trainability of these models.
We propose a novel adaptive collocation scheme which progressively allocates more collocation points to areas where the model is making higher errors.
arXiv Detail & Related papers (2022-07-08T18:17:06Z)
- AdaTerm: Adaptive T-Distribution Estimated Robust Moments for Noise-Robust Stochastic Gradient Optimization [14.531550983885772]
We propose AdaTerm, a novel approach that incorporates the Student's t-distribution to derive not only the first-order moment but also all associated statistics.
This provides a unified treatment of the optimization process, offering a comprehensive framework under the statistical model of the t-distribution for the first time.
arXiv Detail & Related papers (2022-01-18T03:13:19Z)
- De-homogenization using Convolutional Neural Networks [1.0323063834827415]
This paper presents a deep learning-based de-homogenization method for structural compliance minimization.
For an appropriate choice of parameters, the de-homogenized designs perform within 7-25% of the homogenization-based solution.
arXiv Detail & Related papers (2021-05-10T09:50:06Z)
- Deep learning: a statistical viewpoint [120.94133818355645]
Deep learning has revealed some major surprises from a theoretical perspective.
In particular, simple gradient methods easily find near-optimal solutions to non-convex optimization problems.
We conjecture that specific principles underlie these phenomena.
arXiv Detail & Related papers (2021-03-16T16:26:36Z)
- Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)