Generalised Perceptron Learning
- URL: http://arxiv.org/abs/2012.03642v1
- Date: Mon, 7 Dec 2020 12:49:53 GMT
- Title: Generalised Perceptron Learning
- Authors: Xiaoyu Wang, Martin Benning
- Abstract summary: We present a generalisation of Rosenblatt's traditional perceptron learning algorithm to the class of proximal activation functions.
This interpretation paves the way for many new algorithms, of which we explore a novel variant of the iterative soft-thresholding algorithm for the learning of sparse perceptrons.
- Score: 23.592657600394215
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a generalisation of Rosenblatt's traditional perceptron learning
algorithm to the class of proximal activation functions and demonstrate how
this generalisation can be interpreted as an incremental gradient method
applied to a novel energy function. This novel energy function is based on a
generalised Bregman distance, for which the gradient with respect to the
weights and biases does not require the differentiation of the activation
function. The interpretation as an energy minimisation algorithm paves the way
for many new algorithms, of which we explore a novel variant of the iterative
soft-thresholding algorithm for the learning of sparse perceptrons.
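The abstract names the ingredients but not the exact update rule. Purely as an illustration of those ingredients, the sketch below combines a proximal activation (soft-thresholding, one convenient member of the proximal class), an error-driven incremental update that never differentiates the activation, and an ISTA-style soft-thresholding of the weights to encourage sparsity. The threshold, step size `tau`, and sparsity weight `lam` are illustrative choices, not values taken from the paper.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding: the proximal map of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def generalised_perceptron_step(w, b, x, y, tau=0.05, lam=0.0):
    # Prediction through a proximal activation (soft-thresholding here).
    y_hat = soft_threshold(w @ x + b, 1.0)
    err = y_hat - y
    # Perceptron-style incremental update driven by the prediction error;
    # the activation itself is never differentiated.
    w = w - tau * err * x
    b = b - tau * err
    # ISTA-flavoured extra step: soft-threshold the weights to encourage
    # a sparse perceptron (skipped when lam == 0).
    if lam > 0.0:
        w = soft_threshold(w, tau * lam)
    return w, b

# Toy usage on synthetic data with a sparse ground-truth weight vector.
rng = np.random.default_rng(0)
w_true = rng.normal(size=20) * (rng.random(20) < 0.2)
X = rng.normal(size=(200, 20))
targets = soft_threshold(X @ w_true, 1.0)
w, b = np.zeros(20), 0.0
for _ in range(50):                      # epochs of incremental updates
    for xi, yi in zip(X, targets):
        w, b = generalised_perceptron_step(w, b, xi, yi, tau=0.05, lam=0.05)
```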
Related papers
- NeuralGrok: Accelerate Grokking by Neural Gradient Transformation [54.65707216563953]
We propose NeuralGrok, a gradient-based approach that learns an optimal gradient transformation to accelerate generalization of transformers in arithmetic tasks.
Our experiments demonstrate that NeuralGrok significantly accelerates generalization, particularly in challenging arithmetic tasks.
We also show that NeuralGrok promotes a more stable training paradigm, constantly reducing the model's complexity.
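The summary does not spell out the transformation itself. As a rough, hypothetical sketch, a gradient transformation of this kind slots in between backpropagation and the optimiser step; in a NeuralGrok-style method the transformation's own parameters would be meta-learned (not shown here), while `transform_scales` below is only a fixed stand-in.

```python
import numpy as np

def transformed_sgd_step(params, grads, transform_scales, lr=1e-2):
    """Hypothetical sketch: apply a parameterised transformation to each
    gradient before the usual SGD step. The transformation here is a fixed
    elementwise scaling; a learned transform would replace it."""
    return [p - lr * (s * g) for p, g, s in zip(params, grads, transform_scales)]
```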
arXiv Detail & Related papers (2025-04-24T04:41:35Z)
- Convergence of energy-based learning in linear resistive networks [2.9248916859490173]
Energy-based learning algorithms are well-suited to distributed implementations in analog electronic devices.
We make a first step in this direction by analysing a particular energy-based learning algorithm, Contrastive Learning, applied to a network of linear adjustable resistors.
arXiv Detail & Related papers (2025-03-01T04:47:02Z)
- The Differentiable Feasibility Pump [49.55771920271201]
This paper shows that the traditional feasibility pump and many of its follow-ups can be seen as gradient-descent algorithms with specific parameters.
A central aspect of this reinterpretation is observing that the traditional algorithm differentiates the solution of the linear relaxation with respect to its cost.
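For context, the traditional feasibility pump mentioned here alternates between solving an LP relaxation and rounding. Below is a bare-bones sketch of that classical loop (no anti-cycling perturbations, and not the paper's differentiable variant); it assumes the LP relaxation is feasible.

```python
import numpy as np
from scipy.optimize import linprog

def feasibility_pump(c, A_ub, b_ub, max_iters=50):
    """Classical feasibility pump for a pure 0/1 program
    min c^T x  s.t.  A_ub @ x <= b_ub,  x in {0,1}^n."""
    n = len(c)
    bounds = [(0, 1)] * n
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    x_lp = res.x
    x_int = np.round(x_lp)
    for _ in range(max_iters):
        if np.allclose(x_lp, x_int, atol=1e-6):
            return x_int                      # integral and LP-feasible
        # L1 distance to the rounded point: cost +1 where rounded to 0,
        # -1 where rounded to 1 (constant offset dropped).
        dist_cost = np.where(x_int > 0.5, -1.0, 1.0)
        res = linprog(dist_cost, A_ub=A_ub, b_ub=b_ub,
                      bounds=bounds, method="highs")
        x_lp = res.x
        x_int = np.round(x_lp)
    return x_int  # may still be infeasible if the pump cycled
```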
arXiv Detail & Related papers (2024-11-05T22:26:51Z)
- Bregman-divergence-based Arimoto-Blahut algorithm [53.64687146666141]
We generalize the Arimoto-Blahut algorithm to a general function defined over Bregman-divergence system.
We propose a convex-optimization-free algorithm that can be applied to classical and quantum rate-distortion theory.
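As background, the classical (KL-divergence) Arimoto-Blahut iteration for channel capacity looks as follows; the paper generalises this alternating scheme to Bregman divergences, which is not reproduced here.

```python
import numpy as np

def arimoto_blahut_capacity(P, iters=200):
    """Classical Arimoto-Blahut iteration for the capacity (in nats) of a
    discrete memoryless channel with transition matrix P[x, y] = P(y|x)."""
    n_x = P.shape[0]
    r = np.full(n_x, 1.0 / n_x)                    # input distribution

    def relative_entropies(r):
        q_y = r @ P                                # output marginal
        with np.errstate(divide="ignore", invalid="ignore"):
            terms = np.where(P > 0, P * np.log(P / q_y), 0.0)
        return terms.sum(axis=1)                   # D(P(.|x) || q) per input

    for _ in range(iters):
        d = relative_entropies(r)
        r = r * np.exp(d)
        r /= r.sum()
    return r, float(r @ relative_entropies(r))

# Binary symmetric channel with crossover probability 0.1:
# capacity comes out near 0.531 bits with a uniform input distribution.
P_bsc = np.array([[0.9, 0.1], [0.1, 0.9]])
r_opt, C_nats = arimoto_blahut_capacity(P_bsc)
print(r_opt, C_nats / np.log(2))
```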
arXiv Detail & Related papers (2024-08-10T06:16:24Z)
- Distributional Bellman Operators over Mean Embeddings [37.5480897544168]
We propose a novel framework for distributional reinforcement learning, based on learning finite-dimensional mean embeddings of return distributions.
We derive several new algorithms for dynamic programming and temporal-difference learning based on this framework.
arXiv Detail & Related papers (2023-12-09T11:36:14Z)
- A Compound Gaussian Least Squares Algorithm and Unrolled Network for Linear Inverse Problems [1.283555556182245]
This paper develops two new approaches to solving linear inverse problems.
The first is an iterative algorithm that minimizes a regularized least squares objective function.
The second is a deep neural network that corresponds to an "unrolling" or "unfolding" of the iterative algorithm.
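The summary does not give the algorithm itself; purely to illustrate what "unrolling" an iterative least-squares solver into network layers means, here is a generic sketch using an l1-regularised problem (the paper works with a compound Gaussian prior instead).

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal map of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def unrolled_ista(A, y, step_sizes, thresholds):
    """Generic illustration of 'unrolling': each pass of the loop is one
    layer of the resulting network, and the per-layer step sizes and
    thresholds play the role of learnable parameters."""
    x = np.zeros(A.shape[1])
    for tau, t in zip(step_sizes, thresholds):
        x = soft_threshold(x - tau * (A.T @ (A @ x - y)), t)
    return x
```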
arXiv Detail & Related papers (2023-05-18T17:05:09Z)
- Exploring the role of parameters in variational quantum algorithms [59.20947681019466]
We introduce a quantum-control-inspired method for the characterization of variational quantum circuits using the rank of the dynamical Lie algebra.
A promising connection is found between the Lie rank, the accuracy of calculated energies, and the requisite depth to attain target states via a given circuit architecture.
arXiv Detail & Related papers (2022-09-28T20:24:53Z)
- On the Activation Function Dependence of the Spectral Bias of Neural Networks [0.0]
We study the phenomenon from the point of view of the spectral bias of neural networks.
We provide a theoretical explanation for the spectral bias of ReLU neural networks by leveraging connections with the theory of finite element methods.
We show that neural networks with the Hat activation function are trained significantly faster using gradient descent and ADAM.
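For reference, the "Hat" activation referred to here is the piecewise-linear finite-element basis function; one common normalisation (the paper's exact scaling may differ) can be written as a combination of ReLUs.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def hat(x):
    """Piecewise-linear finite-element 'hat' function: 0 outside [0, 2],
    rising linearly to 1 at x = 1 and back down to 0 at x = 2."""
    return relu(x) - 2.0 * relu(x - 1.0) + relu(x - 2.0)
```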
arXiv Detail & Related papers (2022-08-09T17:40:57Z)
- Active Learning for Transition State Calculation [3.399187058548169]
Transition state (TS) calculation is a grand challenge when the underlying energy function is computationally intensive.
To reduce the number of expensive computations of the true gradients, we propose an active learning framework.
We show that the new method significantly decreases the required number of energy or force evaluations of the original model.
arXiv Detail & Related papers (2021-08-10T13:57:31Z)
- Algorithm for initializing a generalized fermionic Gaussian state on a quantum computer [0.0]
We present explicit expressions for the central piece of a variational method developed by Shi et al.
We derive iterative analytical expressions for the evaluation of expectation values of products of fermionic creation and annihilation operators.
We present a simple gradient-descent-based algorithm that can be used for optimization in combination with imaginary time evolution.
arXiv Detail & Related papers (2021-05-27T10:31:45Z)
- Learning Frequency Domain Approximation for Binary Neural Networks [68.79904499480025]
We propose to estimate the gradient of sign function in the Fourier frequency domain using the combination of sine functions for training BNNs.
The experiments on several benchmark datasets and neural architectures illustrate that the binary network learned using our method achieves the state-of-the-art accuracy.
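The idea named here, a Fourier-series surrogate for the gradient of the sign function, can be sketched as follows: the forward pass still binarises, while the backward pass uses the derivative of a truncated sine-series (square-wave) approximation. The parameter choices below are illustrative, not the paper's settings.

```python
import numpy as np

def sign_forward(x):
    """Binarisation used in the forward pass of a binary network."""
    return np.where(np.asarray(x) >= 0, 1.0, -1.0)

def sine_series_sign_grad(x, n_terms=10, period=2.0):
    """Surrogate gradient for sign: the derivative of a truncated Fourier
    sine series of a square wave with the given period."""
    x = np.asarray(x, dtype=float)
    half_period = period / 2.0
    grad = np.zeros_like(x)
    for k in range(1, 2 * n_terms, 2):            # odd harmonics only
        grad += (4.0 / half_period) * np.cos(k * np.pi * x / half_period)
    return grad

# In a straight-through-style update, the upstream gradient would be
# multiplied by sine_series_sign_grad(pre_activation) instead of by the
# true (almost-everywhere zero) derivative of sign_forward.
```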
arXiv Detail & Related papers (2021-03-01T08:25:26Z)
- Evolving Reinforcement Learning Algorithms [186.62294652057062]
We propose a method for meta-learning reinforcement learning algorithms.
The learned algorithms are domain-agnostic and can generalize to new environments not seen during training.
We highlight two learned algorithms which obtain good generalization performance over other classical control tasks, gridworld type tasks, and Atari games.
arXiv Detail & Related papers (2021-01-08T18:55:07Z)
- Learned Block Iterative Shrinkage Thresholding Algorithm for Photothermal Super Resolution Imaging [52.42007686600479]
We propose a learned block-sparse optimization approach using an iterative algorithm unfolded into a deep neural network.
We show the benefits of using a learned block iterative shrinkage thresholding algorithm that is able to learn the choice of regularization parameters.
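As context for this entry, one plain (non-learned) block iterative shrinkage-thresholding scheme is sketched below; in the learned, unfolded version each iteration becomes a network layer and the step size and regularization parameter become trainable per-layer parameters. All concrete values here are illustrative defaults.

```python
import numpy as np

def block_soft_threshold(z, t, block_size):
    """Group (l2,1) shrinkage: shrink each block of `block_size` entries
    toward zero by its Euclidean norm. Assumes len(z) is a multiple of
    block_size."""
    blocks = z.reshape(-1, block_size)
    norms = np.linalg.norm(blocks, axis=1, keepdims=True)
    scale = np.maximum(1.0 - t / np.maximum(norms, 1e-12), 0.0)
    return (scale * blocks).reshape(-1)

def block_ista(A, y, lam=0.1, block_size=4, iters=100):
    """Plain block-ISTA for min_x 0.5*||Ax - y||^2 + lam * sum_g ||x_g||_2."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2        # 1 / Lipschitz constant
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - y)
        x = block_soft_threshold(x - step * grad, step * lam, block_size)
    return x
```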
arXiv Detail & Related papers (2020-12-07T09:27:16Z)