Unlocking the Potential of Similarity Matching: Scalability, Supervision
and Pre-training
- URL: http://arxiv.org/abs/2308.02427v1
- Date: Wed, 2 Aug 2023 20:34:55 GMT
- Title: Unlocking the Potential of Similarity Matching: Scalability, Supervision
and Pre-training
- Authors: Yanis Bahroun, Shagesh Sridharan, Atithi Acharya, Dmitri B.
Chklovskii, Anirvan M. Sengupta
- Abstract summary: The backpropagation (BP) algorithm exhibits limitations in terms of biological plausibility, computational cost, and suitability for online learning.
This study focuses on the primarily unsupervised similarity matching (SM) framework, which aligns with observed mechanisms in biological systems.
- Score: 9.160910754837754
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While effective, the backpropagation (BP) algorithm exhibits limitations in
terms of biological plausibility, computational cost, and suitability for
online learning. As a result, there has been a growing interest in developing
alternative biologically plausible learning approaches that rely on local
learning rules. This study focuses on the primarily unsupervised similarity
matching (SM) framework, which aligns with observed mechanisms in biological
systems and offers online, localized, and biologically plausible algorithms. i)
To scale SM to large datasets, we propose an implementation of Convolutional
Nonnegative SM using PyTorch. ii) We introduce a localized supervised SM
objective, reminiscent of canonical correlation analysis, which facilitates the
stacking of SM layers. iii) We leverage the PyTorch implementation to pre-train
architectures such as LeNet and compare the resulting features against those of
BP-trained models. This work combines biologically plausible algorithms with
computational efficiency, opening multiple avenues for further exploration.
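To make point (i) concrete, below is a minimal PyTorch sketch of a convolutional nonnegative similarity-matching loss: the batch objective matches the output Gram matrix to the input Gram matrix, $\min_{\{y_t \ge 0\}} \sum_{t,t'} (x_t^\top x_{t'} - y_t^\top y_{t'})^2$, with nonnegativity enforced by a ReLU. The layer shapes, the training loop, and the use of autograd instead of the framework's local learning rules are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: a batch similarity-matching loss via autograd,
# not the authors' local-learning-rule implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvNSM(nn.Module):
    """A convolutional layer whose nonnegative outputs are trained to
    reproduce the pairwise similarity structure of its inputs."""
    def __init__(self, in_channels, out_channels, kernel_size):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size, bias=False)

    def forward(self, x):
        # ReLU enforces the nonnegativity constraint on the representations.
        return F.relu(self.conv(x))

def sm_loss(x, y):
    """Match the Gram matrix of outputs y to that of inputs x over a batch."""
    xf = x.flatten(start_dim=1)
    yf = y.flatten(start_dim=1)
    return ((xf @ xf.T - yf @ yf.T) ** 2).mean()

# Usage: one unsupervised pre-training step on an MNIST-sized batch.
layer = ConvNSM(in_channels=1, out_channels=16, kernel_size=5)
opt = torch.optim.Adam(layer.parameters(), lr=1e-3)
x = torch.rand(32, 1, 28, 28)
loss = sm_loss(x, layer(x))
loss.backward()
opt.step()
```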
Related papers
- Provably Efficient Representation Learning with Tractable Planning in
Low-Rank POMDP [81.00800920928621]
We study representation learning in partially observable Markov Decision Processes (POMDPs).
We first present an algorithm for decodable POMDPs that combines maximum likelihood estimation (MLE) and optimism in the face of uncertainty (OFU).
We then show how to adapt this algorithm to also work in the broader class of $\gamma$-observable POMDPs.
arXiv Detail & Related papers (2023-06-21T16:04:03Z)
- The Cascaded Forward Algorithm for Neural Network Training [61.06444586991505]
We propose a new learning framework for neural networks, the Cascaded Forward (CaFo) algorithm, which, like the Forward-Forward (FF) algorithm, does not rely on BP optimization.
Unlike FF, our framework directly outputs label distributions at each cascaded block, which does not require generation of additional negative samples.
In our framework each block can be trained independently, so it can be easily deployed into parallel acceleration systems.
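The block-local training idea can be sketched in a few lines of PyTorch. The block architecture, predictor heads, and optimizers below are illustrative assumptions, not the CaFo authors' design:

```python
import torch
import torch.nn as nn

# Two cascaded blocks, each with its own label predictor and optimizer.
blocks = nn.ModuleList([
    nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU()),
    nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU()),
])
heads = nn.ModuleList([
    nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10)),
    nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 10)),
])
opts = [torch.optim.Adam(list(b.parameters()) + list(h.parameters()), lr=1e-3)
        for b, h in zip(blocks, heads)]
criterion = nn.CrossEntropyLoss()

x = torch.rand(8, 1, 28, 28)
y = torch.randint(0, 10, (8,))
for block, head, opt in zip(blocks, heads, opts):
    opt.zero_grad()
    feats = block(x)
    loss = criterion(head(feats), y)   # block-local loss on label outputs
    loss.backward()
    opt.step()
    x = feats.detach()   # stop gradients: the next block trains on its own
```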
arXiv Detail & Related papers (2023-03-17T02:01:11Z)
- Stabilizing Q-learning with Linear Architectures for Provably Efficient
Learning [53.17258888552998]
This work proposes an exploration variant of the basic $Q$-learning protocol with linear function approximation.
We show that the performance of the algorithm degrades very gracefully under a novel and more permissive notion of approximation error.
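As a point of reference for the setting studied here, a bare-bones sketch of $Q$-learning with linear function approximation follows; the feature map, toy environment, and plain $\epsilon$-greedy exploration are illustrative stand-ins, not the paper's exploration protocol:

```python
import numpy as np

n_features, n_actions = 4, 2
w = np.zeros((n_actions, n_features))  # one linear Q head per action
alpha, gamma, eps = 0.1, 0.99, 0.1
rng = np.random.default_rng(0)

def phi(state):
    # Toy feature map; the analysis assumes Q is close to linear in phi.
    return np.array([1.0, state, state ** 2, np.sin(state)])

state = 0.0
for step in range(1000):
    feats = phi(state)
    q = w @ feats                      # Q(state, a) for every action a
    if rng.random() < eps:             # epsilon-greedy stand-in for exploration
        action = int(rng.integers(n_actions))
    else:
        action = int(np.argmax(q))
    # Toy dynamics and reward, just to exercise the update rule.
    next_state = float(np.clip(state + (0.1 if action == 1 else -0.1), -1.0, 1.0))
    reward = -abs(next_state)
    # TD(0) target and gradient step on the chosen action's weights.
    td_target = reward + gamma * np.max(w @ phi(next_state))
    w[action] += alpha * (td_target - q[action]) * feats
    state = next_state
```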
arXiv Detail & Related papers (2022-06-01T23:26:51Z)
- BioLeaF: A Bio-plausible Learning Framework for Training of Spiking
Neural Networks [4.698975219970009]
We propose a new bio-plausible learning framework consisting of two components: a new architecture, and its supporting learning rules.
Under our microcircuit architecture, we employ the Spike-Timing-Dependent Plasticity (STDP) rule operating in local compartments to update synaptic weights.
Our experiments show that the proposed framework demonstrates learning accuracy comparable to BP-based rules.
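For readers unfamiliar with STDP, the pairwise rule used in such local compartments can be sketched as follows; the amplitudes and time constants are illustrative assumptions, not BioLeaF's parameters:

```python
import numpy as np

A_plus, A_minus = 0.01, 0.012     # potentiation / depression amplitudes
tau_plus, tau_minus = 20.0, 20.0  # time constants in ms

def stdp_dw(t_pre, t_post):
    """Weight change for one pre/post spike pair (times in ms):
    potentiate when the presynaptic spike precedes the postsynaptic one,
    depress otherwise."""
    dt = t_post - t_pre
    if dt > 0:
        return A_plus * np.exp(-dt / tau_plus)
    if dt < 0:
        return -A_minus * np.exp(dt / tau_minus)
    return 0.0

# Usage: accumulate the update over all spike pairs seen at one synapse.
pre_spikes, post_spikes = [10.0, 30.0], [12.0, 25.0]
dw = sum(stdp_dw(tp, tq) for tp in pre_spikes for tq in post_spikes)
```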
arXiv Detail & Related papers (2021-11-14T10:32:22Z)
- An LSTM-based Plagiarism Detection via Attention Mechanism and a
Population-based Approach for Pre-Training Parameters with Imbalanced Classes [1.9949261242626626]
This paper proposes an architecture based on a Long Short-Term Memory (LSTM) and attention mechanism called LSTM-AM-ABC.
The proposed algorithm can simultaneously find initial parameter values for the LSTM, the attention mechanism, and the feed-forward neural network.
arXiv Detail & Related papers (2021-10-17T09:20:03Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- CNN-based Approaches For Cross-Subject Classification in Motor Imagery:
From The State-of-The-Art to DynamicNet [0.2936007114555107]
Motor imagery (MI)-based brain-computer interface (BCI) systems are being increasingly employed to provide alternative means of communication and control.
Accurately classifying MI from brain signals is essential to obtain reliable BCI systems.
Deep learning approaches have started to emerge as valid alternatives to standard machine learning techniques.
arXiv Detail & Related papers (2021-05-17T14:57:13Z)
- A Trainable Optimal Transport Embedding for Feature Aggregation and its
Relationship to Attention [96.77554122595578]
We introduce a parametrized representation of fixed size, which embeds and then aggregates elements from a given input set according to the optimal transport plan between the set and a trainable reference.
Our approach scales to large datasets and allows end-to-end training of the reference, while also providing a simple unsupervised learning mechanism with small computational cost.
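The aggregation step can be sketched with entropic optimal transport; the Sinkhorn solver, uniform marginals, and shapes below are illustrative assumptions, not the paper's exact embedding:

```python
import torch

def sinkhorn_plan(cost, eps=0.1, n_iters=50):
    """Entropic OT plan between uniform marginals, given a cost matrix."""
    n, m = cost.shape
    K = torch.exp(-cost / eps)
    a = torch.full((n,), 1.0 / n)   # uniform weights on the input set
    b = torch.full((m,), 1.0 / m)   # uniform weights on the reference
    v = torch.ones(m)
    for _ in range(n_iters):        # Sinkhorn fixed-point iterations
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]   # transport plan of shape (n, m)

# An input set of n vectors and a trainable reference of m vectors.
n, m, d = 12, 4, 8
x = torch.randn(n, d)
ref = torch.nn.Parameter(torch.randn(m, d))   # learned end-to-end

plan = sinkhorn_plan(torch.cdist(x, ref) ** 2)
# Fixed-size output: each reference element pools the input vectors
# transported to it, giving an (m, d) embedding for any set size n.
embedding = m * plan.T @ x
```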
arXiv Detail & Related papers (2020-06-22T08:35:58Z)
- Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, truncated max-product belief propagation, and add what is necessary to make it a proper component of a deep learning model.
This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs).
The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
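As a rough illustration of why such a layer is trainable, here is a single min-sum (max-product in the log domain) sweep over a 1D chain; the truncated-linear pairwise cost and the single pass are simplifying assumptions, not the BP-Layer itself:

```python
import torch

def bp_chain_sweep(unary, pairwise_weight=1.0):
    """One min-sum sweep along a 1D chain.
    unary: (T, L) costs for T sites and L labels; returns refined costs."""
    T, L = unary.shape
    labels = torch.arange(L, dtype=unary.dtype)
    # Truncated-linear pairwise cost penalizing label jumps between neighbors.
    pairwise = pairwise_weight * torch.clamp(
        (labels[:, None] - labels[None, :]).abs(), max=2.0)
    beliefs = [unary[0]]
    for t in range(1, T):
        # Message to site t: best predecessor label plus transition cost.
        msg = (beliefs[-1][:, None] + pairwise).min(dim=0).values
        beliefs.append(unary[t] + msg)
    return torch.stack(beliefs)

# Usage: the sweep is differentiable, so it can sit inside a CNN.
unary = torch.randn(10, 5, requires_grad=True)
out = bp_chain_sweep(unary)
out.sum().backward()   # gradients flow back through the min operations
```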
arXiv Detail & Related papers (2020-03-13T13:11:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.