Hebbian Semi-Supervised Learning in a Sample Efficiency Setting
- URL: http://arxiv.org/abs/2103.09002v1
- Date: Tue, 16 Mar 2021 11:57:52 GMT
- Title: Hebbian Semi-Supervised Learning in a Sample Efficiency Setting
- Authors: Gabriele Lagani, Fabrizio Falchi, Claudio Gennaro, Giuseppe Amato
- Abstract summary: We propose a semi-supervised training strategy for Deep Convolutional Neural Networks (DCNN).
All internal layers (both convolutional and fully connected) are pre-trained using an unsupervised approach based on Hebbian learning.
- Score: 10.026753669198108
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We propose to address the issue of sample efficiency, in Deep Convolutional
Neural Networks (DCNN), with a semisupervised training strategy that combines
Hebbian learning with gradient descent: all internal layers (both convolutional
and fully connected) are pre-trained using an unsupervised approach based on
Hebbian learning, and the last fully connected layer (the classification layer)
is trained using Stochastic Gradient Descent (SGD). In fact, as Hebbian learning is an
unsupervised learning method, its potential lies in the possibility of training
the internal layers of a DCNN without labeled examples. Only the final fully
connected layer has to be trained with labeled examples. We performed
experiments on various object recognition datasets, in different regimes of
sample efficiency, comparing our semi-supervised (Hebbian for internal layers +
SGD for the final fully connected layer) approach with end-to-end supervised
backpropagation training. The results show that, in regimes where the number of
available labeled samples is low, our semi-supervised approach outperforms full
backpropagation in almost all the cases.
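The pipeline above boils down to two phases: unsupervised Hebbian updates for all internal layers on unlabeled images, followed by SGD training of only the final classification layer on the labeled subset. The sketch below is a minimal illustration under assumptions, not the authors' implementation: it uses a simple winner-take-all competitive Hebbian rule (the paper evaluates its own Hebbian learning variants), and the names `HebbNet`, `hebbian_wta_step`, and `train_semisupervised` are hypothetical.

```python
# Minimal sketch of the two-phase scheme described above (hypothetical names, not the
# authors' code): internal layers receive an unsupervised winner-take-all (WTA)
# Hebbian update on unlabeled images, then only the final fully connected classifier
# is trained with SGD on the few labeled examples.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HebbNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 32, kernel_size=5, bias=False)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, bias=False)
        self.pool = nn.MaxPool2d(2)
        self.classifier = nn.Linear(64 * 6 * 6, num_classes)  # sized for 32x32 inputs

    def features(self, x):
        x = self.pool(torch.relu(self.conv1(x)))
        x = self.pool(torch.relu(self.conv2(x)))
        return x.flatten(1)

    def forward(self, x):
        return self.classifier(self.features(x))

@torch.no_grad()
def hebbian_wta_step(conv, x, lr=0.01):
    """Competitive (WTA) Hebbian update: each winning filter moves toward the
    mean of the input patches it won; no labels and no gradients involved."""
    w = conv.weight                                              # (out, in, k, k)
    patches = F.unfold(x, conv.kernel_size, stride=conv.stride)  # (B, in*k*k, L)
    patches = patches.transpose(1, 2).reshape(-1, w[0].numel())  # (B*L, in*k*k)
    scores = patches @ w.flatten(1).t()                          # similarity to each filter
    winners = scores.argmax(dim=1)                               # one winner per patch
    for j in range(w.size(0)):
        mask = winners == j
        if mask.any():
            target = patches[mask].mean(dim=0).view_as(w[j])
            w[j] += lr * (target - w[j])

def train_semisupervised(model, unlabeled_loader, labeled_loader, epochs=5):
    # Phase 1: unsupervised Hebbian pre-training of the internal (conv) layers.
    for x, _ in unlabeled_loader:             # labels, if present, are ignored here
        hebbian_wta_step(model.conv1, x)
        with torch.no_grad():
            h1 = model.pool(torch.relu(model.conv1(x)))
        hebbian_wta_step(model.conv2, h1)

    # Phase 2: supervised SGD training of the final classifier only, on labeled data.
    opt = torch.optim.SGD(model.classifier.parameters(), lr=0.1, momentum=0.9)
    for _ in range(epochs):
        for x, y in labeled_loader:
            with torch.no_grad():
                feats = model.features(x)      # frozen Hebbian features
            loss = F.cross_entropy(model.classifier(feats), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
```

In a sample-efficiency regime, `unlabeled_loader` would iterate over the whole training set without using its labels, while `labeled_loader` would contain only the small labeled fraction used to train the classifier.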
Related papers
- Initialization Matters: On the Benign Overfitting of Two-Layer ReLU CNN with Fully Trainable Layers [20.25049261035324]
We extend the analysis to two-layer ReLU convolutional neural networks (CNNs) with fully trainable layers.
Our results show that the scaling of the output layer is crucial to the training dynamics.
In both settings, we provide nearly matching upper and lower bounds on the test errors.
arXiv Detail & Related papers (2024-10-24T20:15:45Z)
- Enhancing Out-of-Distribution Detection with Multitesting-based Layer-wise Feature Fusion [11.689517005768046]
Out-of-distribution samples may exhibit shifts in local or global features compared to the training distribution.
We propose a novel framework, Multitesting-based Layer-wise Out-of-Distribution (OOD) Detection.
Our scheme effectively enhances the performance of out-of-distribution detection when compared to baseline methods.
arXiv Detail & Related papers (2024-03-16T04:35:04Z)
- Learning with Noisy Labels Using Collaborative Sample Selection and Contrastive Semi-Supervised Learning [76.00798972439004]
Collaborative Sample Selection (CSS) removes noisy samples from the identified clean set.
We introduce a co-training mechanism with a contrastive loss in semi-supervised learning.
arXiv Detail & Related papers (2023-10-24T05:37:20Z)
- Multi-Level Contrastive Learning for Dense Prediction Task [59.591755258395594]
We present Multi-Level Contrastive Learning for Dense Prediction Task (MCL), an efficient self-supervised method for learning region-level feature representation for dense prediction tasks.
Our method is motivated by the three key factors in detection: localization, scale consistency and recognition.
Our method consistently outperforms the recent state-of-the-art methods on various datasets with significant margins.
arXiv Detail & Related papers (2023-04-04T17:59:04Z)
- WLD-Reg: A Data-dependent Within-layer Diversity Regularizer [98.78384185493624]
Neural networks are composed of multiple layers arranged in a hierarchical structure and jointly trained with gradient-based optimization.
We propose to complement this traditional 'between-layer' feedback with additional 'within-layer' feedback to encourage the diversity of the activations within the same layer.
We present an extensive empirical study confirming that the proposed approach enhances the performance of several state-of-the-art neural network models in multiple tasks.
arXiv Detail & Related papers (2023-01-03T20:57:22Z)
- Improved Convergence Guarantees for Shallow Neural Networks [91.3755431537592]
We prove convergence of depth 2 neural networks, trained via gradient descent, to a global minimum.
Our model has the following features: regression with a quadratic loss function, a fully connected feedforward architecture, ReLU activations, Gaussian data instances, and adversarial labels.
These results strongly suggest that, at least in our model, the convergence phenomenon extends well beyond the NTK regime.
arXiv Detail & Related papers (2022-12-05T14:47:52Z)
- HAVANA: Hard negAtiVe sAmples aware self-supervised coNtrastive leArning for Airborne laser scanning point clouds semantic segmentation [9.310873951428238]
This work proposes a hard-negative sample aware self-supervised contrastive learning method to pre-train the model for semantic segmentation.
The results obtained by the proposed HAVANA method still exceed 94% of the performance of the supervised paradigm trained with the full training set.
arXiv Detail & Related papers (2022-10-19T15:05:17Z)
- Deep Features for CBIR with Scarce Data using Hebbian Learning [17.57322804741561]
We study the performance of biologically inspired Hebbian learning algorithms in the development of feature extractors for Content Based Image Retrieval (CBIR) tasks.
Specifically, we consider a semi-supervised learning strategy in two steps: first, an unsupervised pre-training stage; second, the network is fine-tuned on the image dataset.
arXiv Detail & Related papers (2022-05-18T14:00:54Z)
- Learning Low-rank Deep Neural Networks via Singular Vector Orthogonality Regularization and Singular Value Sparsification [53.50708351813565]
We propose SVD training, the first method to explicitly achieve low-rank DNNs during training without applying SVD on every step.
We empirically show that SVD training can significantly reduce the rank of DNN layers and achieve higher reduction on computation load under the same accuracy.
arXiv Detail & Related papers (2020-04-20T02:40:43Z)
- Embedding Propagation: Smoother Manifold for Few-Shot Classification [131.81692677836202]
We propose to use embedding propagation as an unsupervised non-parametric regularizer for manifold smoothing in few-shot classification.
We empirically show that embedding propagation yields a smoother embedding manifold.
We show that embedding propagation consistently improves the accuracy of the models in multiple semi-supervised learning scenarios by up to 16 percentage points.
arXiv Detail & Related papers (2020-03-09T13:51:09Z)
- Improve SGD Training via Aligning Mini-batches [22.58823484394866]
In-Training Distribution Matching (ITDM) is proposed to improve deep neural networks (DNNs) training and reduce overfitting.
Specifically, ITDM regularizes the feature extractor by matching the moments of distributions of different mini-batches in each iteration of SGD.
arXiv Detail & Related papers (2020-02-23T15:10:59Z)
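As a rough illustration of the moment-matching idea in the last entry above (ITDM), the sketch below adds a penalty on the difference between the feature statistics of two mini-batches to the standard classification loss at each SGD step. This is an assumption-laden sketch rather than the paper's method: only the first two moments are matched, and the names `moment_matching_penalty`, `itdm_style_step`, and `lam` are hypothetical.

```python
import torch.nn.functional as F

def moment_matching_penalty(feat_a, feat_b):
    """Penalize differences in the first two moments (mean, variance) of two
    mini-batch feature sets; a stand-in for the distribution matching used by ITDM."""
    mean_gap = (feat_a.mean(dim=0) - feat_b.mean(dim=0)).pow(2).sum()
    var_gap = (feat_a.var(dim=0) - feat_b.var(dim=0)).pow(2).sum()
    return mean_gap + var_gap

def itdm_style_step(encoder, classifier, opt, batch_a, batch_b, lam=0.1):
    """One SGD step on two mini-batches; opt should cover encoder and classifier params."""
    (xa, ya), (xb, yb) = batch_a, batch_b
    fa, fb = encoder(xa), encoder(xb)
    loss = F.cross_entropy(classifier(fa), ya) + F.cross_entropy(classifier(fb), yb)
    loss = loss + lam * moment_matching_penalty(fa, fb)  # align mini-batch feature distributions
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

In practice, `batch_a` and `batch_b` would simply be two mini-batches drawn in the same SGD iteration, so that their feature distributions are pulled toward each other as training proceeds.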