LowDINO -- A Low Parameter Self Supervised Learning Model
- URL: http://arxiv.org/abs/2305.17791v1
- Date: Sun, 28 May 2023 18:34:59 GMT
- Title: LowDINO -- A Low Parameter Self Supervised Learning Model
- Authors: Sai Krishna Prathapaneni, Shvejan Shashank and Srikar Reddy K
- Abstract summary: This research aims to explore the possibility of designing a neural network architecture that allows small networks to adopt the properties of huge networks.
Previous studies have shown that using convolutional neural networks (ConvNets) can provide inherent inductive bias.
To reduce the number of parameters, attention mechanisms are incorporated through MobileViT blocks.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This research aims to explore the possibility of designing a neural network
architecture that allows small networks to adopt the properties of huge
networks, which have shown success in self-supervised learning (SSL), for all
the downstream tasks like image classification, segmentation, etc. Previous
studies have shown that using convolutional neural networks (ConvNets) can
provide inherent inductive bias, which is crucial for learning representations
in deep learning models. To reduce the number of parameters, attention
mechanisms are incorporated through MobileViT blocks, resulting in a model
with fewer than 5 million parameters. The model is trained using
self-distillation with a momentum encoder, and a student-teacher architecture
is also employed, where the teacher weights use vision transformers (ViTs)
from recent SOTA SSL models. The model is trained on the ImageNet1k dataset.
This research provides an approach for designing smaller, more efficient
neural network architectures that can perform SSL tasks comparably to heavy
models.
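The training recipe is DINO-style self-distillation: the teacher is a momentum (EMA) copy of the student, and the student is trained to match the teacher's sharpened output distribution across augmented views. The following is a minimal PyTorch sketch of that mechanism, not the authors' code: the placeholder backbone, temperatures, and momentum value are illustrative assumptions, and the paper additionally draws its teacher weights from pretrained ViTs.

```python
import copy
import torch
import torch.nn.functional as F

def ema_update(teacher, student, momentum=0.996):
    """Teacher weights become an exponential moving average of the student's."""
    with torch.no_grad():
        for t_p, s_p in zip(teacher.parameters(), student.parameters()):
            t_p.mul_(momentum).add_(s_p, alpha=1.0 - momentum)

def dino_loss(student_out, teacher_out, student_temp=0.1, teacher_temp=0.04):
    """Cross-entropy between sharpened teacher targets and student predictions."""
    teacher_probs = F.softmax(teacher_out.detach() / teacher_temp, dim=-1)
    student_logp = F.log_softmax(student_out / student_temp, dim=-1)
    return -(teacher_probs * student_logp).sum(dim=-1).mean()

# Placeholder student; the paper uses a MobileViT-based backbone (<5M params).
student = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 256))
teacher = copy.deepcopy(student)
for p in teacher.parameters():
    p.requires_grad_(False)

# Two augmented views of the same batch (random tensors stand in for images).
view1, view2 = torch.randn(8, 3, 32, 32), torch.randn(8, 3, 32, 32)
loss = dino_loss(student(view1), teacher(view2))
loss.backward()
ema_update(teacher, student)
```

The EMA update is what DINO calls the momentum encoder: the teacher never receives gradients and instead trails the student, which stabilizes the bootstrapped targets.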
Related papers
- On Learnable Parameters of Optimal and Suboptimal Deep Learning Models [2.889799048595314]
We study the structural and operational aspects of deep learning models.
Our research focuses on the nuances of learnable parameter (weight) statistics, distributions, node interactions, and visualization; a sketch of per-layer weight statistics appears below.
arXiv Detail & Related papers (2024-08-21T15:50:37Z)
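As a concrete illustration of the kind of per-layer weight statistics such a study examines, here is a short PyTorch sketch; the model choice (ResNet-18) and the particular statistics are our assumptions, not the paper's protocol.

```python
import torch
import torchvision.models as models

# Hypothetical example: summarize the distribution of learnable weights
# layer by layer for an off-the-shelf model.
model = models.resnet18(weights=None)

for name, param in model.named_parameters():
    if "weight" in name:
        w = param.detach().flatten()
        near_zero = (w.abs() < 1e-3).float().mean().item()
        print(f"{name}: mean={w.mean().item():.4f}, std={w.std().item():.4f}, "
              f"range=[{w.min().item():.4f}, {w.max().item():.4f}], "
              f"near-zero={near_zero:.2%}")
```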
- Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning.
Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z)
- Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques; a toy sketch of diffusion over weight vectors appears below.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
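To make the idea concrete, here is a toy DDPM-style training step over flattened weight vectors. This is a drastic simplification of our own: D2NWG itself recasts latent diffusion for weights, and every module, dimension, and schedule below is an assumption.

```python
import torch
import torch.nn as nn

dim = 1024  # length of a flattened weight vector (illustrative)
# A small MLP denoiser conditioned on the timestep via concatenation.
denoiser = nn.Sequential(nn.Linear(dim + 1, 512), nn.ReLU(), nn.Linear(512, dim))

def diffusion_loss(w0: torch.Tensor) -> torch.Tensor:
    """One denoising step: predict the noise that was added to clean weights."""
    t = torch.rand(w0.size(0), 1)              # random timestep in [0, 1)
    alpha = torch.cos(t * torch.pi / 2) ** 2   # simple cosine noise schedule
    noise = torch.randn_like(w0)
    wt = alpha.sqrt() * w0 + (1 - alpha).sqrt() * noise
    pred = denoiser(torch.cat([wt, t], dim=1))
    return ((pred - noise) ** 2).mean()

batch = torch.randn(32, dim)  # stand-in for a dataset of flattened checkpoints
loss = diffusion_loss(batch)
loss.backward()
```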
- Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective [64.04617968947697]
We introduce a novel data-model co-design perspective to promote superior weight sparsity.
Specifically, customized visual prompts are mounted to upgrade neural network sparsification in our proposed VPNs framework.
arXiv Detail & Related papers (2023-12-03T13:50:24Z)
- On the Steganographic Capacity of Selected Learning Models [1.0640226829362012]
We consider the question of the steganographic capacity of learning models.
For a wide range of models, we determine the number of low-order bits that can be overwritten.
Of the models tested, the steganographic capacity ranges from 7.04 KB for our LR experiments to 44.74 MB for InceptionV3; a sketch of the low-order-bit embedding idea appears below.
arXiv Detail & Related papers (2023-08-29T10:41:34Z)
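A minimal sketch of the general idea: payload bits overwrite the low-order mantissa bits of float32 weights, perturbing each value only slightly. The function name, the 8-bit depth, and the packing scheme are our assumptions, not the paper's exact procedure.

```python
import numpy as np

def embed_low_order_bits(weights: np.ndarray, payload_bits: np.ndarray,
                         n_bits: int = 8) -> np.ndarray:
    """Hide n_bits of payload in the low-order bits of each float32 weight."""
    raw = weights.astype(np.float32).view(np.uint32).copy()
    mask = np.uint32((1 << n_bits) - 1)
    # Pack n_bits of payload per weight (payload must supply n_bits per element).
    chunks = payload_bits.reshape(-1, n_bits)
    values = (chunks * (1 << np.arange(n_bits))).sum(axis=1).astype(np.uint32)
    raw = (raw & ~mask) | values[: raw.size]
    return raw.view(np.float32)

w = np.random.randn(16).astype(np.float32)
payload = np.random.randint(0, 2, size=16 * 8)
w_steg = embed_low_order_bits(w, payload)
print(np.max(np.abs(w - w_steg)))  # tiny: only low mantissa bits changed
```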
- NCTV: Neural Clamping Toolkit and Visualization for Neural Network Calibration [66.22668336495175]
Without proper calibration, neural network predictions will not gain trust from humans.
We introduce the Neural Clamping Toolkit, the first open-source framework designed to help developers employ state-of-the-art model-agnostic calibrated models; a sketch of a standard calibration baseline appears below.
arXiv Detail & Related papers (2022-11-29T15:03:05Z)
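For reference, here is temperature scaling (Guo et al., 2017), a standard post-hoc calibration baseline; it is a stand-in illustration of model calibration, not the Neural Clamping method itself.

```python
import torch
import torch.nn.functional as F

def fit_temperature(logits: torch.Tensor, labels: torch.Tensor) -> float:
    """Learn a single temperature T that minimizes NLL on held-out logits."""
    log_t = torch.zeros(1, requires_grad=True)  # optimize log T so T stays positive
    opt = torch.optim.LBFGS([log_t], lr=0.1, max_iter=100)

    def closure():
        opt.zero_grad()
        loss = F.cross_entropy(logits / log_t.exp(), labels)
        loss.backward()
        return loss

    opt.step(closure)
    return log_t.exp().item()

# Usage: held-out validation logits and labels (random stand-ins here).
logits = torch.randn(512, 10) * 3.0   # overconfident, uncalibrated logits
labels = torch.randint(0, 10, (512,))
T = fit_temperature(logits, labels)
print(f"fitted temperature: {T:.3f}")  # T > 1 softens overconfident logits
```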
- Effective Self-supervised Pre-training on Low-compute Networks without Distillation [6.530011859253459]
The reported performance of self-supervised learning on low-compute networks has trailed behind standard supervised pre-training by a large margin.
Most prior works attribute this poor performance to the capacity bottleneck of the low-compute networks.
We take a closer look at the detrimental factors causing these practical limitations, and at whether they are intrinsic to the self-supervised low-compute setting.
arXiv Detail & Related papers (2022-10-06T10:38:07Z)
- Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
arXiv Detail & Related papers (2022-09-26T17:59:58Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- The Self-Simplifying Machine: Exploiting the Structure of Piecewise Linear Neural Networks to Create Interpretable Models [0.0]
We introduce a novel methodology for simplifying and increasing the interpretability of Piecewise Linear Neural Networks for classification tasks.
Our methods include the use of a trained, deep network to produce a well-performing, single-hidden-layer network without further training.
On these methods, we conduct preliminary studies of model performance, as well as a case study on Wells Fargo's Home Lending dataset.
arXiv Detail & Related papers (2020-12-02T16:02:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.