LowDINO -- A Low Parameter Self Supervised Learning Model
- URL: http://arxiv.org/abs/2305.17791v1
- Date: Sun, 28 May 2023 18:34:59 GMT
- Title: LowDINO -- A Low Parameter Self Supervised Learning Model
- Authors: Sai Krishna Prathapaneni, Shvejan Shashank and Srikar Reddy K
- Abstract summary: This research aims to explore the possibility of designing a neural network architecture that allows small networks to adopt the properties of huge networks.
Previous studies have shown that using convolutional neural networks (ConvNets) can provide inherent inductive bias.
To reduce the number of parameters, attention mechanisms are incorporated through MobileViT blocks.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This research aims to explore the possibility of designing a neural network
architecture that allows small networks to adopt the properties of huge
networks, which have shown success in self-supervised learning (SSL), for all
the downstream tasks like image classification, segmentation, etc. Previous
studies have shown that using convolutional neural networks (ConvNets) can
provide inherent inductive bias, which is crucial for learning representations
in deep learning models. To reduce the number of parameters, attention
mechanisms are incorporated through MobileViT blocks, resulting in a model
with fewer than 5 million parameters. The model is trained using
self-distillation with a momentum encoder, and a student-teacher architecture
is also employed, where the teacher weights use vision transformers (ViTs)
from recent SOTA SSL models. The model is trained on the ImageNet1k dataset.
This research provides an approach for designing smaller, more efficient
neural network architectures that can perform SSL tasks comparably to heavy
models.
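The training recipe is DINO-style self-distillation: the teacher is a momentum (EMA) copy of the student, and the student is trained to match the teacher's sharpened output distribution across augmented views. The following is a minimal PyTorch sketch of that mechanism, not the authors' code: the placeholder backbone, temperatures, and momentum value are illustrative assumptions, and the paper additionally draws its teacher weights from pretrained ViTs.

```python
import copy
import torch
import torch.nn.functional as F

def ema_update(teacher, student, momentum=0.996):
    """Teacher weights become an exponential moving average of the student's."""
    with torch.no_grad():
        for t_p, s_p in zip(teacher.parameters(), student.parameters()):
            t_p.mul_(momentum).add_(s_p, alpha=1.0 - momentum)

def dino_loss(student_out, teacher_out, student_temp=0.1, teacher_temp=0.04):
    """Cross-entropy between sharpened teacher targets and student predictions."""
    teacher_probs = F.softmax(teacher_out.detach() / teacher_temp, dim=-1)
    student_logp = F.log_softmax(student_out / student_temp, dim=-1)
    return -(teacher_probs * student_logp).sum(dim=-1).mean()

# Placeholder student; the paper uses a MobileViT-based backbone (<5M params).
student = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 256))
teacher = copy.deepcopy(student)
for p in teacher.parameters():
    p.requires_grad_(False)

# Two augmented views of the same batch (random tensors stand in for images).
view1, view2 = torch.randn(8, 3, 32, 32), torch.randn(8, 3, 32, 32)
loss = dino_loss(student(view1), teacher(view2))
loss.backward()
ema_update(teacher, student)
```

The EMA update is what DINO calls the momentum encoder: the teacher never receives gradients and instead trails the student, which stabilizes the bootstrapped targets.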
Related papers
- On Learnable Parameters of Optimal and Suboptimal Deep Learning Models [2.889799048595314]
We study the structural and operational aspects of deep learning models.
Our research focuses on the nuances of learnable parameter (weight) statistics, distributions, node interactions, and visualization; a sketch of per-layer weight statistics appears below.
arXiv Detail & Related papers (2024-08-21T15:50:37Z)
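As a concrete illustration of the kind of per-layer weight statistics such a study examines, here is a short PyTorch sketch; the model choice (ResNet-18) and the particular statistics are our assumptions, not the paper's protocol.

```python
import torch
import torchvision.models as models

# Hypothetical example: summarize the distribution of learnable weights
# layer by layer for an off-the-shelf model.
model = models.resnet18(weights=None)

for name, param in model.named_parameters():
    if "weight" in name:
        w = param.detach().flatten()
        near_zero = (w.abs() < 1e-3).float().mean().item()
        print(f"{name}: mean={w.mean().item():.4f}, std={w.std().item():.4f}, "
              f"range=[{w.min().item():.4f}, {w.max().item():.4f}], "
              f"near-zero={near_zero:.2%}")
```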
- Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning.
Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z)
- Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques; a toy sketch of diffusion over weight vectors appears below.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
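To make the idea concrete, here is a toy DDPM-style training step over flattened weight vectors. This is a drastic simplification of our own: D2NWG itself recasts latent diffusion for weights, and every module, dimension, and schedule below is an assumption.

```python
import torch
import torch.nn as nn

dim = 1024  # length of a flattened weight vector (illustrative)
# A small MLP denoiser conditioned on the timestep via concatenation.
denoiser = nn.Sequential(nn.Linear(dim + 1, 512), nn.ReLU(), nn.Linear(512, dim))

def diffusion_loss(w0: torch.Tensor) -> torch.Tensor:
    """One denoising step: predict the noise that was added to clean weights."""
    t = torch.rand(w0.size(0), 1)              # random timestep in [0, 1)
    alpha = torch.cos(t * torch.pi / 2) ** 2   # simple cosine noise schedule
    noise = torch.randn_like(w0)
    wt = alpha.sqrt() * w0 + (1 - alpha).sqrt() * noise
    pred = denoiser(torch.cat([wt, t], dim=1))
    return ((pred - noise) ** 2).mean()

batch = torch.randn(32, dim)  # stand-in for a dataset of flattened checkpoints
loss = diffusion_loss(batch)
loss.backward()
```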
- Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective [64.04617968947697]
We introduce a novel data-model co-design perspective to promote superior weight sparsity.
Specifically, customized visual prompts are mounted to upgrade neural network sparsification in our proposed VPNs framework.
arXiv Detail & Related papers (2023-12-03T13:50:24Z)
- On the Steganographic Capacity of Selected Learning Models [1.0640226829362012]
We consider the question of the steganographic capacity of learning models.
For a wide range of models, we determine the number of low-order bits that can be overwritten.
Of the models tested, the steganographic capacity ranges from 7.04 KB for our LR experiments to 44.74 MB for InceptionV3; a sketch of the low-order-bit embedding idea appears below.
arXiv Detail & Related papers (2023-08-29T10:41:34Z)
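A minimal sketch of the general idea: payload bits overwrite the low-order mantissa bits of float32 weights, perturbing each value only slightly. The function name, the 8-bit depth, and the packing scheme are our assumptions, not the paper's exact procedure.

```python
import numpy as np

def embed_low_order_bits(weights: np.ndarray, payload_bits: np.ndarray,
                         n_bits: int = 8) -> np.ndarray:
    """Hide n_bits of payload in the low-order bits of each float32 weight."""
    raw = weights.astype(np.float32).view(np.uint32).copy()
    mask = np.uint32((1 << n_bits) - 1)
    # Pack n_bits of payload per weight (payload must supply n_bits per element).
    chunks = payload_bits.reshape(-1, n_bits)
    values = (chunks * (1 << np.arange(n_bits))).sum(axis=1).astype(np.uint32)
    raw = (raw & ~mask) | values[: raw.size]
    return raw.view(np.float32)

w = np.random.randn(16).astype(np.float32)
payload = np.random.randint(0, 2, size=16 * 8)
w_steg = embed_low_order_bits(w, payload)
print(np.max(np.abs(w - w_steg)))  # tiny: only low mantissa bits changed
```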
- NCTV: Neural Clamping Toolkit and Visualization for Neural Network Calibration [66.22668336495175]
Without proper calibration, neural network predictions will not gain trust from humans.
We introduce the Neural Clamping Toolkit, the first open-source framework designed to help developers employ state-of-the-art model-agnostic calibrated models; a sketch of a standard calibration baseline appears below.
arXiv Detail & Related papers (2022-11-29T15:03:05Z)
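For reference, here is temperature scaling (Guo et al., 2017), a standard post-hoc calibration baseline; it is a stand-in illustration of model calibration, not the Neural Clamping method itself.

```python
import torch
import torch.nn.functional as F

def fit_temperature(logits: torch.Tensor, labels: torch.Tensor) -> float:
    """Learn a single temperature T that minimizes NLL on held-out logits."""
    log_t = torch.zeros(1, requires_grad=True)  # optimize log T so T stays positive
    opt = torch.optim.LBFGS([log_t], lr=0.1, max_iter=100)

    def closure():
        opt.zero_grad()
        loss = F.cross_entropy(logits / log_t.exp(), labels)
        loss.backward()
        return loss

    opt.step(closure)
    return log_t.exp().item()

# Usage: held-out validation logits and labels (random stand-ins here).
logits = torch.randn(512, 10) * 3.0   # overconfident, uncalibrated logits
labels = torch.randint(0, 10, (512,))
T = fit_temperature(logits, labels)
print(f"fitted temperature: {T:.3f}")  # T > 1 softens overconfident logits
```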
- Effective Self-supervised Pre-training on Low-compute Networks without Distillation [6.530011859253459]
The reported performance of self-supervised learning on low-compute networks has trailed behind standard supervised pre-training by a large margin.
Most prior works attribute this poor performance to the capacity bottleneck of the low-compute networks.
We take a closer look at the detrimental factors causing these practical limitations, and at whether they are intrinsic to the self-supervised low-compute setting.
arXiv Detail & Related papers (2022-10-06T10:38:07Z)
- Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
arXiv Detail & Related papers (2022-09-26T17:59:58Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- The Self-Simplifying Machine: Exploiting the Structure of Piecewise Linear Neural Networks to Create Interpretable Models [0.0]
We introduce a novel methodology for simplifying and increasing the interpretability of Piecewise Linear Neural Networks for classification tasks.
Our methods include the use of a trained, deep network to produce a well-performing, single-hidden-layer network without further training.
On these methods, we conduct preliminary studies of model performance, as well as a case study on Wells Fargo's Home Lending dataset.
arXiv Detail & Related papers (2020-12-02T16:02:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.