Exploring possible vector systems for faster training of neural networks with preconfigured latent spaces
- URL: http://arxiv.org/abs/2512.07509v2
- Date: Wed, 10 Dec 2025 13:54:33 GMT
- Title: Exploring possible vector systems for faster training of neural networks with preconfigured latent spaces
- Authors: Nikita Gabdullin
- Abstract summary: A_n root system vectors can be used as targets for latent space configurations (LSC) to ensure the desired LS structure. This paper provides a more general overview of possible vector systems for NN training along with their properties and methods for vector system construction. It is also shown that using the minimum number of LS dimensions for a specific number of classes results in faster convergence.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The overall neural network (NN) performance is closely related to the properties of its embedding distribution in latent space (LS). It has recently been shown that predefined vector systems, specifically A_n root system vectors, can be used as targets for latent space configurations (LSC) to ensure the desired LS structure. One of the main advantages of LSC is the possibility of training classifier NNs without classification layers, which facilitates training NNs on datasets with extremely large numbers of classes. This paper provides a more general overview of possible vector systems for NN training along with their properties and methods for vector system construction. These systems are used to configure the LS of encoders and visual transformers to significantly speed up LSC training on ImageNet-1K and on datasets with 50k-600k classes. It is also shown that using the minimum number of LS dimensions for a specific number of classes results in faster convergence. The latter has potential advantages for reducing the size of vector databases used to store NN embeddings.
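As a rough illustration of the idea, not the authors' implementation, the sketch below assigns each class a fixed, predefined target vector in latent space and trains an encoder to pull embeddings toward their class targets with a cosine loss, so no trainable classification layer is needed. The simplex-style target construction and names such as `simplex_targets` are illustrative assumptions rather than the paper's A_n recipe; they only show why C classes can be separated in C - 1 latent dimensions.

```python
# Hypothetical LSC-style training sketch: fixed class targets, no classification layer.
import torch
import torch.nn as nn
import torch.nn.functional as F

def simplex_targets(num_classes: int) -> torch.Tensor:
    """Fixed unit-norm class targets: vertices of a regular simplex.

    For C classes the vertices span a (C - 1)-dimensional subspace, matching the
    observation that C - 1 latent dimensions suffice. (Illustrative choice only.)
    """
    eye = torch.eye(num_classes)
    centered = eye - eye.mean(dim=0, keepdim=True)   # remove the centroid
    return F.normalize(centered, dim=1)              # unit-norm target vectors

class Encoder(nn.Module):
    def __init__(self, in_dim: int, latent_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
        )
    def forward(self, x):
        return F.normalize(self.net(x), dim=1)       # embeddings on the unit sphere

num_classes, in_dim = 10, 784
targets = simplex_targets(num_classes)               # fixed, never trained
encoder = Encoder(in_dim, latent_dim=num_classes)
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)

x = torch.randn(32, in_dim)                          # toy batch
y = torch.randint(0, num_classes, (32,))
z = encoder(x)
loss = (1.0 - F.cosine_similarity(z, targets[y])).mean()  # pull z toward its class target
opt.zero_grad()
loss.backward()
opt.step()
```

Because the targets are fixed, the memory and compute cost of the output side does not grow with the number of classes, which is what makes extremely large class counts tractable in this setup.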
Related papers
- ReLaX-Net: Reusing Layers for Parameter-Efficient Physical Neural Networks [0.0]
We propose the Reuse of Layers for eXpanding a Neural Network (ReLaX-Net) architecture. We use a simple layer-by-layer time-multiplexing scheme to increase the effective network depth and efficiently use the number of parameters. Our results show that ReLaX-Net improves computational performance with only minor modifications to a conventional PNN.
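One way to picture the layer-by-layer time-multiplexing above is weight sharing: the same parameterized layer is applied several times in sequence, so effective depth grows without adding parameters. The sketch below is a generic reading of that idea, not the ReLaX-Net implementation (the physical-hardware aspects are omitted).

```python
# Hypothetical weight-sharing sketch of the "reuse a layer to gain depth" idea.
import torch
import torch.nn as nn

class ReusedLayerNet(nn.Module):
    def __init__(self, dim: int, reuse_steps: int = 4):
        super().__init__()
        self.shared = nn.Linear(dim, dim)   # one set of parameters...
        self.reuse_steps = reuse_steps      # ...applied several times in sequence
        self.act = nn.Tanh()

    def forward(self, x):
        for _ in range(self.reuse_steps):   # effective depth = reuse_steps
            x = self.act(self.shared(x))
        return x

net = ReusedLayerNet(dim=64, reuse_steps=4)
print(sum(p.numel() for p in net.parameters()))  # parameter count of a single layer
```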
arXiv Detail & Related papers (2025-10-28T07:25:41Z) - Using predefined vector systems as latent space configuration for neural network supervised training on data with arbitrarily large number of classes [0.0]
Supervised learning (SL) methods are indispensable for neural network (NN) training used to perform classification tasks. We propose a methodology that allows one to train the same NN architecture regardless of the number of classes.
arXiv Detail & Related papers (2025-10-05T08:28:37Z) - Unveiling the Power of Sparse Neural Networks for Feature Selection [60.50319755984697]
Sparse Neural Networks (SNNs) have emerged as powerful tools for efficient feature selection.
We show that feature selection with SNNs trained with dynamic sparse training (DST) algorithms can achieve, on average, more than 50% memory and 55% FLOPs reduction.
arXiv Detail & Related papers (2024-08-08T16:48:33Z) - Latent space configuration for improved generalization in supervised autoencoder neural networks [0.0]
We propose two methods for obtaining LS with desired topology, called LS configuration. Knowing the LS configuration allows one to define a similarity measure in LS to predict labels or estimate similarity for multiple inputs. We show that SAE trained for clothes texture classification using the proposed method generalizes well to unseen data from LIP, Market1501, and WildTrack datasets without fine-tuning.
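One way to read "a similarity measure in LS to predict labels" is nearest-target classification: once class targets are fixed in the configured latent space, the predicted label is simply the target with the highest cosine similarity to the embedding. The snippet below is an illustrative sketch of that inference step; the `targets` tensor mirrors the earlier training sketch and is an assumption, not the paper's code.

```python
# Hypothetical inference sketch: labels and similarities from a configured latent space.
import torch
import torch.nn.functional as F

def predict_labels(embeddings: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    """embeddings: (B, d), targets: (C, d); returns (B,) predicted class indices."""
    emb = F.normalize(embeddings, dim=1)
    tgt = F.normalize(targets, dim=1)
    sims = emb @ tgt.t()                  # cosine-similarity matrix (B, C)
    return sims.argmax(dim=1)

def pairwise_similarity(emb_a: torch.Tensor, emb_b: torch.Tensor) -> torch.Tensor:
    """Cosine similarity between two sets of inputs, e.g. for retrieval or re-identification."""
    return F.normalize(emb_a, dim=1) @ F.normalize(emb_b, dim=1).t()
```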
arXiv Detail & Related papers (2024-02-13T13:25:51Z) - Unifying Synergies between Self-supervised Learning and Dynamic Computation [53.66628188936682]
We present a novel perspective on the interplay between SSL and DC paradigms.
We show that it is feasible to simultaneously learn a dense and gated sub-network from scratch in a SSL setting.
The co-evolution during pre-training of both dense and gated encoder offers a good accuracy-efficiency trade-off.
arXiv Detail & Related papers (2023-01-22T17:12:58Z) - Training Spiking Neural Networks with Local Tandem Learning [96.32026780517097]
Spiking neural networks (SNNs) are shown to be more biologically plausible and energy efficient than their predecessors.
In this paper, we put forward a generalized learning rule, termed Local Tandem Learning (LTL)
We demonstrate rapid network convergence within five training epochs on the CIFAR-10 dataset while having low computational complexity.
arXiv Detail & Related papers (2022-10-10T10:05:00Z) - Adaptive Machine Learning for Time-Varying Systems: Low Dimensional Latent Space Tuning [91.3755431537592]
We present a recently developed method of adaptive machine learning for time-varying systems.
Our approach is to map very high (N>100k) dimensional inputs into the low dimensional (N~2) latent space at the output of the encoder section of an encoder-decoder CNN.
This method allows us to learn correlations within the data and to track their evolution in real time based on feedback, without interruptions.
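A minimal sketch of the architecture described above: high-dimensional inputs are compressed to a very low-dimensional latent space at the encoder output of an encoder-decoder network. Dense layers stand in for the CNN and all layer sizes are illustrative assumptions.

```python
# Hypothetical encoder-decoder sketch with a very low-dimensional (here 2-D) latent space.
import torch
import torch.nn as nn

class EncoderDecoder(nn.Module):
    def __init__(self, in_dim: int, latent_dim: int = 2):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 512), nn.ReLU(),
            nn.Linear(512, latent_dim),          # low-dimensional latent output
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 512), nn.ReLU(),
            nn.Linear(512, in_dim),
        )

    def forward(self, x):
        z = self.encoder(x)                      # (B, 2) latent coordinates to track or tune
        return self.decoder(z), z

model = EncoderDecoder(in_dim=1_000, latent_dim=2)   # toy input size
x = torch.randn(4, 1_000)
x_hat, z = model(x)                                  # reconstruction and 2-D latent codes
```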
arXiv Detail & Related papers (2021-07-13T16:05:28Z) - Generalized Learning Vector Quantization for Classification in Randomized Neural Networks and Hyperdimensional Computing [4.4886210896619945]
We propose a modified RVFL network that avoids computationally expensive matrix operations during training.
The proposed approach achieved state-of-the-art accuracy on a collection of datasets from the UCI Machine Learning Repository.
arXiv Detail & Related papers (2021-06-17T21:17:17Z) - OSLNet: Deep Small-Sample Classification with an Orthogonal Softmax Layer [77.90012156266324]
This paper aims to find a subspace of neural networks that can facilitate a large decision margin.
We propose the Orthogonal Softmax Layer (OSL), which makes the weight vectors in the classification layer remain orthogonal during both the training and test processes.
Experimental results demonstrate that the proposed OSL has better performance than the methods used for comparison on four small-sample benchmark datasets.
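A rough way to picture an orthogonal classification layer is to constrain the classifier weight vectors so they stay orthogonal throughout training and testing. The sketch below keeps a fixed orthonormal weight matrix as a non-trainable buffer; this is one simple reading of the idea and is not the OSL construction from the paper.

```python
# Hypothetical sketch: a classification layer whose weight vectors are orthonormal
# and kept fixed, so they remain orthogonal during training and testing.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FixedOrthogonalClassifier(nn.Module):
    def __init__(self, feat_dim: int, num_classes: int):
        super().__init__()
        assert num_classes <= feat_dim, "need feat_dim >= num_classes for orthogonality"
        q, _ = torch.linalg.qr(torch.randn(feat_dim, num_classes))
        self.register_buffer("weight", q.t())     # (C, feat_dim), orthonormal rows, not trained

    def forward(self, features):
        return features @ self.weight.t()         # logits against orthogonal class vectors

clf = FixedOrthogonalClassifier(feat_dim=128, num_classes=10)
logits = clf(torch.randn(8, 128))
loss = F.cross_entropy(logits, torch.randint(0, 10, (8,)))
```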
arXiv Detail & Related papers (2020-04-20T02:41:01Z) - Large-Scale Gradient-Free Deep Learning with Recursive Local Representation Alignment [84.57874289554839]
Training deep neural networks on large-scale datasets requires significant hardware resources.
Backpropagation, the workhorse for training these networks, is an inherently sequential process that is difficult to parallelize.
We propose a neuro-biologically-plausible alternative to backprop that can be used to train deep networks.
arXiv Detail & Related papers (2020-02-10T16:20:02Z)