Improving Pre-Trained Weights Through Meta-Heuristics Fine-Tuning
- URL: http://arxiv.org/abs/2212.09447v1
- Date: Mon, 19 Dec 2022 13:40:26 GMT
- Title: Improving Pre-Trained Weights Through Meta-Heuristics Fine-Tuning
- Authors: Gustavo H. de Rosa, Mateus Roder, João Paulo Papa and Claudio F. G. dos Santos
- Abstract summary: We propose to use meta-heuristic techniques to fine-tune pre-trained weights.
Experimental results show the capacity of nature-inspired algorithms to explore the neighborhood of pre-trained weights.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Machine Learning algorithms have been extensively researched throughout the
last decade, leading to unprecedented advances in a broad range of
applications, such as image classification and reconstruction, object
recognition, and text categorization. Nonetheless, most Machine Learning
algorithms are trained via derivative-based optimizers, such as Stochastic
Gradient Descent, which can become trapped in local optima and prevent the
models from reaching their full performance. A bio-inspired alternative to
traditional optimization techniques, known as meta-heuristics, has received
significant attention due to its simplicity and ability to avoid entrapment in
local optima. In this work, we propose to use meta-heuristic techniques to
fine-tune pre-trained weights, exploring additional regions of the search
space and improving their effectiveness. The experimental evaluation comprises
two classification tasks (image and text) assessed on four datasets from the
literature. Experimental results show the capacity of nature-inspired
algorithms to explore the neighborhood of pre-trained weights, achieving
better results than their pre-trained counterparts. Additionally, a thorough
analysis of distinct architectures, such as Multi-Layer Perceptrons and
Recurrent Neural Networks, attempts to visualize and provide more precise
insights into the most critical weights to be fine-tuned in the learning
process.
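To make the idea concrete, the sketch below shows one possible derivative-free fine-tuning loop that perturbs a flattened pre-trained weight vector inside a small neighborhood and keeps improvements. It is a minimal illustration in the spirit of the paper, not the authors' implementation; the `validation_loss` callable and all hyper-parameters are assumptions made for the example.

```python
# Minimal sketch of meta-heuristic fine-tuning around pre-trained weights.
# Not the authors' exact algorithm: it only illustrates a derivative-free,
# population-based search confined to the neighborhood of a pre-trained
# solution. `validation_loss` is a hypothetical callable that maps a flat
# weight vector to a scalar loss.
import numpy as np

def finetune_weights(pretrained, validation_loss, n_agents=10, n_iters=50,
                     radius=0.05, seed=0):
    rng = np.random.default_rng(seed)
    # Initialize a small population inside a ball around the pre-trained weights.
    population = pretrained + radius * rng.standard_normal((n_agents, pretrained.size))
    fitness = np.array([validation_loss(w) for w in population])
    best = population[fitness.argmin()].copy()
    best_fit = fitness.min()

    for _ in range(n_iters):
        for i in range(n_agents):
            # Move toward the best agent plus a small random step, keeping the
            # search confined to the neighborhood of the pre-trained weights.
            candidate = (population[i]
                         + rng.uniform(0, 1) * (best - population[i])
                         + 0.1 * radius * rng.standard_normal(pretrained.size))
            cand_fit = validation_loss(candidate)
            if cand_fit < fitness[i]:
                population[i], fitness[i] = candidate, cand_fit
                if cand_fit < best_fit:
                    best, best_fit = candidate.copy(), cand_fit
    return best, best_fit
```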
Related papers
- No Wrong Turns: The Simple Geometry Of Neural Networks Optimization
Paths [12.068608358926317]
First-order optimization algorithms are known to efficiently locate favorable minima in deep neural networks.
We focus on the fundamental geometric properties of quantities sampled along two key optimization paths.
Our findings suggest that not only do optimization trajectories never encounter significant obstacles, but they also maintain stable dynamics during the majority of training.
arXiv Detail & Related papers (2023-06-20T22:10:40Z) - Learning Large-scale Neural Fields via Context Pruned Meta-Learning [60.93679437452872]
We introduce an efficient optimization-based meta-learning technique for large-scale neural field training.
We show how gradient re-scaling at meta-test time allows the learning of extremely high-quality neural fields.
Our framework is model-agnostic, intuitive, straightforward to implement, and shows significant reconstruction improvements for a wide range of signals.
arXiv Detail & Related papers (2023-02-01T17:32:16Z) - Deep Learning Training Procedure Augmentations [0.0]
Recent advances in Deep Learning have greatly improved performance on various tasks such as object detection, image segmentation, and sentiment analysis.
While this has led to great results, many of which have real-world applications, other relevant aspects of deep learning have remained neglected and unknown.
We will present several novel deep learning training techniques which, while capable of offering significant performance gains, also reveal several interesting analysis results regarding convergence speed, optimization landscape, and adversarial robustness.
arXiv Detail & Related papers (2022-11-25T22:31:11Z) - Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network(NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics, based on minimizing the population loss, that are more suitable for active learning than the one used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z) - Benchmarking the Accuracy and Robustness of Feedback Alignment
Algorithms [1.2183405753834562]
Backpropagation is the default algorithm for training deep neural networks due to its simplicity, efficiency and high convergence rate.
In recent years, more biologically plausible learning methods have been proposed.
BioTorch is a software framework to create, train, and benchmark biologically motivated neural networks.
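For context, feedback alignment replaces the transposed forward weights in the backward pass with a fixed random matrix. The sketch below is a minimal NumPy illustration of that idea for a single hidden layer; it does not use or reflect BioTorch's API, and all sizes and learning rates are assumptions for the example.

```python
# Illustrative sketch of feedback alignment for one hidden layer (not BioTorch):
# the backward pass uses a fixed random matrix B instead of W2.T.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out, lr = 20, 64, 3, 0.1
W1 = 0.1 * rng.standard_normal((n_in, n_hid))
W2 = 0.1 * rng.standard_normal((n_hid, n_out))
B  = 0.1 * rng.standard_normal((n_out, n_hid))   # fixed random feedback weights

def step(x, y_onehot):
    global W1, W2
    h = np.maximum(0.0, x @ W1)                  # ReLU hidden layer
    logits = h @ W2
    p = np.exp(logits - logits.max(1, keepdims=True))
    p /= p.sum(1, keepdims=True)
    e = p - y_onehot                             # softmax cross-entropy error
    dW2 = h.T @ e
    dh  = (e @ B) * (h > 0)                      # feedback alignment: B, not W2.T
    dW1 = x.T @ dh
    W1 -= lr * dW1 / len(x)
    W2 -= lr * dW2 / len(x)
```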
arXiv Detail & Related papers (2021-08-30T18:02:55Z) - RANK-NOSH: Efficient Predictor-Based Architecture Search via Non-Uniform
Successive Halving [74.61723678821049]
We propose NOn-uniform Successive Halving (NOSH), a hierarchical scheduling algorithm that terminates the training of underperforming architectures early to avoid wasting budget.
We formulate predictor-based architecture search as learning to rank with pairwise comparisons.
The resulting method, RANK-NOSH, reduces the search budget by 5x while achieving competitive or even better performance than previous state-of-the-art predictor-based methods on various spaces and datasets.
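For reference, the sketch below shows a plain (uniform) successive-halving loop, which RANK-NOSH generalizes with a non-uniform schedule and a pairwise-ranking predictor; `train_and_score` is a hypothetical callable assumed for the example.

```python
# Minimal uniform successive halving for illustration only; RANK-NOSH's
# non-uniform schedule and ranking predictor are not reproduced here.
# `train_and_score(arch, epochs)` is a hypothetical callable returning a
# validation score after training `arch` for `epochs` epochs.
def successive_halving(candidates, train_and_score, min_epochs=1, eta=2, rounds=3):
    epochs = min_epochs
    survivors = list(candidates)
    for _ in range(rounds):
        scored = sorted(survivors,
                        key=lambda arch: train_and_score(arch, epochs),
                        reverse=True)
        # Keep only the top 1/eta architectures and give them a larger budget,
        # so under-performing architectures are stopped early.
        survivors = scored[:max(1, len(scored) // eta)]
        epochs *= eta
    return survivors[0]
```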
arXiv Detail & Related papers (2021-08-18T07:45:21Z) - Improving Music Performance Assessment with Contrastive Learning [78.8942067357231]
This study investigates contrastive learning as a potential method to improve existing MPA systems.
We introduce a weighted contrastive loss suitable for regression tasks applied to a convolutional neural network.
Our results show that contrastive-learning-based methods are able to match and exceed SOTA performance for MPA regression tasks.
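As an illustration only, one possible form of a label-distance-weighted contrastive loss for regression is sketched below; the exact loss used in the paper may differ, and the weighting scheme here is an assumption.

```python
# One possible reading of a weighted contrastive loss for regression: pairs of
# embeddings are pulled together or pushed apart with a weight derived from the
# distance between their scalar labels. Illustrative sketch, not the paper's loss.
import torch

def weighted_contrastive_loss(embeddings, labels, margin=1.0, scale=1.0):
    # embeddings: (N, D); labels: (N,) continuous regression targets.
    dist = torch.cdist(embeddings, embeddings)              # pairwise embedding distances
    label_gap = (labels[:, None] - labels[None, :]).abs()   # pairwise label differences
    sim_weight = torch.exp(-scale * label_gap)              # close labels act as positives
    pull = sim_weight * dist.pow(2)                         # pull similar-label pairs together
    push = (1 - sim_weight) * torch.clamp(margin - dist, min=0).pow(2)
    mask = ~torch.eye(len(labels), dtype=torch.bool)        # ignore self-pairs
    return (pull + push)[mask].mean()
```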
arXiv Detail & Related papers (2021-08-03T19:24:25Z) - How Powerful are Performance Predictors in Neural Architecture Search? [43.86743225322636]
We give the first large-scale study of performance predictors by analyzing 31 techniques.
We show that certain families of predictors can be combined to achieve even better predictive power.
arXiv Detail & Related papers (2021-04-02T17:57:16Z) - Learning Neural Network Subspaces [74.44457651546728]
Recent observations have advanced our understanding of the neural network optimization landscape.
With a similar computational cost as training one model, we learn lines, curves, and simplexes of high-accuracy neural networks.
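A minimal sketch of the line case is given below, assuming a tiny two-layer MLP with explicit weight tensors: two endpoint parameter sets are trained so that a randomly sampled convex combination of them fits the data at every step. It is not the paper's implementation; the architecture and data are placeholders.

```python
# Sketch of learning a line of networks (assumed toy MLP, not the paper's code):
# gradients flow to both endpoints through the interpolated weights.
import torch

torch.manual_seed(0)
d_in, d_hid, d_out = 10, 32, 2

def make_endpoint():
    # One endpoint of the line: a full set of MLP weights.
    return [(0.1 * torch.randn(d_in, d_hid)).requires_grad_(),
            (0.1 * torch.randn(d_hid, d_out)).requires_grad_()]

end_a, end_b = make_endpoint(), make_endpoint()
opt = torch.optim.SGD(end_a + end_b, lr=0.1)

x, y = torch.randn(256, d_in), torch.randint(0, d_out, (256,))
for _ in range(100):
    alpha = torch.rand(())                 # random point on the line each step
    w1 = (1 - alpha) * end_a[0] + alpha * end_b[0]
    w2 = (1 - alpha) * end_a[1] + alpha * end_b[1]
    logits = torch.relu(x @ w1) @ w2
    loss = torch.nn.functional.cross_entropy(logits, y)
    opt.zero_grad(); loss.backward(); opt.step()
```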
arXiv Detail & Related papers (2021-02-20T23:26:58Z) - Uncovering Coresets for Classification With Multi-Objective Evolutionary
Algorithms [0.8057006406834467]
A coreset is a subset of the training set, using which a machine learning algorithm obtains performance similar to what it would deliver if trained on the whole original dataset.
A novel approach is presented: candidate coresets are iteratively optimized, adding and removing samples.
A multi-objective evolutionary algorithm is used to minimize simultaneously the number of points in the set and the classification error.
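The toy sketch below illustrates the general recipe, not the paper's exact algorithm: a coreset is a boolean mask over the training set, mutation adds and removes samples, and only Pareto-nondominated masks with respect to (coreset size, validation error) survive; `eval_error` is a hypothetical callable assumed for the example.

```python
# Toy multi-objective evolutionary coreset search (illustrative sketch only).
# `eval_error(mask)` is a hypothetical callable that trains a model on the
# selected samples and returns its validation error.
import numpy as np

def is_dominated(a, b):
    # b dominates a if b is no worse in both objectives and better in one.
    return b[0] <= a[0] and b[1] <= a[1] and (b[0] < a[0] or b[1] < a[1])

def evolve_coresets(n_train, eval_error, pop_size=20, generations=50, seed=0):
    rng = np.random.default_rng(seed)
    population = [rng.random(n_train) < 0.1 for _ in range(pop_size)]  # initial masks
    for _ in range(generations):
        # Mutate every mask by flipping a few bits (adding/removing samples).
        children = []
        for mask in population:
            child = mask.copy()
            flips = rng.integers(0, n_train, size=3)
            child[flips] = ~child[flips]
            children.append(child)
        candidates = population + children
        objs = [(int(m.sum()), eval_error(m)) for m in candidates]
        # Keep only the Pareto-nondominated masks for the next generation.
        keep = [i for i, a in enumerate(objs)
                if not any(is_dominated(a, b) for j, b in enumerate(objs) if j != i)]
        population = [candidates[i] for i in keep][:pop_size]
    return population
```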
arXiv Detail & Related papers (2020-02-20T09:59:56Z) - Large Batch Training Does Not Need Warmup [111.07680619360528]
Training deep neural networks using a large batch size has shown promising results and benefits many real-world applications.
In this paper, we propose a novel Complete Layer-wise Adaptive Rate Scaling (CLARS) algorithm for large-batch training.
Based on our analysis, we bridge the gap and illustrate the theoretical insights for three popular large-batch training techniques.
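For intuition, the sketch below shows a generic layer-wise adaptive rate scaling step in the spirit of LARS; CLARS's specific warmup-free rule and theoretical analysis are not reproduced, and the coefficients are placeholder values.

```python
# Generic layer-wise adaptive rate scaling step (LARS-style), shown only to
# illustrate the idea behind layer-wise large-batch training; not CLARS itself.
import torch

def layerwise_scaled_step(params, base_lr=1.0, trust_coef=0.001, eps=1e-8):
    with torch.no_grad():
        for p in params:
            if p.grad is None:
                continue
            w_norm = p.norm()
            g_norm = p.grad.norm()
            # Scale each layer's learning rate by ||w|| / ||grad||, so layers
            # with small gradients relative to their weights take larger steps.
            trust = trust_coef * w_norm / (g_norm + eps)
            p.add_(p.grad, alpha=-(base_lr * trust).item())
```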
arXiv Detail & Related papers (2020-02-04T23:03:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.