Improving Pre-Trained Weights Through Meta-Heuristics Fine-Tuning
- URL: http://arxiv.org/abs/2212.09447v1
- Date: Mon, 19 Dec 2022 13:40:26 GMT
- Title: Improving Pre-Trained Weights Through Meta-Heuristics Fine-Tuning
- Authors: Gustavo H. de Rosa, Mateus Roder, João Paulo Papa and Claudio F. G. dos Santos
- Abstract summary: We propose to use meta-heuristic techniques to fine-tune pre-trained weights.
Experimental results show the capacity of nature-inspired algorithms to explore the neighborhood of pre-trained weights.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Machine Learning algorithms have been extensively researched throughout the
last decade, leading to unprecedented advances in a broad range of
applications, such as image classification and reconstruction, object
recognition, and text categorization. Nonetheless, most Machine Learning
algorithms are trained via derivative-based optimizers, such as Stochastic
Gradient Descent, which can become trapped in local optima and prevent the
models from reaching their full performance. A bio-inspired alternative to
traditional optimization techniques, known as meta-heuristics, has received
significant attention due to its simplicity and ability to avoid entrapment in
local optima. In this work, we propose to use meta-heuristic techniques to
fine-tune pre-trained weights, exploring additional regions of the search
space and improving their effectiveness. The experimental evaluation comprises
two classification tasks (image and text) assessed on four datasets from the
literature. Experimental results show the capacity of nature-inspired
algorithms to explore the neighborhood of pre-trained weights, achieving
better results than their pre-trained counterparts. Additionally, a thorough
analysis of distinct architectures, such as Multi-Layer Perceptrons and
Recurrent Neural Networks, attempts to visualize and provide more precise
insights into the most critical weights to be fine-tuned in the learning
process.
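To make the idea concrete, the sketch below shows one possible derivative-free fine-tuning loop that perturbs a flattened pre-trained weight vector inside a small neighborhood and keeps improvements. It is a minimal illustration in the spirit of the paper, not the authors' implementation; the `validation_loss` callable and all hyper-parameters are assumptions made for the example.

```python
# Minimal sketch of meta-heuristic fine-tuning around pre-trained weights.
# Not the authors' exact algorithm: it only illustrates a derivative-free,
# population-based search confined to the neighborhood of a pre-trained
# solution. `validation_loss` is a hypothetical callable that maps a flat
# weight vector to a scalar loss.
import numpy as np

def finetune_weights(pretrained, validation_loss, n_agents=10, n_iters=50,
                     radius=0.05, seed=0):
    rng = np.random.default_rng(seed)
    # Initialize a small population inside a ball around the pre-trained weights.
    population = pretrained + radius * rng.standard_normal((n_agents, pretrained.size))
    fitness = np.array([validation_loss(w) for w in population])
    best = population[fitness.argmin()].copy()
    best_fit = fitness.min()

    for _ in range(n_iters):
        for i in range(n_agents):
            # Move toward the best agent plus a small random step, keeping the
            # search confined to the neighborhood of the pre-trained weights.
            candidate = (population[i]
                         + rng.uniform(0, 1) * (best - population[i])
                         + 0.1 * radius * rng.standard_normal(pretrained.size))
            cand_fit = validation_loss(candidate)
            if cand_fit < fitness[i]:
                population[i], fitness[i] = candidate, cand_fit
                if cand_fit < best_fit:
                    best, best_fit = candidate.copy(), cand_fit
    return best, best_fit
```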
Related papers
- No Wrong Turns: The Simple Geometry Of Neural Networks Optimization
Paths [12.068608358926317]
First-order optimization algorithms are known to efficiently locate favorable minima in deep neural networks.
We focus on the fundamental geometric properties of quantities sampled along two key optimization paths.
Our findings suggest that not only do optimization trajectories never encounter significant obstacles, but they also maintain stable dynamics during the majority of training.
arXiv Detail & Related papers (2023-06-20T22:10:40Z) - Learning Large-scale Neural Fields via Context Pruned Meta-Learning [60.93679437452872]
We introduce an efficient optimization-based meta-learning technique for large-scale neural field training.
We show how gradient re-scaling at meta-test time allows the learning of extremely high-quality neural fields.
Our framework is model-agnostic, intuitive, straightforward to implement, and shows significant reconstruction improvements for a wide range of signals.
arXiv Detail & Related papers (2023-02-01T17:32:16Z) - Deep Learning Training Procedure Augmentations [0.0]
Recent advances in Deep Learning have greatly improved performance on various tasks such as object detection, image segmentation, and sentiment analysis.
While this has led to great results, many of which have real-world applications, other relevant aspects of deep learning have remained neglected and unknown.
We will present several novel deep learning training techniques which, while capable of offering significant performance gains, also reveal several interesting analysis results regarding convergence speed, optimization landscape, and adversarial robustness.
arXiv Detail & Related papers (2022-11-25T22:31:11Z) - Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network(NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics, based on minimizing the population loss, that are more suitable for active learning than the one used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z) - Benchmarking the Accuracy and Robustness of Feedback Alignment
Algorithms [1.2183405753834562]
Backpropagation is the default algorithm for training deep neural networks due to its simplicity, efficiency and high convergence rate.
In recent years, more biologically plausible learning methods have been proposed.
BioTorch is a software framework to create, train, and benchmark biologically motivated neural networks.
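For context, feedback alignment replaces the transposed forward weights in the backward pass with a fixed random matrix. The sketch below is a minimal NumPy illustration of that idea for a single hidden layer; it does not use or reflect BioTorch's API, and all sizes and learning rates are assumptions for the example.

```python
# Illustrative sketch of feedback alignment for one hidden layer (not BioTorch):
# the backward pass uses a fixed random matrix B instead of W2.T.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out, lr = 20, 64, 3, 0.1
W1 = 0.1 * rng.standard_normal((n_in, n_hid))
W2 = 0.1 * rng.standard_normal((n_hid, n_out))
B  = 0.1 * rng.standard_normal((n_out, n_hid))   # fixed random feedback weights

def step(x, y_onehot):
    global W1, W2
    h = np.maximum(0.0, x @ W1)                  # ReLU hidden layer
    logits = h @ W2
    p = np.exp(logits - logits.max(1, keepdims=True))
    p /= p.sum(1, keepdims=True)
    e = p - y_onehot                             # softmax cross-entropy error
    dW2 = h.T @ e
    dh  = (e @ B) * (h > 0)                      # feedback alignment: B, not W2.T
    dW1 = x.T @ dh
    W1 -= lr * dW1 / len(x)
    W2 -= lr * dW2 / len(x)
```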
arXiv Detail & Related papers (2021-08-30T18:02:55Z) - RANK-NOSH: Efficient Predictor-Based Architecture Search via Non-Uniform
Successive Halving [74.61723678821049]
We propose NOn-uniform Successive Halving (NOSH), a hierarchical scheduling algorithm that terminates the training of underperforming architectures early to avoid wasting budget.
We formulate predictor-based architecture search as learning to rank with pairwise comparisons.
The resulting method, RANK-NOSH, reduces the search budget by 5x while achieving competitive or even better performance than previous state-of-the-art predictor-based methods on various spaces and datasets.
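For reference, the sketch below shows a plain (uniform) successive-halving loop, which RANK-NOSH generalizes with a non-uniform schedule and a pairwise-ranking predictor; `train_and_score` is a hypothetical callable assumed for the example.

```python
# Minimal uniform successive halving for illustration only; RANK-NOSH's
# non-uniform schedule and ranking predictor are not reproduced here.
# `train_and_score(arch, epochs)` is a hypothetical callable returning a
# validation score after training `arch` for `epochs` epochs.
def successive_halving(candidates, train_and_score, min_epochs=1, eta=2, rounds=3):
    epochs = min_epochs
    survivors = list(candidates)
    for _ in range(rounds):
        scored = sorted(survivors,
                        key=lambda arch: train_and_score(arch, epochs),
                        reverse=True)
        # Keep only the top 1/eta architectures and give them a larger budget,
        # so under-performing architectures are stopped early.
        survivors = scored[:max(1, len(scored) // eta)]
        epochs *= eta
    return survivors[0]
```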
arXiv Detail & Related papers (2021-08-18T07:45:21Z) - Improving Music Performance Assessment with Contrastive Learning [78.8942067357231]
This study investigates contrastive learning as a potential method to improve existing MPA systems.
We introduce a weighted contrastive loss suitable for regression tasks applied to a convolutional neural network.
Our results show that contrastive-learning-based methods are able to match and exceed SOTA performance for MPA regression tasks.
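As an illustration only, one possible form of a label-distance-weighted contrastive loss for regression is sketched below; the exact loss used in the paper may differ, and the weighting scheme here is an assumption.

```python
# One possible reading of a weighted contrastive loss for regression: pairs of
# embeddings are pulled together or pushed apart with a weight derived from the
# distance between their scalar labels. Illustrative sketch, not the paper's loss.
import torch

def weighted_contrastive_loss(embeddings, labels, margin=1.0, scale=1.0):
    # embeddings: (N, D); labels: (N,) continuous regression targets.
    dist = torch.cdist(embeddings, embeddings)              # pairwise embedding distances
    label_gap = (labels[:, None] - labels[None, :]).abs()   # pairwise label differences
    sim_weight = torch.exp(-scale * label_gap)              # close labels act as positives
    pull = sim_weight * dist.pow(2)                         # pull similar-label pairs together
    push = (1 - sim_weight) * torch.clamp(margin - dist, min=0).pow(2)
    mask = ~torch.eye(len(labels), dtype=torch.bool)        # ignore self-pairs
    return (pull + push)[mask].mean()
```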
arXiv Detail & Related papers (2021-08-03T19:24:25Z) - How Powerful are Performance Predictors in Neural Architecture Search? [43.86743225322636]
We give the first large-scale study of performance predictors by analyzing 31 techniques.
We show that certain families of predictors can be combined to achieve even better predictive power.
arXiv Detail & Related papers (2021-04-02T17:57:16Z) - Learning Neural Network Subspaces [74.44457651546728]
Recent observations have advanced our understanding of the neural network optimization landscape.
With a similar computational cost as training one model, we learn lines, curves, and simplexes of high-accuracy neural networks.
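A minimal sketch of the line case is given below, assuming a tiny two-layer MLP with explicit weight tensors: two endpoint parameter sets are trained so that a randomly sampled convex combination of them fits the data at every step. It is not the paper's implementation; the architecture and data are placeholders.

```python
# Sketch of learning a line of networks (assumed toy MLP, not the paper's code):
# gradients flow to both endpoints through the interpolated weights.
import torch

torch.manual_seed(0)
d_in, d_hid, d_out = 10, 32, 2

def make_endpoint():
    # One endpoint of the line: a full set of MLP weights.
    return [(0.1 * torch.randn(d_in, d_hid)).requires_grad_(),
            (0.1 * torch.randn(d_hid, d_out)).requires_grad_()]

end_a, end_b = make_endpoint(), make_endpoint()
opt = torch.optim.SGD(end_a + end_b, lr=0.1)

x, y = torch.randn(256, d_in), torch.randint(0, d_out, (256,))
for _ in range(100):
    alpha = torch.rand(())                 # random point on the line each step
    w1 = (1 - alpha) * end_a[0] + alpha * end_b[0]
    w2 = (1 - alpha) * end_a[1] + alpha * end_b[1]
    logits = torch.relu(x @ w1) @ w2
    loss = torch.nn.functional.cross_entropy(logits, y)
    opt.zero_grad(); loss.backward(); opt.step()
```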
arXiv Detail & Related papers (2021-02-20T23:26:58Z) - Uncovering Coresets for Classification With Multi-Objective Evolutionary
Algorithms [0.8057006406834467]
A coreset is a subset of the training set, using which a machine learning algorithm obtains performance similar to what it would deliver if trained on the whole original dataset.
A novel approach is presented: candidate coresets are iteratively optimized, adding and removing samples.
A multi-objective evolutionary algorithm is used to minimize simultaneously the number of points in the set and the classification error.
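The toy sketch below illustrates the general recipe, not the paper's exact algorithm: a coreset is a boolean mask over the training set, mutation adds and removes samples, and only Pareto-nondominated masks with respect to (coreset size, validation error) survive; `eval_error` is a hypothetical callable assumed for the example.

```python
# Toy multi-objective evolutionary coreset search (illustrative sketch only).
# `eval_error(mask)` is a hypothetical callable that trains a model on the
# selected samples and returns its validation error.
import numpy as np

def is_dominated(a, b):
    # b dominates a if b is no worse in both objectives and better in one.
    return b[0] <= a[0] and b[1] <= a[1] and (b[0] < a[0] or b[1] < a[1])

def evolve_coresets(n_train, eval_error, pop_size=20, generations=50, seed=0):
    rng = np.random.default_rng(seed)
    population = [rng.random(n_train) < 0.1 for _ in range(pop_size)]  # initial masks
    for _ in range(generations):
        # Mutate every mask by flipping a few bits (adding/removing samples).
        children = []
        for mask in population:
            child = mask.copy()
            flips = rng.integers(0, n_train, size=3)
            child[flips] = ~child[flips]
            children.append(child)
        candidates = population + children
        objs = [(int(m.sum()), eval_error(m)) for m in candidates]
        # Keep only the Pareto-nondominated masks for the next generation.
        keep = [i for i, a in enumerate(objs)
                if not any(is_dominated(a, b) for j, b in enumerate(objs) if j != i)]
        population = [candidates[i] for i in keep][:pop_size]
    return population
```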
arXiv Detail & Related papers (2020-02-20T09:59:56Z) - Large Batch Training Does Not Need Warmup [111.07680619360528]
Training deep neural networks using a large batch size has shown promising results and benefits many real-world applications.
In this paper, we propose a novel Complete Layer-wise Adaptive Rate Scaling (CLARS) algorithm for large-batch training.
Based on our analysis, we bridge the gap and illustrate the theoretical insights for three popular large-batch training techniques.
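For intuition, the sketch below shows a generic layer-wise adaptive rate scaling step in the spirit of LARS; CLARS's specific warmup-free rule and theoretical analysis are not reproduced, and the coefficients are placeholder values.

```python
# Generic layer-wise adaptive rate scaling step (LARS-style), shown only to
# illustrate the idea behind layer-wise large-batch training; not CLARS itself.
import torch

def layerwise_scaled_step(params, base_lr=1.0, trust_coef=0.001, eps=1e-8):
    with torch.no_grad():
        for p in params:
            if p.grad is None:
                continue
            w_norm = p.norm()
            g_norm = p.grad.norm()
            # Scale each layer's learning rate by ||w|| / ||grad||, so layers
            # with small gradients relative to their weights take larger steps.
            trust = trust_coef * w_norm / (g_norm + eps)
            p.add_(p.grad, alpha=-(base_lr * trust).item())
```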
arXiv Detail & Related papers (2020-02-04T23:03:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.