Optimizing Neural Networks with Gradient Lexicase Selection
- URL: http://arxiv.org/abs/2312.12606v1
- Date: Tue, 19 Dec 2023 21:21:25 GMT
- Title: Optimizing Neural Networks with Gradient Lexicase Selection
- Authors: Li Ding, Lee Spector
- Abstract summary: A potential drawback of using aggregated performance measurement in machine learning is that models may learn to accept higher errors on some training cases as compromises for lower errors on others.
We propose Gradient Lexicase Selection, an optimization framework that combines gradient descent and lexicase selection in an evolutionary fashion.
Our experimental results demonstrate that the proposed method improves the generalization performance of various widely-used deep neural network architectures.
- Score: 7.554281369517276
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One potential drawback of using aggregated performance measurement in machine
learning is that models may learn to accept higher errors on some training
cases as compromises for lower errors on others, with the lower errors actually
being instances of overfitting. This can lead to both stagnation at local
optima and poor generalization. Lexicase selection is an uncompromising method
developed in evolutionary computation, which selects models on the basis of
sequences of individual training case errors instead of using aggregated
metrics such as loss and accuracy. In this paper, we investigate how lexicase
selection, in its general form, can be integrated into the context of deep
learning to enhance generalization. We propose Gradient Lexicase Selection, an
optimization framework that combines gradient descent and lexicase selection in
an evolutionary fashion. Our experimental results demonstrate that the proposed
method improves the generalization performance of various widely-used deep
neural network architectures across three image classification benchmarks.
Additionally, qualitative analysis suggests that our method assists networks in
learning more diverse representations. Our source code is available on GitHub:
https://github.com/ld-ing/gradient-lexicase.
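As described in the abstract, lexicase selection compares candidates on a randomly ordered sequence of individual training-case errors rather than on an aggregated loss, and Gradient Lexicase Selection applies this to candidates produced by gradient descent. The sketch below is a minimal illustration of lexicase selection in its general form, not the authors' pipeline (see the GitHub repository above for the official code); the function name `lexicase_select`, the `epsilon` tolerance, and the toy error matrix are illustrative assumptions.

```python
# Minimal sketch of lexicase selection over a pool of candidates.
# Not the authors' implementation; in Gradient Lexicase Selection the
# candidates would be network variants produced by gradient descent
# (per the abstract above).
import random

def lexicase_select(case_errors, epsilon=0.0):
    """Return the index of one selected candidate.

    case_errors[i][j] is candidate i's error on training case j.
    epsilon > 0 relaxes "best on this case" by a fixed tolerance
    (a simplification of epsilon-lexicase selection).
    """
    num_cases = len(case_errors[0])
    survivors = list(range(len(case_errors)))
    # Consider training cases one at a time, in a random order.
    for case in random.sample(range(num_cases), num_cases):
        best = min(case_errors[i][case] for i in survivors)
        # Keep only candidates that are (near-)best on this case.
        survivors = [i for i in survivors
                     if case_errors[i][case] <= best + epsilon]
        if len(survivors) == 1:
            break
    # If several candidates survive every case, break ties at random.
    return random.choice(survivors)

# Hypothetical usage: three candidates evaluated on five training cases
# (0 = correct, 1 = error). Different random case orders can favor
# different candidates, so no single aggregate error score decides.
errors = [
    [0, 1, 0, 0, 1],
    [1, 0, 0, 1, 0],
    [0, 0, 1, 0, 1],
]
print(lexicase_select(errors))
```

Because the case ordering is re-randomized for each selection event, candidates that excel on different subsets of training cases can all be selected over time, which is the uncompromising behavior the abstract contrasts with aggregated metrics such as loss and accuracy.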
Related papers
- Neural Network-Based Score Estimation in Diffusion Models: Optimization and Generalization [12.812942188697326]
Diffusion models have emerged as a powerful tool rivaling GANs in generating high-quality samples with improved fidelity, flexibility, and robustness.
A key component of these models is to learn the score function through score matching.
Despite empirical success on various tasks, it remains unclear whether gradient-based algorithms can learn the score function with a provable accuracy.
arXiv Detail & Related papers (2024-01-28T08:13:56Z) - Explicit Foundation Model Optimization with Self-Attentive Feed-Forward Neural Units [4.807347156077897]
Iterative approximation methods using backpropagation enable the optimization of neural networks, but they remain computationally expensive when used at scale.
This paper presents an efficient alternative for optimizing neural networks that reduces the costs of scaling neural networks and provides high-efficiency optimizations for low-resource applications.
arXiv Detail & Related papers (2023-11-13T17:55:07Z) - Learning to Branch in Combinatorial Optimization with Graph Pointer Networks [17.729352126574902]
This paper proposes a graph pointer network model for learning the variable selection policy in branch-and-bound.
The proposed model, which combines the graph neural network and the pointer mechanism, can effectively map from the solver state to the branching variable decisions.
arXiv Detail & Related papers (2023-07-04T01:56:07Z) - Learning Large-scale Neural Fields via Context Pruned Meta-Learning [60.93679437452872]
We introduce an efficient optimization-based meta-learning technique for large-scale neural field training.
We show how gradient re-scaling at meta-test time allows the learning of extremely high-quality neural fields.
Our framework is model-agnostic, intuitive, straightforward to implement, and shows significant reconstruction improvements for a wide range of signals.
arXiv Detail & Related papers (2023-02-01T17:32:16Z) - Scaling Forward Gradient With Local Losses [117.22685584919756]
Forward learning is a biologically plausible alternative to backprop for learning deep neural networks.
We show that it is possible to substantially reduce the variance of the forward gradient by applying perturbations to activations rather than weights.
Our approach matches backprop on MNIST and CIFAR-10 and significantly outperforms previously proposed backprop-free algorithms on ImageNet.
arXiv Detail & Related papers (2022-10-07T03:52:27Z) - Fast and scalable neuroevolution deep learning architecture search for multivariate anomaly detection [0.0]
The work concentrates on improvements to a multi-level neuroevolution approach for anomaly detection.
The presented framework can be used as an efficient network architecture search method for other unsupervised tasks.
arXiv Detail & Related papers (2021-12-10T16:14:43Z) - A Continuous Optimisation Benchmark Suite from Neural Network Regression [0.0]
Training neural networks is an optimisation task that has gained prominence with the recent successes of deep learning.
Gradient descent variants are by far the most common choice, owing to their reliably good performance on large-scale machine learning tasks.
We contribute CORNN, a suite for benchmarking the performance of any continuous black-box algorithm on neural network training problems.
arXiv Detail & Related papers (2021-09-12T20:24:11Z) - Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization [118.50301177912381]
We show that Adam can converge to different solutions of the objective with provably different errors, even with weight decay regularization.
We show that if the objective is convex and weight decay regularization is employed, any optimization algorithm, including Adam, will converge to the same solution.
arXiv Detail & Related papers (2021-08-25T17:58:21Z) - Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z) - Non-Gradient Manifold Neural Network [79.44066256794187]
A deep neural network (DNN) generally takes thousands of iterations to optimize via gradient descent.
We propose a novel manifold neural network based on non-gradient optimization.
arXiv Detail & Related papers (2021-06-15T06:39:13Z) - Meta-Solver for Neural Ordinary Differential Equations [77.8918415523446]
We investigate how the variability in the space of solvers can improve the performance of neural ODEs.
We show that the right choice of solver parameterization can significantly affect neural ODE models in terms of robustness to adversarial attacks.
arXiv Detail & Related papers (2021-03-15T17:26:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.