Automatic Neural Network Hyperparameter Optimization for Extrapolation:
Lessons Learned from Visible and Near-Infrared Spectroscopy of Mango Fruit
- URL: http://arxiv.org/abs/2210.01124v1
- Date: Mon, 3 Oct 2022 00:41:05 GMT
- Authors: Matthew Dirks, David Poole
- Abstract summary: This paper considers automatic methods for configuring a neural network that extrapolates in time for the domain of visible and near-infrared (VNIR) spectroscopy.
To encourage the neural network model to extrapolate, we consider validating model configurations on samples that are shifted in time similar to the test set.
We find that ensembling improves the state-of-the-art model's variance and accuracy.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Neural networks are configured by choosing an architecture and hyperparameter
values; doing so often involves expert intuition and hand-tuning to find a
configuration that extrapolates well without overfitting. This paper considers
automatic methods for configuring a neural network that extrapolates in time
for the domain of visible and near-infrared (VNIR) spectroscopy. In particular,
we study the effect of (a) selecting samples for validating configurations and
(b) using ensembles.
Most of the time, models are built on the past to predict the future. To
encourage the neural network model to extrapolate, we consider validating model
configurations on samples that are shifted in time similar to the test set. We
experiment with three validation set choices: (1) a random sample of 1/3 of
non-test data (the technique used in previous work), (2) using the latest 1/3
(sorted by time), and (3) using a semantically meaningful subset of the data.
Hyperparameter optimization relies on the validation set to estimate test-set
error, but neural network variance obfuscates the true error value. Ensemble
averaging - computing the average across many neural networks - can reduce the
variance of prediction errors.
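As a minimal illustration of why ensemble averaging reduces variance (a sketch with synthetic data, not the paper's implementation), independent per-model errors largely cancel when predictions are averaged:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for neural-network variance: each "model" predicts
# the true function plus independent noise from its own training run.
def noisy_model_predictions(x, n_models):
    true = np.sin(x)
    noise = rng.normal(scale=0.3, size=(n_models, x.size))
    return true + noise  # shape: (n_models, n_samples)

x = np.linspace(0, 3, 50)
preds = noisy_model_predictions(x, n_models=30)

# Error of one network vs. error of the ensemble average.
single_mse = np.mean((preds[0] - np.sin(x)) ** 2)
ensemble_mse = np.mean((preds.mean(axis=0) - np.sin(x)) ** 2)
print(single_mse, ensemble_mse)
```

Because the per-model errors here are independent, the noise component of the ensemble's MSE shrinks roughly in proportion to the number of models averaged; real neural networks are correlated, so the gain is smaller but the direction is the same.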
To test these methods, we do a comprehensive study of a held-out 2018 harvest
season of mango fruit given VNIR spectra from 3 prior years. We find that
ensembling improves the state-of-the-art model's variance and accuracy.
Furthermore, hyperparameter optimization experiments - with and without
ensemble averaging and with each validation set choice - show that when
ensembling is combined with using the latest 1/3 of samples as the validation
set, a neural network configuration is found automatically that is on par with
the state-of-the-art.
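The three validation-set choices above can be sketched as follows; the function name, signature, and the unimplemented "semantic" branch are illustrative assumptions, not the authors' code:

```python
import numpy as np

def validation_split(y, timestamps, strategy, rng=None):
    """Return (train_idx, val_idx) under one of the three strategies
    discussed above. Names and signature are illustrative only."""
    n = len(y)
    n_val = n // 3
    if strategy == "random":        # (1) random 1/3 of non-test data
        rng = rng or np.random.default_rng(0)
        perm = rng.permutation(n)
        return perm[n_val:], perm[:n_val]
    elif strategy == "latest":      # (2) latest 1/3, sorted by time
        order = np.argsort(timestamps)
        return order[:-n_val], order[-n_val:]
    elif strategy == "semantic":    # (3) a semantically meaningful subset,
        raise NotImplementedError(  #     e.g. an entire harvest season
            "domain-specific; depends on the dataset's metadata")
    else:
        raise ValueError(strategy)
```

The "latest" strategy mimics the train/test shift: the validation samples are displaced in time from the training samples, just as the held-out 2018 harvest is displaced from the 3 prior years.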
Related papers
- Just How Flexible are Neural Networks in Practice? [89.80474583606242]
It is widely believed that a neural network can fit a training set containing at least as many samples as it has parameters.
In practice, however, we only find the solutions reachable by our training procedure, including the gradient-based optimizer and regularizers, which limits flexibility.
arXiv Detail & Related papers (2024-06-17T12:24:45Z)
- LARA: A Light and Anti-overfitting Retraining Approach for Unsupervised Time Series Anomaly Detection [49.52429991848581]
We propose a Light and Anti-overfitting Retraining Approach (LARA) for deep variational auto-encoder (VAE) based time series anomaly detection methods.
This work makes three novel contributions: 1) the retraining process is formulated as a convex problem, so it converges quickly and resists overfitting; 2) a ruminate block is designed that leverages historical data without the need to store it; and 3) it is proven mathematically that, when fine-tuning the latent vector and reconstructed data, linear formations achieve the least adjusting error between the ground truths and the fine-tuned values.
arXiv Detail & Related papers (2023-10-09T12:36:16Z)
- The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare.
Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z)
- Sampling weights of deep neural networks [1.2370077627846041]
We introduce a probability distribution, combined with an efficient sampling algorithm, for weights and biases of fully-connected neural networks.
In a supervised learning context, no iterative optimization or gradient computations of internal network parameters are needed.
We prove that sampled networks are universal approximators.
arXiv Detail & Related papers (2023-06-29T10:13:36Z)
- Sparsifying Bayesian neural networks with latent binary variables and normalizing flows [10.865434331546126]
We will consider two extensions to the latent binary Bayesian neural networks (LBBNN) method.
Firstly, by using the local reparametrization trick (LRT) to sample the hidden units directly, we get a more computationally efficient algorithm.
More importantly, by using normalizing flows on the variational posterior distribution of the LBBNN parameters, the network learns a more flexible variational posterior distribution than the mean field Gaussian.
arXiv Detail & Related papers (2023-05-05T09:40:28Z)
- Hybrid machine-learned homogenization: Bayesian data mining and convolutional neural networks [0.0]
This study aims to improve the machine learned prediction by developing novel feature descriptors.
The iterative development of feature descriptors resulted in 37 novel features, being able to reduce the prediction error by roughly one third.
A combination of the feature based approach and the convolutional neural network leads to a hybrid neural network.
arXiv Detail & Related papers (2023-02-24T09:59:29Z)
- Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
arXiv Detail & Related papers (2022-09-26T17:59:58Z)
- Towards an Understanding of Benign Overfitting in Neural Networks [104.2956323934544]
Modern machine learning models often employ a huge number of parameters and are typically optimized to have zero training loss.
We examine how these benign overfitting phenomena occur in a two-layer neural network setting.
We show that it is possible for the two-layer ReLU network interpolator to achieve a near minimax-optimal learning rate.
arXiv Detail & Related papers (2021-06-06T19:08:53Z)
- Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654]
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.
However, they are often overconfident in their predictions, which leads to inaccurate and miscalibrated probabilistic predictions.
We propose a solution by seeking out regions of feature space where the model is unjustifiably overconfident, and conditionally raising the entropy of those predictions towards that of the prior distribution of the labels.
arXiv Detail & Related papers (2021-02-22T07:02:37Z)
- Neural Model-based Optimization with Right-Censored Observations [42.530925002607376]
Neural networks (NNs) have been demonstrated to work well at the core of model-based optimization procedures.
We show that our trained regression models achieve a better predictive quality than several baselines.
arXiv Detail & Related papers (2020-09-29T07:32:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.