Mutual Information Decay Curves and Hyper-Parameter Grid Search Design
for Recurrent Neural Architectures
- URL: http://arxiv.org/abs/2012.04632v1
- Date: Tue, 8 Dec 2020 18:52:01 GMT
- Title: Mutual Information Decay Curves and Hyper-Parameter Grid Search Design
for Recurrent Neural Architectures
- Authors: Abhijit Mahalunkar and John D. Kelleher
- Abstract summary: We use mutual information to analyze long distance dependencies (LDDs) within a dataset.
We obtain state-of-the-art results for DilatedRNNs across a range of benchmark datasets.
- Score: 1.2894104422808241
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We present an approach to design the grid searches for hyper-parameter
optimization for recurrent neural architectures. The basis for this approach is
the use of mutual information to analyze long distance dependencies (LDDs)
within a dataset. We also report a set of experiments that demonstrate how,
using this approach, we obtain state-of-the-art results for DilatedRNNs across
a range of benchmark datasets.
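As a rough illustration of the idea in the abstract, the sketch below estimates the mutual information between symbols that sit a fixed distance apart in a sequence and collects those estimates into a decay curve. It is a minimal plug-in (frequency-count) estimator written for this page, not the authors' implementation: the function names `mutual_information_at_lag` and `mi_decay_curve` are hypothetical, and the abstract does not specify which estimator the paper uses or how the curve is mapped onto a hyper-parameter grid.

```python
import math
from collections import Counter


def mutual_information_at_lag(seq, lag):
    """Plug-in estimate, in bits, of I(X_t; X_{t+lag}) for a symbol sequence.

    Assumption: simple frequency counts are used as probabilities. This is
    biased for short sequences and is only meant as an illustration.
    """
    if lag < 1:
        raise ValueError("lag must be a positive integer")
    # Pair every symbol with the symbol `lag` positions later.
    pairs = list(zip(seq, seq[lag:]))
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in joint.items():
        # p(a, b) * log2( p(a, b) / (p(a) * p(b)) ), written with raw counts.
        mi += (count / n) * math.log2(count * n / (left[a] * right[b]))
    return mi


def mi_decay_curve(seq, max_lag):
    """Return the estimated mutual information for lags 1..max_lag."""
    return [mutual_information_at_lag(seq, d) for d in range(1, max_lag + 1)]


if __name__ == "__main__":
    periodic = "ab" * 50  # strongly dependent at every lag
    for lag, value in enumerate(mi_decay_curve(periodic, 4), start=1):
        print(f"lag={lag} MI={value:.3f} bits")
```

On a real dataset, the lags at which such a curve stays clearly above zero indicate how far apart dependent symbols are, which is the kind of information the abstract proposes using when choosing which hyper-parameter values a grid search should cover.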
Related papers
- POMONAG: Pareto-Optimal Many-Objective Neural Architecture Generator [4.09225917049674]
Transferable NAS has emerged, generalizing the search process from dataset-dependent to task-dependent.
This paper introduces POMONAG, extending DiffusionNAG via a many-objective diffusion process.
Results were validated on two search spaces -- NAS201 and MobileNetV3 -- and evaluated across 15 image classification datasets.
arXiv Detail & Related papers (2024-09-30T16:05:29Z)
- Heterogeneous Learning Rate Scheduling for Neural Architecture Search on Long-Tailed Datasets [0.0]
We propose a novel adaptive learning rate scheduling strategy tailored for the architecture parameters of DARTS.
Our approach dynamically adjusts the learning rate of the architecture parameters based on the training epoch, preventing the disruption of well-trained representations.
arXiv Detail & Related papers (2024-06-11T07:32:25Z)
- Quantifying uncertainty for deep learning based forecasting and flow-reconstruction using neural architecture search ensembles [0.8258451067861933]
We present an automated approach to deep neural network (DNN) discovery and demonstrate how this may also be utilized for ensemble-based uncertainty quantification.
We highlight how the proposed method not only discovers high-performing neural network ensembles for our tasks, but also quantifies uncertainty seamlessly.
We demonstrate the feasibility of this framework for two tasks - forecasting from historical data and flow reconstruction from sparse sensors for the sea-surface temperature.
arXiv Detail & Related papers (2023-02-20T03:57:06Z)
- Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
arXiv Detail & Related papers (2022-09-26T17:59:58Z)
- Exploiting Temporal Structures of Cyclostationary Signals for Data-Driven Single-Channel Source Separation [98.95383921866096]
We study the problem of single-channel source separation (SCSS).
We focus on cyclostationary signals, which are particularly suitable in a variety of application domains.
We propose a deep learning approach using a U-Net architecture, which is competitive with the minimum MSE estimator.
arXiv Detail & Related papers (2022-08-22T14:04:56Z)
- A Local Optima Network Analysis of the Feedforward Neural Architecture Space [0.0]
Local optima network (LON) analysis is a derivative of fitness landscape analysis of candidate solutions.
LONs may provide a viable paradigm by which to analyse and optimise neural architectures.
arXiv Detail & Related papers (2022-06-02T08:09:17Z)
- A novel Deep Neural Network architecture for non-linear system identification [78.69776924618505]
We present a novel Deep Neural Network (DNN) architecture for non-linear system identification.
Inspired by fading memory systems, we introduce inductive bias (on the architecture) and regularization (on the loss function).
This architecture allows for automatic complexity selection based solely on available data.
arXiv Detail & Related papers (2021-06-06T10:06:07Z)
- PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning [109.84770951839289]
We present PredRNN, a new recurrent network for learning visual dynamics from historical context.
We show that our approach obtains highly competitive results on three standard datasets.
arXiv Detail & Related papers (2021-03-17T08:28:30Z)
- Deep Cellular Recurrent Network for Efficient Analysis of Time-Series Data with Spatial Information [52.635997570873194]
This work proposes a novel deep cellular recurrent neural network (DCRNN) architecture to process complex multi-dimensional time series data with spatial information.
The proposed architecture achieves state-of-the-art performance while using substantially fewer trainable parameters than comparable methods in the literature.
arXiv Detail & Related papers (2021-01-12T20:08:18Z)
- Hyperparameter Optimization in Neural Networks via Structured Sparse Recovery [54.60327265077322]
We study two important problems in the automated design of neural networks through the lens of sparse recovery methods.
In the first part of this paper, we establish a novel connection between HPO and structured sparse recovery.
In the second part of this paper, we establish a connection between NAS and structured sparse recovery.
arXiv Detail & Related papers (2020-07-07T00:57:09Z)
- A Neural Architecture for Detecting Confusion in Eye-tracking Data [1.8655840060559168]
We introduce an architecture that uses RNN and CNN sub-models in parallel to take advantage of the temporal and visuospatial aspects of our data.
Our model outperforms an existing model based on Random Forests, resulting in a 22% improvement in combined sensitivity & specificity.
arXiv Detail & Related papers (2020-03-13T18:20:39Z)