Understanding Patterns of Deep Learning Model Evolution in Network Architecture Search
- URL: http://arxiv.org/abs/2309.12576v1
- Date: Fri, 22 Sep 2023 02:12:47 GMT
- Title: Understanding Patterns of Deep Learning Model Evolution in Network Architecture Search
- Authors: Robert Underwood, Meghana Madhyastha, Randal Burns, Bogdan Nicolae
- Abstract summary: We show how the evolution of the model structure is influenced by the regularized evolution algorithm.
We describe how evolutionary patterns appear in distributed settings and opportunities for caching and improved scheduling.
- Score: 0.8124699127636158
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Network Architecture Search, and specifically Regularized Evolution, is a
common way to refine the structure of a deep learning model. However, little is
known about how models empirically evolve over time, which has implications for
designing caching policies, refining the search algorithm for particular
applications, and other important use cases. In this work, we algorithmically
analyze and quantitatively characterize the patterns of model evolution for a
set of models from the CANDLE project and the NAS-Bench-201 search space. We
show how the evolution of the model structure is influenced by
the regularized evolution algorithm. We describe how evolutionary patterns
appear in distributed settings and opportunities for caching and improved
scheduling. Lastly, we describe the conditions that affect when particular
model architectures rise and fall in popularity based on their frequency of
acting as a donor in a sliding window.
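As a concrete reference for the donor analysis described above, here is a minimal sketch of regularized evolution with a sliding-window donor counter. It is illustrative only, not the authors' implementation: `mutate` and `evaluate` are hypothetical stand-ins, and architectures are reduced to opaque numbers.

```python
# Minimal sketch of regularized evolution with a sliding window that counts
# how often each architecture acts as a donor (tournament winner).
import collections
import random

def regularized_evolution(mutate, evaluate, cycles=1000, population_size=50,
                          sample_size=10, window=100):
    population = collections.deque()                  # oldest model on the left
    donor_history = collections.deque(maxlen=window)  # sliding window of donors
    for _ in range(population_size):                  # random initial population
        arch = random.random()                        # stand-in for a real architecture
        population.append((arch, evaluate(arch)))
    for _ in range(cycles):
        sample = random.sample(list(population), sample_size)
        parent = max(sample, key=lambda m: m[1])      # tournament winner acts as donor
        donor_history.append(parent[0])
        child = mutate(parent[0])
        population.append((child, evaluate(child)))
        population.popleft()                          # aging: evict the oldest, not the worst
    # donor frequency within the window approximates an architecture's current popularity
    return collections.Counter(donor_history)

# Toy usage with hypothetical operators:
popularity = regularized_evolution(mutate=lambda a: a + random.gauss(0, 0.1),
                                   evaluate=lambda a: -abs(a - 0.5))
print(popularity.most_common(3))
```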
Related papers
- Mining Frequent Structures in Conceptual Models [2.841785306638839]
We propose a general approach to the problem of discovering frequent structures in conceptual modeling languages.
We use a combination of a frequent subgraph mining algorithm and graph manipulation techniques.
The primary objective is to offer a support facility for language engineers.
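To make "frequent structures" concrete, here is a toy single-edge version of the idea; a real system would use a full frequent subgraph miner such as gSpan, and the triple representation of a conceptual model here is an assumption made for illustration.

```python
# Toy sketch: mine single-edge patterns (type triples) that occur in at least
# `min_support` conceptual models. Full structures need a real subgraph miner.
from collections import Counter

def frequent_edge_patterns(models, min_support=2):
    # models: list of graphs, each a set of (source_type, relation, target_type)
    support = Counter()
    for model in models:
        for triple in set(model):      # count each pattern at most once per model
            support[triple] += 1
    return {p: s for p, s in support.items() if s >= min_support}

models = [
    {("Person", "owns", "Car"), ("Person", "worksFor", "Company")},
    {("Person", "owns", "Car"), ("Car", "hasPart", "Engine")},
]
print(frequent_edge_patterns(models))  # {('Person', 'owns', 'Car'): 2}
```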
arXiv Detail & Related papers (2024-06-11T10:24:02Z)
- Internal Representations of Vision Models Through the Lens of Frames on Data Manifolds [8.67467876089153]
We present a new approach to studying such representations inspired by the idea of a frame on the tangent bundle of a manifold.
Our construction, which we call a neural frame, is formed by assembling a set of vectors representing specific types of perturbations of a data point.
Using neural frames, we make observations about the way that models process, layer-by-layer, specific modes of variation within a small neighborhood of a datapoint.
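A minimal sketch of that idea, assuming finite differences stand in for the exact pushforward; the two-layer toy network and the perturbation directions below are hypothetical, not the paper's construction.

```python
# Push a set of perturbation directions at a data point through each layer via
# finite differences, yielding per-layer images of local modes of variation.
import numpy as np

rng = np.random.default_rng(0)
layers = [rng.standard_normal((16, 8)), rng.standard_normal((4, 16))]  # toy MLP

def forward_per_layer(x):
    acts = []
    for W in layers:
        x = np.tanh(W @ x)
        acts.append(x)
    return acts

x = rng.standard_normal(8)
frame = [rng.standard_normal(8) for _ in range(3)]  # perturbation directions
eps = 1e-4
base = forward_per_layer(x)
for v in frame:
    pushed = forward_per_layer(x + eps * v)
    layer_images = [(p - b) / eps for p, b in zip(pushed, base)]
    print([float(np.linalg.norm(img)) for img in layer_images])  # norm per layer
```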
arXiv Detail & Related papers (2022-11-19T01:48:19Z)
- Amortized Inference for Causal Structure Learning [72.84105256353801]
Learning causal structure poses a search problem that typically involves evaluating structures using a score or independence test.
We train a variational inference model to predict the causal structure from observational/interventional data.
Our models exhibit robust generalization capabilities under substantial distribution shift.
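A hedged sketch of the amortized idea under strong simplifying assumptions (linear SCMs, a covariance summary, supervised training on simulated graphs); the class and helper names are hypothetical, not the authors' architecture.

```python
# Train a network to map a whole dataset to edge logits of a causal graph,
# supervised on simulated (data, graph) pairs.
import torch
import torch.nn as nn

d = 3  # number of observed variables

class AmortizedStructurePredictor(nn.Module):
    def __init__(self, d, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(d * d, hidden), nn.ReLU(),
                                 nn.Linear(hidden, d * d))
    def forward(self, data):                       # data: (n_samples, d)
        cov = (data.T @ data) / data.shape[0]      # second-moment summary
        return self.net(cov.flatten()).view(d, d)  # logits for a d x d adjacency

def simulate_pair(n=256):
    adj = torch.triu((torch.rand(d, d) < 0.5).float(), diagonal=1)  # random DAG
    x = torch.randn(n, d)
    for j in range(d):                             # linear SCM in topological order
        x[:, j] += x @ adj[:, j]
    return x, adj

model = AmortizedStructurePredictor(d)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()
for step in range(200):
    x, adj = simulate_pair()
    loss = loss_fn(model(x), adj)
    opt.zero_grad()
    loss.backward()
    opt.step()
print(torch.sigmoid(model(x)).round())             # predicted adjacency, last batch
```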
arXiv Detail & Related papers (2022-05-25T17:37:08Z)
- Model-Based Deep Learning: On the Intersection of Deep Learning and Optimization [101.32332941117271]
Decision-making algorithms are used in a multitude of different applications.
Deep learning approaches that use highly parametric architectures tuned from data without relying on mathematical models are becoming increasingly popular.
Model-based optimization and data-centric deep learning are often considered to be distinct disciplines.
arXiv Detail & Related papers (2022-05-05T13:40:08Z)
- Complex Evolutional Pattern Learning for Temporal Knowledge Graph Reasoning [60.94357727688448]
Temporal knowledge graph (TKG) reasoning aims to predict potential future facts given historical KG sequences.
The evolutional patterns are complex in two aspects, length-diversity and time-variability.
We propose a new model, called Complex Evolutional Network (CEN), which uses a length-aware Convolutional Neural Network (CNN) to handle evolutional patterns of different lengths.
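A hedged sketch of the length-aware idea (not the CEN implementation): 1-D convolutions with several kernel widths run over a history of snapshot embeddings, so patterns of different lengths are captured by different branches. The class name and fusion rule are assumptions.

```python
# Multi-width Conv1d branches over a temporal history of embeddings.
import torch
import torch.nn as nn

class LengthAwareEncoder(nn.Module):
    def __init__(self, dim=32, kernel_sizes=(2, 3, 5)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv1d(dim, dim, k, padding=k - 1) for k in kernel_sizes)
    def forward(self, history):               # history: (batch, dim, time)
        feats = [branch(history).amax(dim=-1) for branch in self.branches]
        return torch.stack(feats).mean(0)     # fuse branches -> (batch, dim)

enc = LengthAwareEncoder()
print(enc(torch.randn(4, 32, 10)).shape)      # torch.Size([4, 32])
```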
arXiv Detail & Related papers (2022-03-15T11:02:55Z)
- Sparse Flows: Pruning Continuous-depth Models [107.98191032466544]
We show that pruning improves generalization for neural ODEs in generative modeling.
We also show that pruning finds minimal and efficient neural ODE representations with up to 98% fewer parameters than the original network, without loss of accuracy.
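A generic magnitude-pruning sketch (not the paper's Sparse Flows code): zero out the smallest-magnitude weights of a network, here a toy vector field such as a neural ODE might use, and report the resulting sparsity.

```python
# L1 (magnitude) unstructured pruning of a toy vector-field network.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

field = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 2))
for module in field:
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.9)  # drop 90%

total = sum(p.numel() for p in field.parameters() if p.dim() > 1)
zeros = sum((m.weight == 0).sum().item()
            for m in field if isinstance(m, nn.Linear))
print(f"sparsity: {zeros / total:.0%}")  # ~90% of weights are now zero
```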
arXiv Detail & Related papers (2021-06-24T01:40:17Z)
- Redefining Neural Architecture Search of Heterogeneous Multi-Network Models by Characterizing Variation Operators and Model Components [71.03032589756434]
We investigate the effect of different variation operators in a complex domain, that of multi-network heterogeneous neural models.
We characterize the variation operators according to their effect on the complexity and performance of the model, and the models themselves using diverse metrics that estimate the quality of their different components.
arXiv Detail & Related papers (2021-06-16T17:12:26Z)
- Towards a Predictive Processing Implementation of the Common Model of Cognition [79.63867412771461]
We describe an implementation of the common model of cognition grounded in neural generative coding and holographic associative memory.
The proposed system creates the groundwork for developing agents that learn continually from diverse tasks as well as model human performance at larger scales.
arXiv Detail & Related papers (2021-05-15T22:55:23Z)
- EPNE: Evolutionary Pattern Preserving Network Embedding [26.06068388979255]
We propose EPNE, a temporal network embedding model preserving evolutionary patterns of the local structure of nodes.
With adequate modeling of temporal information, our model is able to outperform other competitive methods in various prediction tasks.
arXiv Detail & Related papers (2020-09-24T06:31:14Z)
- Factorized Deep Generative Models for Trajectory Generation with Spatiotemporal-Validity Constraints [10.960924101404498]
Deep generative models for trajectory data can learn expressive explanatory models for sophisticated latent patterns.
We first propose novel deep generative models factorizing time-variant and time-invariant latent variables.
We then develop new inference strategies based on variational inference and constrained optimization to enforce temporal validity.
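A minimal sketch of the factorization idea (not the authors' model): encode a trajectory into one time-invariant latent plus per-step time-variant latents; the encoder design and names below are assumptions.

```python
# Factorize a trajectory into time-invariant and time-variant latent codes.
import torch
import torch.nn as nn

class FactorizedEncoder(nn.Module):
    def __init__(self, d=2, z_inv=8, z_var=8, hidden=32):
        super().__init__()
        self.gru = nn.GRU(d, hidden, batch_first=True)
        self.to_invariant = nn.Linear(hidden, z_inv)  # one code per trajectory
        self.to_variant = nn.Linear(hidden, z_var)    # one code per time step
    def forward(self, traj):                  # traj: (batch, time, d)
        steps, last = self.gru(traj)
        return self.to_invariant(last[-1]), self.to_variant(steps)

enc = FactorizedEncoder()
z_inv, z_var = enc(torch.randn(4, 20, 2))
print(z_inv.shape, z_var.shape)               # (4, 8) and (4, 20, 8)
```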
arXiv Detail & Related papers (2020-09-20T02:06:36Z)