Related papers: Learning to Flow from Generative Pretext Tasks for Neural Architecture Encoding

Learning to Flow from Generative Pretext Tasks for Neural Architecture Encoding

URL: http://arxiv.org/abs/2510.18360v1
Date: Tue, 21 Oct 2025 07:29:15 GMT
Title: Learning to Flow from Generative Pretext Tasks for Neural Architecture Encoding
Authors: Sunwoo Kim, Hyunjin Hwang, Kijung Shin,
Abstract summary: We propose FGP, a novel pre-training method for neural architecture encoding.<n>FGP trains an encoder to reconstruct a flow surrogate, our proposed representation of the neural architecture's information flow.<n>Our experiments show that FGP boosts encoder performance by up to 106% in Precision-1%, compared to the same encoder trained solely with supervised learning.
Score: 32.929829644407256
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: The performance of a deep learning model on a specific task and dataset depends heavily on its neural architecture, motivating considerable efforts to rapidly and accurately identify architectures suited to the target task and dataset. To achieve this, researchers use machine learning models-typically neural architecture encoders-to predict the performance of a neural architecture. Many state-of-the-art encoders aim to capture information flow within a neural architecture, which reflects how information moves through the forward pass and backpropagation, via a specialized model structure. However, due to their complicated structures, these flow-based encoders are significantly slower to process neural architectures compared to simpler encoders, presenting a notable practical challenge. To address this, we propose FGP, a novel pre-training method for neural architecture encoding that trains an encoder to capture the information flow without requiring specialized model structures. FGP trains an encoder to reconstruct a flow surrogate, our proposed representation of the neural architecture's information flow. Our experiments show that FGP boosts encoder performance by up to 106% in Precision-1%, compared to the same encoder trained solely with supervised learning.

Related papers

FlowerFormer: Empowering Neural Architecture Encoding using a Flow-aware Graph Transformer [22.99322654701362]
We introduce FlowerFormer, a powerful graph transformer that incorporates the information flows within a neural architecture. Our experiments demonstrate the superiority of FlowerFormer over existing neural encoding methods.
arXiv Detail & Related papers (2024-03-19T15:21:10Z)
Neural Architecture Codesign for Fast Bragg Peak Analysis [1.7081438846690533]
We develop an automated pipeline to streamline neural architecture codesign for fast, real-time Bragg peak analysis in microscopy. Our method employs neural architecture search and AutoML to enhance these models, including hardware costs, leading to the discovery of more hardware-efficient neural architectures.
arXiv Detail & Related papers (2023-12-10T19:42:18Z)
Set-based Neural Network Encoding Without Weight Tying [91.37161634310819]
We propose a neural network weight encoding method for network property prediction.<n>Our approach is capable of encoding neural networks in a model zoo of mixed architecture.<n>We introduce two new tasks for neural network property prediction: cross-dataset and cross-architecture.
arXiv Detail & Related papers (2023-05-26T04:34:28Z)
NAR-Former: Neural Architecture Representation Learning towards Holistic Attributes Prediction [37.357949900603295]
We propose a neural architecture representation model that can be used to estimate attributes holistically. Experiment results show that our proposed framework can be used to predict the latency and accuracy attributes of both cell architectures and whole deep neural networks.
arXiv Detail & Related papers (2022-11-15T10:15:21Z)
Towards Theoretically Inspired Neural Initialization Optimization [66.04735385415427]
We propose a differentiable quantity, named GradCosine, with theoretical insights to evaluate the initial state of a neural network. We show that both the training and test performance of a network can be improved by maximizing GradCosine under norm constraint. Generalized from the sample-wise analysis into the real batch setting, NIO is able to automatically look for a better initialization with negligible cost.
arXiv Detail & Related papers (2022-10-12T06:49:16Z)
FlowNAS: Neural Architecture Search for Optical Flow Estimation [65.44079917247369]
We propose a neural architecture search method named FlowNAS to automatically find the better encoder architecture for flow estimation task. Experimental results show that the discovered architecture with the weights inherited from the super-network achieves 4.67% F1-all error on KITTI.
arXiv Detail & Related papers (2022-07-04T09:05:25Z)
Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs. By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
Learning Structures for Deep Neural Networks [99.8331363309895]
We propose to adopt the efficient coding principle, rooted in information theory and developed in computational neuroscience. We show that sparse coding can effectively maximize the entropy of the output signals. Our experiments on a public image classification dataset demonstrate that using the structure learned from scratch by our proposed algorithm, one can achieve a classification accuracy comparable to the best expert-designed structure.
arXiv Detail & Related papers (2021-05-27T12:27:24Z)
The Nonlinearity Coefficient -- A Practical Guide to Neural Architecture Design [3.04585143845864]
We develop methods that can predict, without any training, whether an architecture will achieve a relatively high test or training error on a task after training. We then go on to explain the error in terms of the architecture definition itself and develop tools for modifying the architecture. Our first major contribution is to show that the 'degree of nonlinearity' of a neural architecture is a key causal driver behind its performance.
arXiv Detail & Related papers (2021-05-25T20:47:43Z)
Incremental Training of a Recurrent Neural Network Exploiting a Multi-Scale Dynamic Memory [79.42778415729475]
We propose a novel incrementally trained recurrent architecture targeting explicitly multi-scale learning. We show how to extend the architecture of a simple RNN by separating its hidden state into different modules. We discuss a training algorithm where new modules are iteratively added to the model to learn progressively longer dependencies.
arXiv Detail & Related papers (2020-06-29T08:35:49Z)
A Semi-Supervised Assessor of Neural Architectures [157.76189339451565]
We employ an auto-encoder to discover meaningful representations of neural architectures. A graph convolutional neural network is introduced to predict the performance of architectures.
arXiv Detail & Related papers (2020-05-14T09:02:33Z)

This list is automatically generated from the titles and abstracts of the papers in this site.