AdaXpert: Adapting Neural Architecture for Growing Data
- URL: http://arxiv.org/abs/2107.00254v1
- Date: Thu, 1 Jul 2021 07:22:05 GMT
- Title: AdaXpert: Adapting Neural Architecture for Growing Data
- Authors: Shuaicheng Niu, Jiaxiang Wu, Guanghui Xu, Yifan Zhang, Yong Guo,
Peilin Zhao, Peng Wang, Mingkui Tan
- Abstract summary: In real-world applications, data often come in a growing manner, where the data volume and the number of classes may increase dynamically.
Given the increasing data volume or the number of classes, one has to instantaneously adjust the neural model capacity to obtain promising performance.
Existing methods either ignore the growing nature of data or seek to independently search an optimal architecture for a given dataset.
- Score: 63.30393509048505
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In real-world applications, data often come in a growing manner, where the
data volume and the number of classes may increase dynamically. This will bring
a critical challenge for learning: given the increasing data volume or the
number of classes, one has to instantaneously adjust the neural model capacity
to obtain promising performance. Existing methods either ignore the growing
nature of data or seek to independently search an optimal architecture for a
given dataset, and thus are incapable of promptly adjusting the architectures
for the changed data. To address this, we present a neural architecture
adaptation method, namely Adaptation eXpert (AdaXpert), to efficiently adjust
previous architectures on the growing data. Specifically, we introduce an
architecture adjuster to generate a suitable architecture for each data
snapshot, based on the previous architecture and the extent of the difference
between the current and previous data distributions. Furthermore, we propose an adaptation
condition to determine the necessity of adjustment, thereby avoiding
unnecessary and time-consuming adjustments. Extensive experiments on two growth
scenarios (increasing data volume and number of classes) demonstrate the
effectiveness of the proposed method.
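
The control flow described in the abstract (measure how far the new data snapshot has drifted, adjust the architecture only when an adaptation condition is met, and otherwise keep the previous architecture) can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation. The distance between feature means, the fixed `threshold`, the representation of an architecture as a list of layer widths, and the names `distribution_gap`, `adapt`, and `widen_adjuster` are all assumptions made for the example; the abstract does not specify any of them.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Arch = List[int]  # hypothetical encoding: one width per layer


def mean_vector(samples: Sequence[Vector]) -> List[float]:
    """Per-dimension mean of a non-empty list of feature vectors."""
    n = len(samples)
    dim = len(samples[0])
    return [sum(s[i] for s in samples) / n for i in range(dim)]


def distribution_gap(prev: Sequence[Vector], curr: Sequence[Vector]) -> float:
    """Stand-in measure of distribution difference (an assumption):
    Euclidean distance between the feature means of two snapshots."""
    mp, mc = mean_vector(prev), mean_vector(curr)
    return sum((a - b) ** 2 for a, b in zip(mp, mc)) ** 0.5


def widen_adjuster(arch: Arch, gap: float) -> Arch:
    """Hypothetical adjuster: widen every layer in proportion to the gap."""
    scale = 1.0 + gap
    return [max(w, round(w * scale)) for w in arch]


def adapt(
    prev_arch: Arch,
    prev_data: Sequence[Vector],
    curr_data: Sequence[Vector],
    adjuster: Callable[[Arch, float], Arch],
    threshold: float,
) -> Tuple[Arch, bool]:
    """Return (architecture, adjusted?) for the current data snapshot."""
    gap = distribution_gap(prev_data, curr_data)
    # Adaptation condition (assumed form): skip the costly adjustment
    # when the data has barely changed.
    if gap <= threshold:
        return prev_arch, False
    return adjuster(prev_arch, gap), True


if __name__ == "__main__":
    old = [[0.0, 0.0], [0.0, 0.0]]
    new = [[1.0, 0.0], [1.0, 0.0]]
    print(adapt([16, 32], old, old, widen_adjuster, threshold=0.5))
    print(adapt([16, 32], old, new, widen_adjuster, threshold=0.5))
```

In this toy setup the adjuster is a fixed rule; in the method summarized above it is a learned component that takes the previous architecture and the distribution difference as input.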
Related papers
- Growing Tiny Networks: Spotting Expressivity Bottlenecks and Fixing Them Optimally [2.645067871482715]
In machine learning tasks, one searches for an optimal function within a certain functional space.
This forces the evolution of the function during training to lie within the realm of what is expressible with the chosen architecture.
We show that the information about desirable architectural changes, due to expressivity bottlenecks, can be extracted from backpropagation.
arXiv Detail & Related papers (2024-05-30T08:23:56Z)
- Implicitly Guided Design with PropEn: Match your Data to Follow the Gradient [52.2669490431145]
PropEn is inspired by 'matching', which enables implicit guidance without training a discriminator.
We show that training with a matched dataset approximates the gradient of the property of interest while remaining within the data distribution.
arXiv Detail & Related papers (2024-05-28T11:30:19Z)
- Reusable Architecture Growth for Continual Stereo Matching [92.36221737921274]
We introduce a Reusable Architecture Growth (RAG) framework to learn new scenes continually in both supervised and self-supervised manners.
RAG can maintain high reusability during growth by reusing previous units while obtaining good performance.
We also present a Scene Router module to adaptively select the scene-specific architecture path at inference.
arXiv Detail & Related papers (2024-03-30T13:24:58Z)
- Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling [4.190836962132713]
This paper introduces Orchid, a novel architecture designed to address the quadratic complexity of traditional attention mechanisms.
At the core of this architecture lies a new data-dependent global convolution layer, which contextually adapts its conditioned kernel on input sequence.
We evaluate the proposed model across multiple domains, including language modeling and image classification, to highlight its performance and generality.
arXiv Detail & Related papers (2024-02-28T17:36:45Z)
- MSTAR: Multi-Scale Backbone Architecture Search for Timeseries Classification [0.41185655356953593]
We propose a novel multi-scale search space and a framework for neural architecture search (NAS).
We show that our model can serve as a backbone to employ a powerful Transformer module with both untrained and pre-trained weights.
Our search space reaches state-of-the-art performance on four datasets from four different domains.
arXiv Detail & Related papers (2024-02-21T13:59:55Z)
- Temporal Convolution Domain Adaptation Learning for Crops Growth Prediction [5.966652553573454]
We construct an innovative network architecture based on domain adaptation learning to predict crops growth curves with limited available crop data.
We are the first to use the temporal convolution filters as the backbone to construct a domain adaptation network architecture.
Results show that the proposed temporal convolution-based network architecture outperforms all benchmarks not only in accuracy but also in model size and convergence rate.
arXiv Detail & Related papers (2022-02-24T14:22:36Z)
- Deep invariant networks with differentiable augmentation layers [87.22033101185201]
Methods for learning data augmentation policies require held-out data and are based on bilevel optimization problems.
We show that our approach is easier and faster to train than modern automatic data augmentation techniques.
arXiv Detail & Related papers (2022-02-04T14:12:31Z)
- Data Scaling Laws in NMT: The Effect of Noise and Architecture [59.767899982937756]
We study the effect of varying the architecture and training data quality on the data scaling properties of Neural Machine Translation (NMT).
We find that the data scaling exponents are minimally impacted, suggesting that marginally worse architectures or training data can be compensated for by adding more data.
arXiv Detail & Related papers (2022-02-04T06:53:49Z)
- AutoAdapt: Automated Segmentation Network Search for Unsupervised Domain Adaptation [4.793219747021116]
We perform neural architecture search (NAS) to provide architecture-level perspective and analysis for domain adaptation.
We propose bridging this gap by using maximum mean discrepancy and regional weighted entropy to estimate the accuracy metric.
arXiv Detail & Related papers (2021-06-24T17:59:02Z)
- Rethinking Architecture Design for Tackling Data Heterogeneity in Federated Learning [53.73083199055093]
We show that attention-based architectures (e.g., Transformers) are fairly robust to distribution shifts.
Our experiments show that replacing convolutional networks with Transformers can greatly reduce catastrophic forgetting of previous devices.
arXiv Detail & Related papers (2021-06-10T21:04:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.