Double Oracle Neural Architecture Search for Game Theoretic Deep Learning Models
- URL: http://arxiv.org/abs/2410.04764v1
- Date: Mon, 7 Oct 2024 05:42:01 GMT
- Title: Double Oracle Neural Architecture Search for Game Theoretic Deep Learning Models
- Authors: Aye Phyu Phyu Aung, Xinrun Wang, Ruiyu Wang, Hau Chan, Bo An, Xiaoli Li, J. Senthilnath
- Abstract summary: We propose a new approach to train deep learning models using game theory concepts.
We deploy a double-oracle framework using best response oracles.
We show that all our variants have significant improvements in both subjective qualitative evaluation and quantitative metrics.
- Score: 28.238075755838487
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In this paper, we propose a new approach to train deep learning models using game theory concepts, including Generative Adversarial Networks (GANs) and Adversarial Training (AT), where we deploy a double-oracle framework using best response oracles. A GAN is essentially a two-player zero-sum game between the generator and the discriminator. The same concept can be applied to AT, with the attacker and classifier as players. Training these models is challenging because a pure Nash equilibrium may not exist, and even finding a mixed Nash equilibrium is difficult since the training algorithms for both GAN and AT have a large-scale strategy space. Extending our preliminary model DO-GAN, we propose methods to apply the double-oracle framework to Adversarial Neural Architecture Search (NAS for GAN) and Adversarial Training (NAS for AT) algorithms. We first generalize the players' strategies as the trained generator and discriminator models obtained from the best response oracles. We then compute the meta-strategies using a linear program. For scalability of the framework, where multiple network models of best responses are stored in memory, we prune weakly-dominated players' strategies to keep the oracles tractable. Finally, we conduct experiments on MNIST, CIFAR-10 and TinyImageNet for DONAS-GAN. We also evaluate the robustness under FGSM and PGD attacks on CIFAR-10, SVHN and TinyImageNet for DONAS-AT. We show that all our variants achieve significant improvements in both subjective qualitative evaluation and quantitative metrics, compared with their respective base architectures.
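The meta-game step the abstract describes can be sketched in a few lines: the rows and columns of a payoff matrix are the generator and discriminator checkpoints returned by the best-response oracles, the meta-strategy is the maximin solution of the induced zero-sum game (a standard linear program), and weakly-dominated rows are pruned to keep the strategy sets small. The following is a minimal illustration with a toy payoff matrix, not the paper's implementation:

```python
# Minimal sketch of the double-oracle meta-game step: LP meta-strategy plus
# pruning of weakly-dominated strategies. Payoff values are toy placeholders.
import numpy as np
from scipy.optimize import linprog

def meta_strategy(payoffs: np.ndarray):
    """Maximin mixed strategy for the row player of a zero-sum game."""
    m, n = payoffs.shape
    # Variables: [x_1, ..., x_m, v]; maximize v  <=>  minimize -v.
    c = np.zeros(m + 1)
    c[-1] = -1.0
    # For every column j:  v - sum_i x_i * U[i, j] <= 0.
    A_ub = np.hstack([-payoffs.T, np.ones((n, 1))])
    b_ub = np.zeros(n)
    # Probabilities sum to one; the game value v is unbounded.
    A_eq = np.hstack([np.ones((1, m)), np.zeros((1, 1))])
    b_eq = np.array([1.0])
    bounds = [(0.0, 1.0)] * m + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:m], res.x[-1]

def prune_weakly_dominated(payoffs: np.ndarray) -> list:
    """Indices of rows that survive removal of weakly-dominated strategies."""
    m = payoffs.shape[0]
    keep = []
    for i in range(m):
        dominated = any(
            k != i
            and np.all(payoffs[k] >= payoffs[i])
            and np.any(payoffs[k] > payoffs[i])
            for k in range(m)
        )
        if not dominated:
            keep.append(i)
    return keep

U = np.array([[0.2, 0.6],
              [0.5, 0.4],
              [0.1, 0.3]])            # toy 3x2 meta-game payoff matrix
keep = prune_weakly_dominated(U)      # row 2 is weakly dominated by row 0
sigma, value = meta_strategy(U[keep])
print(keep, sigma, value)             # [0, 1], sigma ~ [0.2, 0.8], value ~ 0.44
```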
Related papers
- GraphFM: A Comprehensive Benchmark for Graph Foundation Model [33.157367455390144]
Foundation Models (FMs) serve as a general class of models for developing artificial intelligence systems.
Despite extensive research into self-supervised learning as the cornerstone of FMs, several outstanding issues persist.
The extent of generalization capability on downstream tasks remains unclear.
It is unknown how effectively these models can scale to large datasets.
arXiv Detail & Related papers (2024-06-12T15:10:44Z)
- Towards Regression-Free Neural Networks for Diverse Compute Platforms [50.64489250972764]
We introduce REGression constrained Neural Architecture Search (REG-NAS) to design a family of highly accurate models that engender fewer negative flips.
REG-NAS introduces a novel architecture constraint that enables a larger model to contain all the weights of the smaller one, thus maximizing weight sharing.
We demonstrate that REG-NAS can successfully find desirable architectures with few negative flips in three popular architecture search spaces.
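A minimal sketch of the negative-flip metric this entry revolves around, assuming the standard definition (samples the reference model classifies correctly but the new model gets wrong); this is not the REG-NAS code:

```python
# Counting "negative flips": regressions from correct (old) to wrong (new).
import numpy as np

def negative_flip_rate(y_true, y_old, y_new):
    """Fraction of samples the old model gets right and the new model gets wrong."""
    y_true, y_old, y_new = map(np.asarray, (y_true, y_old, y_new))
    flips = (y_old == y_true) & (y_new != y_true)
    return flips.mean()

y_true = np.array([0, 1, 2, 1, 0])
y_old  = np.array([0, 1, 2, 0, 0])   # predictions of the smaller model
y_new  = np.array([0, 1, 1, 1, 0])   # predictions of the larger model
print(negative_flip_rate(y_true, y_old, y_new))  # 0.2: one sample regressed
```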
arXiv Detail & Related papers (2022-09-27T23:19:16Z)
- Two Heads are Better than One: Robust Learning Meets Multi-branch Models [14.72099568017039]
We propose Branch Orthogonality adveRsarial Training (BORT) to obtain state-of-the-art performance with solely the original dataset for adversarial training.
We evaluate our approach on CIFAR-10, CIFAR-100, and SVHN against ℓ∞ norm-bounded perturbations of size ε = 8/255.
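For reference, a minimal sketch of what an ℓ∞ perturbation of size ε = 8/255 means in practice, using the standard one-step FGSM attack (an assumption for illustration, not the BORT evaluation code):

```python
# One-step FGSM: x_adv = clip(x + eps * sign(grad_x loss)), so every pixel
# moves by at most eps, i.e. the perturbation is l_inf norm-bounded.
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=8 / 255):
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + eps * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()

# Dummy classifier on CIFAR-sized inputs, purely to make the sketch runnable.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
x = torch.rand(4, 3, 32, 32)          # a batch of images scaled to [0, 1]
y = torch.randint(0, 10, (4,))
x_adv = fgsm(model, x, y)
print((x_adv - x).abs().max())        # <= 8/255 by construction
```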
arXiv Detail & Related papers (2022-08-17T05:42:59Z)
- Unifying Language Learning Paradigms [96.35981503087567]
We present a unified framework for pre-training models that are universally effective across datasets and setups.
We show how different pre-training objectives can be cast as one another and how interpolating between different objectives can be effective.
Our model also achieves strong results in in-context learning, outperforming 175B GPT-3 on zero-shot SuperGLUE and tripling the performance of T5-XXL on one-shot summarization.
arXiv Detail & Related papers (2022-05-10T19:32:20Z)
- Self-Ensembling GAN for Cross-Domain Semantic Segmentation [107.27377745720243]
This paper proposes a self-ensembling generative adversarial network (SE-GAN) exploiting cross-domain data for semantic segmentation.
In SE-GAN, a teacher network and a student network constitute a self-ensembling model for generating semantic segmentation maps, which, together with a discriminator, forms a GAN.
Despite its simplicity, we find SE-GAN can significantly boost the performance of adversarial training and enhance the stability of the model.
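The snippet above does not spell out SE-GAN's teacher update; a common self-ensembling choice, assumed here purely for illustration, is to maintain the teacher as an exponential moving average (EMA) of the student's weights:

```python
# EMA-based self-ensembling (a common convention, assumed here): the teacher
# tracks a smoothed copy of the student's parameters.
import torch

@torch.no_grad()
def ema_update(teacher: torch.nn.Module, student: torch.nn.Module, decay=0.999):
    """teacher <- decay * teacher + (1 - decay) * student, parameter-wise."""
    for p_t, p_s in zip(teacher.parameters(), student.parameters()):
        p_t.mul_(decay).add_(p_s, alpha=1.0 - decay)

student = torch.nn.Linear(8, 2)
teacher = torch.nn.Linear(8, 2)
teacher.load_state_dict(student.state_dict())  # start with teacher = student
ema_update(teacher, student)                   # call once per training step
```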
arXiv Detail & Related papers (2021-12-15T09:50:25Z)
- AutoBERT-Zero: Evolving BERT Backbone from Scratch [94.89102524181986]
We propose an Operation-Priority Neural Architecture Search (OP-NAS) algorithm to automatically search for promising hybrid backbone architectures.
We optimize both the search algorithm and evaluation of candidate models to boost the efficiency of our proposed OP-NAS.
Experiments show that the searched architecture (named AutoBERT-Zero) significantly outperforms BERT and its variants of different model capacities in various downstream tasks.
arXiv Detail & Related papers (2021-07-15T16:46:01Z)
- BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search [100.28980854978768]
We present Block-wisely Self-supervised Neural Architecture Search (BossNAS).
We factorize the search space into blocks and utilize a novel self-supervised training scheme, named ensemble bootstrapping, to train each block separately.
We also present HyTra search space, a fabric-like hybrid CNN-transformer search space with searchable down-sampling positions.
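The details of ensemble bootstrapping are in the paper; the toy sketch below (all names and scores are placeholders) only illustrates the block-wise factorization itself: rating each block's candidates separately turns an exponential joint search into a sum of small per-block searches.

```python
# Block-wise search, schematically: candidates are scored per block, so the
# best architecture is assembled block by block rather than by enumerating
# the full 2 * 2 * 2 joint space. Scores are illustrative placeholders.
blocks = [
    ["conv3x3", "conv5x5"],          # candidate ops for block 1
    ["attention", "conv3x3"],        # candidate ops for block 2
    ["maxpool", "avgpool"],          # candidate ops for block 3
]
scores = [
    {"conv3x3": 0.7, "conv5x5": 0.9},
    {"attention": 0.8, "conv3x3": 0.6},
    {"maxpool": 0.5, "avgpool": 0.55},
]
best = [max(ops, key=scores[i].get) for i, ops in enumerate(blocks)]
print(best)  # ['conv5x5', 'attention', 'avgpool']
```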
arXiv Detail & Related papers (2021-03-23T10:05:58Z)
- DO-GAN: A Double Oracle Framework for Generative Adversarial Networks [28.904057977044374]
We propose a new approach to train Generative Adversarial Networks (GANs).
We deploy a double-oracle framework using the generator and discriminator oracles.
We apply our framework to established GAN architectures such as vanilla GAN, Deep Convolutional GAN, Spectral Normalization GAN and Stacked GAN.
arXiv Detail & Related papers (2021-02-17T05:11:18Z)
- Training Generative Adversarial Networks via stochastic Nash games [2.995087247817663]
Generative adversarial networks (GANs) are a class of generative models with two antagonistic neural networks: a generator and a discriminator.
We show convergence to an exact solution as the amount of available data increases.
We also show convergence of an averaged variant of the SRFB algorithm to a neighborhood of the solution when only a few samples are available.
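The SRFB update itself is specified in the paper; as a hedged illustration of why averaging the iterates helps in two-player zero-sum training, the toy below runs plain simultaneous gradient steps on the bilinear game min_x max_y x*y, where the last iterate spirals away from the Nash equilibrium at (0, 0) while the running average stays close to it.

```python
# Iterate averaging on the bilinear game min_x max_y x*y. Plain simultaneous
# gradient steps oscillate and slowly diverge; the running average of the
# iterates stays near the equilibrium (0, 0).
x, y = 1.0, 1.0
lr = 0.05
avg_x, avg_y = 0.0, 0.0
for k in range(1, 2001):
    gx, gy = y, -x                    # descent direction for x, ascent for y
    x, y = x - lr * gx, y - lr * gy
    avg_x += (x - avg_x) / k          # running averages of the iterates
    avg_y += (y - avg_y) / k
print(f"last iterate: ({x:.3f}, {y:.3f})  average: ({avg_x:.3f}, {avg_y:.3f})")
```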
arXiv Detail & Related papers (2020-10-17T09:07:40Z)
- BigNAS: Scaling Up Neural Architecture Search with Big Single-Stage Models [59.95091850331499]
We propose BigNAS, an approach that challenges the conventional wisdom that post-processing of the weights is necessary to get good prediction accuracies.
Our discovered model family, BigNASModels, achieves top-1 accuracies ranging from 76.5% to 80.9%.
arXiv Detail & Related papers (2020-03-24T23:00:49Z)
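As a rough illustration of the single-stage idea (assumed mechanics, not the BigNAS code): child networks reuse slices of the big model's trained weights directly, which is why no weight post-processing or retraining is needed once a child is selected.

```python
# A child layer runs on a slice of the big layer's weights; the slice is a
# view, so the child shares (rather than copies) the trained parameters.
import torch

big = torch.nn.Linear(64, 32)

def child_forward(x: torch.Tensor, in_dim: int, out_dim: int) -> torch.Tensor:
    """Run a smaller 'child' layer using a slice of the big layer's weights."""
    w = big.weight[:out_dim, :in_dim]   # shared weights, no copy
    b = big.bias[:out_dim]
    return torch.nn.functional.linear(x, w, b)

x = torch.randn(4, 48)
y = child_forward(x, in_dim=48, out_dim=16)   # a 48->16 child of the 64->32 layer
print(y.shape)  # torch.Size([4, 16])
```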