Bayesian Quadrature for Neural Ensemble Search
- URL: http://arxiv.org/abs/2303.08874v2
- Date: Fri, 17 Mar 2023 16:59:46 GMT
- Title: Bayesian Quadrature for Neural Ensemble Search
- Authors: Saad Hamid, Xingchen Wan, Martin Jørgensen, Binxin Ru, Michael
Osborne
- Abstract summary: Existing approaches struggle when the architecture likelihood surface has dispersed, narrow peaks.
By viewing ensembling as approximately marginalising over architectures we construct ensembles using the tools of Bayesian Quadrature.
We show empirically -- in terms of test likelihood, accuracy, and expected calibration error -- that our method outperforms state-of-the-art baselines.
- Score: 9.58527004004275
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ensembling can improve the performance of Neural Networks, but existing
approaches struggle when the architecture likelihood surface has dispersed,
narrow peaks. Furthermore, existing methods construct equally weighted
ensembles, and this is likely to be vulnerable to the failure modes of the
weaker architectures. By viewing ensembling as approximately marginalising over
architectures we construct ensembles using the tools of Bayesian Quadrature --
tools which are well suited to the exploration of likelihood surfaces with
dispersed, narrow peaks. Additionally, the resulting ensembles consist of
architectures weighted commensurate with their performance. We show empirically
-- in terms of test likelihood, accuracy, and expected calibration error --
that our method outperforms state-of-the-art baselines, and verify via ablation
studies that its components do so independently.
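The core idea above, weighting ensemble members commensurate with their performance rather than uniformly, can be illustrated with a minimal sketch. The weights below are hypothetical stand-ins for those a Bayesian Quadrature rule would produce, not the paper's actual quadrature scheme; `bq_ensemble_predict` is an illustrative name.

```python
import numpy as np

def bq_ensemble_predict(member_probs, weights):
    """Combine per-architecture class probabilities with non-uniform
    weights (stand-ins for weights a quadrature rule would assign)."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()          # normalise to a valid mixture
    member_probs = np.asarray(member_probs)    # shape: (n_members, n_classes)
    return weights @ member_probs              # performance-weighted average

# Three hypothetical architectures, two classes; the stronger member
# (higher architecture likelihood) receives the larger weight.
probs = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]
weights = [0.7, 0.2, 0.1]
print(bq_ensemble_predict(probs, weights))  # → [0.77 0.23]
```

An equally weighted ensemble would instead use `weights = [1/3, 1/3, 1/3]`, which is exactly the uniform averaging the abstract argues is vulnerable to weak members.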
Related papers
- Mechanistic Design and Scaling of Hybrid Architectures [114.3129802943915]
We identify and test new hybrid architectures constructed from a variety of computational primitives.
We experimentally validate the resulting architectures via an extensive compute-optimal and a new state-optimal scaling law analysis.
We find MAD synthetics to correlate with compute-optimal perplexity, enabling accurate evaluation of new architectures.
arXiv Detail & Related papers (2024-03-26T16:33:12Z)
- Unsupervised Graph Neural Architecture Search with Disentangled Self-supervision [51.88848982611515]
Unsupervised graph neural architecture search remains unexplored in the literature.
We propose a novel Disentangled Self-supervised Graph Neural Architecture Search model.
Our model is able to achieve state-of-the-art performance against several baseline methods in an unsupervised manner.
arXiv Detail & Related papers (2024-03-08T05:23:55Z)
- Pushing Boundaries: Mixup's Influence on Neural Collapse [3.6919724596215615]
Mixup is a data augmentation strategy that employs convex combinations of training instances and their respective labels to augment the robustness and calibration of deep neural networks.
This study investigates the last-layer activations of training data for deep networks subjected to mixup.
We show that mixup's last-layer activations predominantly converge to a distinctive configuration that differs from what one might expect.
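The convex-combination step that defines mixup can be sketched as follows; the Beta-distributed mixing coefficient is the standard mixup formulation, and the example inputs and one-hot labels are hypothetical.

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=1.0, rng=None):
    """Form a convex combination of two training examples and their
    one-hot labels, with lam ~ Beta(alpha, alpha) as in standard mixup."""
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)
    x = lam * np.asarray(x1, dtype=float) + (1 - lam) * np.asarray(x2, dtype=float)
    y = lam * np.asarray(y1, dtype=float) + (1 - lam) * np.asarray(y2, dtype=float)
    return x, y, lam

# Mix two toy examples with opposite one-hot labels.
rng = np.random.default_rng(seed=0)
x, y, lam = mixup([1.0, 0.0], [1, 0], [0.0, 1.0], [0, 1], rng=rng)
```

Because the label is mixed with the same coefficient as the input, the target remains a valid probability vector, which is what links mixup to the calibration effects studied in the paper.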
arXiv Detail & Related papers (2024-02-09T04:01:25Z)
- Revisiting Generative Adversarial Networks for Binary Semantic Segmentation on Imbalanced Datasets [20.538287907723713]
Anomalous crack region detection is a typical binary semantic segmentation task, which aims to detect pixels representing cracks on pavement surface images automatically by algorithms.
Existing deep learning-based methods have achieved outstanding results on specific public pavement datasets, but their performance deteriorates dramatically on imbalanced datasets.
We propose a deep learning framework based on conditional Generative Adversarial Networks (cGANs) for the anomalous crack region detection tasks at the pixel level.
arXiv Detail & Related papers (2024-02-03T19:24:40Z)
- Distributionally Robust Fair Principal Components via Geodesic Descents [16.440434996206623]
In consequential domains such as college admissions, healthcare, and credit approval, it is imperative to take into account emerging criteria such as the fairness and robustness of the learned projection.
We propose a distributionally robust optimization problem for principal component analysis which internalizes a fairness criterion in the objective function.
Our experimental results on real-world datasets show the merits of our proposed method over state-of-the-art baselines.
arXiv Detail & Related papers (2022-02-07T11:08:13Z)
- Disentangling Neural Architectures and Weights: A Case Study in Supervised Classification [8.976788958300766]
This work investigates the problem of disentangling the role of the neural structure and its edge weights.
We show that well-trained architectures may not need any link-specific fine-tuning of the weights.
We use a novel and computationally efficient method that translates the hard architecture-search problem into a feasible optimization problem.
arXiv Detail & Related papers (2020-09-11T11:22:22Z)
- Learning perturbation sets for robust machine learning [97.6757418136662]
We use a conditional generator that defines the perturbation set over a constrained region of the latent space.
We measure the quality of our learned perturbation sets both quantitatively and qualitatively.
We leverage our learned perturbation sets to train models which are empirically and certifiably robust to adversarial image corruptions and adversarial lighting variations.
arXiv Detail & Related papers (2020-07-16T16:39:54Z)
- Neural Ensemble Search for Uncertainty Estimation and Dataset Shift [67.57720300323928]
Ensembles of neural networks achieve superior performance compared to stand-alone networks in terms of accuracy, uncertainty calibration and robustness to dataset shift.
We propose two methods for automatically constructing ensembles with varying architectures.
We show that the resulting ensembles outperform deep ensembles not only in terms of accuracy but also uncertainty calibration and robustness to dataset shift.
arXiv Detail & Related papers (2020-06-15T17:38:15Z)
- A Semi-Supervised Assessor of Neural Architectures [157.76189339451565]
We employ an auto-encoder to discover meaningful representations of neural architectures.
A graph convolutional neural network is introduced to predict the performance of architectures.
arXiv Detail & Related papers (2020-05-14T09:02:33Z)
- Image Matching across Wide Baselines: From Paper to Practice [80.9424750998559]
We introduce a comprehensive benchmark for local features and robust estimation algorithms.
Our pipeline's modular structure allows easy integration, configuration, and combination of different methods.
We show that with proper settings, classical solutions may still outperform the perceived state of the art.
arXiv Detail & Related papers (2020-03-03T15:20:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.