Multi-Symmetry Ensembles: Improving Diversity and Generalization via Opposing Symmetries
- URL: http://arxiv.org/abs/2303.02484v2
- Date: Mon, 19 Jun 2023 18:59:43 GMT
- Title: Multi-Symmetry Ensembles: Improving Diversity and Generalization via Opposing Symmetries
- Authors: Charlotte Loh, Seungwook Han, Shivchander Sudalairaj, Rumen Dangovski, Kai Xu, Florian Wenzel, Marin Soljacic, Akash Srivastava
- Abstract summary: We present Multi-Symmetry Ensembles (MSE), a framework for constructing diverse ensembles by capturing the multiplicity of hypotheses along symmetry axes.
MSE effectively captures the multiplicity of conflicting hypotheses that is often required in large, diverse datasets like ImageNet.
As a result of their inherent diversity, MSE improves classification performance, uncertainty quantification, and generalization across a series of transfer tasks.
- Score: 14.219011458423363
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep ensembles (DE) have been successful in improving model performance by
learning diverse members via the stochasticity of random initialization. While
recent works have attempted to promote further diversity in DE via
hyperparameters or regularizing loss functions, these methods primarily still
rely on a stochastic approach to explore the hypothesis space. In this work, we
present Multi-Symmetry Ensembles (MSE), a framework for constructing diverse
ensembles by capturing the multiplicity of hypotheses along symmetry axes,
which explore the hypothesis space beyond stochastic perturbations of model
weights and hyperparameters. We leverage recent advances in contrastive
representation learning to create models that separately capture opposing
hypotheses of invariant and equivariant functional classes and present a simple
ensembling approach to efficiently combine appropriate hypotheses for a given
task. We show that MSE effectively captures the multiplicity of conflicting
hypotheses that is often required in large, diverse datasets like ImageNet. As
a result of their inherent diversity, MSE improves classification performance,
uncertainty quantification, and generalization across a series of transfer
tasks.
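
A minimal sketch of the ensembling step described in the abstract, assuming pre-trained member networks; the class name and the plain prediction-averaging rule are illustrative assumptions, not the paper's exact procedure:

import torch
import torch.nn as nn

class MultiSymmetryEnsemble(nn.Module):
    # Combines members trained under opposing symmetry priors (e.g., a
    # rotation-invariant and a rotation-equivariant backbone) by averaging
    # their predictive distributions, the standard deep-ensemble rule.
    def __init__(self, members):
        super().__init__()
        self.members = nn.ModuleList(members)  # one symmetry hypothesis each

    @torch.no_grad()
    def forward(self, x):
        probs = [m(x).softmax(dim=-1) for m in self.members]
        return torch.stack(probs, dim=0).mean(dim=0)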
Related papers
- Stability of Primal-Dual Gradient Flow Dynamics for Multi-Block Convex Optimization Problems [2.66854711376491]
The proposed dynamics are based on the proximal augmented Lagrangian.
We leverage various structural properties to establish global (exponential) convergence guarantees.
Our assumptions are much weaker than those required to prove (exponential) stability of various primal-dual dynamics.
arXiv Detail & Related papers (2024-08-28T17:43:18Z)
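
For reference, a textbook primal-dual gradient flow for an augmented Lagrangian has the form below; the paper's proximal variant additionally handles nonsmooth blocks through proximal operators:

% Primal-dual gradient flow for min f(x) subject to Ax = b.
\begin{aligned}
L_\mu(x,\lambda) &= f(x) + \lambda^\top (Ax - b) + \tfrac{\mu}{2}\,\|Ax - b\|^2,\\
\dot{x} &= -\nabla_x L_\mu(x,\lambda), \qquad
\dot{\lambda} = \nabla_\lambda L_\mu(x,\lambda) = Ax - b.
\end{aligned}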
- Multivariate Stochastic Dominance via Optimal Transport and Applications to Models Benchmarking [21.23500484100963]
We introduce a statistic that assesses almost stochastic dominance under the framework of Optimal Transport with a smooth cost.
We also propose a hypothesis testing framework as well as an efficient implementation using the Sinkhorn algorithm.
We showcase our method in comparing and benchmarking Large Language Models that are evaluated on multiple metrics.
arXiv Detail & Related papers (2024-06-10T16:14:50Z)
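
A minimal sketch of the Sinkhorn iteration mentioned above (standard entropic-OT scaling; the paper's dominance statistic and hypothesis test build on such a solver and are not reproduced here):

import numpy as np

def sinkhorn_plan(a, b, C, eps=0.1, n_iters=1000):
    # Transport plan minimizing <P, C> - eps * H(P) with marginals a, b.
    K = np.exp(-C / eps)           # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)          # rescale to match column marginal b
        u = a / (K @ v)            # rescale to match row marginal a
    return u[:, None] * K * v[None, :]

# e.g., the entropic OT cost between two models' score samples:
# cost = (sinkhorn_plan(a, b, C) * C).sum()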
- Task Groupings Regularization: Data-Free Meta-Learning with Heterogeneous Pre-trained Models [83.02797560769285]
Data-Free Meta-Learning (DFML) aims to derive knowledge from a collection of pre-trained models without accessing their original data.
Current methods often overlook the heterogeneity among pre-trained models, which leads to performance degradation due to task conflicts.
We propose Task Groupings Regularization, a novel approach that benefits from model heterogeneity by grouping and aligning conflicting tasks.
arXiv Detail & Related papers (2024-05-26T13:11:55Z)
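
The summary above leaves the grouping criterion unspecified; gradient cosine similarity is one common proxy for task conflict, sketched below as an assumption for illustration, not necessarily the paper's rule:

import torch.nn.functional as F

def group_tasks(grads, threshold=0.0):
    # Greedily group tasks whose flattened loss gradients do not conflict
    # (cosine similarity >= threshold); conflicting tasks start new groups.
    groups = []
    for i, g in enumerate(grads):
        for group in groups:
            if all(F.cosine_similarity(g, grads[j], dim=0) >= threshold
                   for j in group):
                group.append(i)
                break
        else:
            groups.append([i])
    return groups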
- The Common Stability Mechanism behind most Self-Supervised Learning Approaches [64.40701218561921]
We provide a framework to explain the stability mechanism of different self-supervised learning techniques.
We discuss the working mechanism of contrastive techniques like SimCLR, non-contrastive techniques like BYOL, SWAV, SimSiam, Barlow Twins, and DINO.
We formulate different hypotheses and test them using the Imagenet100 dataset.
arXiv Detail & Related papers (2024-02-22T20:36:24Z)
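
Since SimCLR is named above, a compact version of its InfoNCE objective (simplified to cross-view negatives only; the full SimCLR loss also uses within-view negatives):

import torch
import torch.nn.functional as F

def info_nce(z1, z2, tau=0.1):
    # z1, z2: embeddings of two augmented views of the same batch.
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.T / tau                  # pairwise cosine similarities
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)    # positives on the diagonal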
- Tasks Makyth Models: Machine Learning Assisted Surrogates for Tipping Points [0.0]
We present a machine learning (ML)-assisted framework for detecting tipping points in the emergent behavior of complex systems.
We construct reduced-order models for the emergent dynamics at different scales.
We contrast the uses of the different models and the effort involved in learning them.
arXiv Detail & Related papers (2023-09-25T17:58:23Z)
- A Pareto-optimal compositional energy-based model for sampling and optimization of protein sequences [55.25331349436895]
Deep generative models have emerged as a popular machine learning-based approach for inverse problems in the life sciences.
These problems often require sampling new designs that satisfy multiple properties of interest in addition to learning the data distribution.
arXiv Detail & Related papers (2022-10-19T19:04:45Z)
- A Variational Inference Approach to Inverse Problems with Gamma Hyperpriors [60.489902135153415]
This paper introduces a variational iterative alternating scheme for hierarchical inverse problems with gamma hyperpriors.
The proposed variational inference approach yields accurate reconstruction, provides meaningful uncertainty quantification, and is easy to implement.
arXiv Detail & Related papers (2021-11-26T06:33:29Z)
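
The conditionally Gaussian hierarchy such schemes target has the generic form below (a standard formulation; the paper's exact priors and update rules may differ):

% The scheme alternates a quadratic solve for x with closed-form
% updates for the variances theta.
\begin{aligned}
y &= Ax + \varepsilon, & \varepsilon &\sim \mathcal{N}(0, \Gamma),\\
x_i \mid \theta_i &\sim \mathcal{N}(0, \theta_i), & \theta_i &\sim \mathrm{Gamma}(\alpha, \beta).
\end{aligned}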
- Trustworthy Multimodal Regression with Mixture of Normal-inverse Gamma Distributions [91.63716984911278]
We introduce a novel Mixture of Normal-Inverse Gamma distributions (MoNIG) algorithm, which efficiently estimates uncertainty in a principled way for adaptive integration of different modalities and produces trustworthy regression results.
Experimental results on both synthetic and different real-world data demonstrate the effectiveness and trustworthiness of our method on various multimodal regression tasks.
arXiv Detail & Related papers (2021-11-11T14:28:12Z)
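
A hedged sketch of fusing two Normal-Inverse-Gamma predictions in the spirit of the summary above; this is one plausible NIG summation rule, not necessarily the paper's exact operator:

def nig_fuse(p1, p2):
    # Each prediction is NIG(gamma, nu, alpha, beta) from one modality.
    g1, n1, a1, b1 = p1
    g2, n2, a2, b2 = p2
    n = n1 + n2
    g = (n1 * g1 + n2 * g2) / n            # evidence-weighted mean
    a = a1 + a2 + 0.5
    b = b1 + b2 + 0.5 * (n1 * (g1 - g) ** 2 + n2 * (g2 - g) ** 2)
    return g, n, a, b

# Predicted mean: g; aleatoric uncertainty: b / (a - 1);
# epistemic uncertainty: b / (n * (a - 1)).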
- Invariance-based Multi-Clustering of Latent Space Embeddings for Equivariant Learning [12.770012299379099]
We present an approach to disentangle equivariance feature maps in a Lie group manifold by enforcing deep, group-invariant learning.
Our experiments show that this model effectively learns to disentangle the invariant and equivariant representations with significant improvements in the learning rate.
arXiv Detail & Related papers (2021-07-25T03:27:47Z)
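
Since this entry, like the main paper, contrasts invariant and equivariant representations, a toy illustration of the two penalties; the encoder f, input action g, and embedding action rho are placeholders:

def invariance_loss(f, x, g):
    # Invariance: the embedding should not change when the input is
    # transformed by the group action g.
    return ((f(g(x)) - f(x)) ** 2).mean()

def equivariance_loss(f, x, g, rho):
    # Equivariance: the embedding should transform predictably, via the
    # matching action rho on the embedding space.
    return ((f(g(x)) - rho(f(x))) ** 2).mean()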
- Towards Multimodal Response Generation with Exemplar Augmentation and Curriculum Optimization [73.45742420178196]
We propose a novel multimodal response generation framework with exemplar augmentation and curriculum optimization.
Our model achieves significant improvements compared to strong baselines in terms of diversity and relevance.
arXiv Detail & Related papers (2020-04-26T16:29:06Z)
- Lifted Hybrid Variational Inference [31.441922284854893]
We investigate two approximate lifted variational approaches that are applicable to hybrid domains.
We demonstrate that the proposed variational methods are both scalable and can take advantage of approximate model symmetries.
We present a sufficient condition for the Bethe approximation to yield a non-trivial estimate over the marginal polytope.
arXiv Detail & Related papers (2020-01-08T22:29:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.