Multi-Symmetry Ensembles: Improving Diversity and Generalization via Opposing Symmetries
- URL: http://arxiv.org/abs/2303.02484v2
- Date: Mon, 19 Jun 2023 18:59:43 GMT
- Title: Multi-Symmetry Ensembles: Improving Diversity and Generalization via Opposing Symmetries
- Authors: Charlotte Loh, Seungwook Han, Shivchander Sudalairaj, Rumen Dangovski,
Kai Xu, Florian Wenzel, Marin Soljacic, Akash Srivastava
- Abstract summary: We present Multi-Symmetry Ensembles (MSE), a framework for constructing diverse ensembles by capturing the multiplicity of hypotheses along symmetry axes.
MSE effectively captures the multiplicity of conflicting hypotheses that is often required in large, diverse datasets like ImageNet.
As a result of their inherent diversity, MSE improves classification performance, uncertainty quantification, and generalization across a series of transfer tasks.
- Score: 14.219011458423363
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep ensembles (DE) have been successful in improving model performance by
learning diverse members via the stochasticity of random initialization. While
recent works have attempted to promote further diversity in DE via
hyperparameters or regularizing loss functions, these methods primarily still
rely on a stochastic approach to explore the hypothesis space. In this work, we
present Multi-Symmetry Ensembles (MSE), a framework for constructing diverse
ensembles by capturing the multiplicity of hypotheses along symmetry axes,
which explore the hypothesis space beyond stochastic perturbations of model
weights and hyperparameters. We leverage recent advances in contrastive
representation learning to create models that separately capture opposing
hypotheses of invariant and equivariant functional classes and present a simple
ensembling approach to efficiently combine appropriate hypotheses for a given
task. We show that MSE effectively captures the multiplicity of conflicting
hypotheses that is often required in large, diverse datasets like ImageNet. As
a result of their inherent diversity, MSE improves classification performance,
uncertainty quantification, and generalization across a series of transfer
tasks.
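The "simple ensembling approach" described in the abstract can be illustrated as averaging the predictive distributions of members trained under opposing symmetry priors. A minimal sketch, not the authors' code; the member names and logit values are hypothetical:

```python
import math

def softmax(logits):
    """Convert a list of logits to a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def ensemble_predict(members_logits):
    """Average the softmax probabilities of ensemble members.

    members_logits: list of per-member logit lists for one input,
    e.g. one invariant and one equivariant model trained separately.
    """
    probs = [softmax(l) for l in members_logits]
    n = len(probs)
    return [sum(p[c] for p in probs) / n for c in range(len(probs[0]))]

# Two hypothetical members with opposing symmetry priors:
invariant_logits = [2.0, 0.5, -1.0]
equivariant_logits = [0.2, 1.5, -0.5]
pred = ensemble_predict([invariant_logits, equivariant_logits])
```

Averaging probabilities (rather than raw logits) is the standard deep-ensemble combination rule; it is what makes a disagreement between the invariant and equivariant hypotheses show up as a flatter, higher-uncertainty predictive distribution.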
Related papers
- Preconditioned Inexact Stochastic ADMM for Deep Model [35.37705488695026]
This paper develops an algorithm, PISA, which enables scalable parallel computing and supports various second-moment schemes.
Grounded in rigorous theoretical guarantees, the algorithm converges under the sole assumption of Lipschitz continuity of the gradient.
Comprehensive experimental evaluations for fine-tuning diverse foundation models (FMs), including vision models, large language models, reinforcement learning models, generative adversarial networks, and recurrent neural networks, demonstrate its superior numerical performance compared to various state-of-the-art methods.
arXiv Detail & Related papers (2025-02-15T12:28:51Z)
- Symmetries-enhanced Multi-Agent Reinforcement Learning [25.383183391244373]
Multi-agent reinforcement learning has emerged as a powerful framework for enabling agents to learn complex, coordinated behaviors.
Recent advancements have sought to alleviate those issues by embedding intrinsic symmetries of the systems in the policy.
This paper presents a novel framework for embedding extrinsic symmetries in multi-agent system dynamics.
arXiv Detail & Related papers (2025-01-02T08:41:31Z)
- Stability of Primal-Dual Gradient Flow Dynamics for Multi-Block Convex Optimization Problems [2.66854711376491]
The proposed dynamics are based on the proximal augmented Lagrangian.
We leverage various structural properties to establish global (exponential) convergence guarantees.
Our assumptions are much weaker than those required to prove (exponential) stability of various primal-dual dynamics.
arXiv Detail & Related papers (2024-08-28T17:43:18Z)
- Multivariate Stochastic Dominance via Optimal Transport and Applications to Models Benchmarking [21.23500484100963]
We introduce a statistic that assesses almost stochastic dominance under the framework of Optimal Transport with a smooth cost.
We also propose a hypothesis testing framework as well as an efficient implementation using the Sinkhorn algorithm.
We showcase our method in comparing and benchmarking Large Language Models that are evaluated on multiple metrics.
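The Sinkhorn algorithm mentioned above can be sketched as alternating marginal-scaling updates for entropic-regularized optimal transport. This is a generic textbook sketch, not the paper's implementation; the marginals, cost matrix, and regularization value are illustrative:

```python
import math

def sinkhorn(a, b, C, reg=0.1, n_iters=500):
    """Entropic-regularized optimal transport via Sinkhorn iterations.

    a, b: source/target marginal probability vectors
    C: cost matrix as a list of rows, shape (len(a), len(b))
    Returns the transport plan P with row sums ~a and column sums ~b.
    """
    n, m = len(a), len(b)
    # Gibbs kernel: elementwise exp(-cost / regularization)
    K = [[math.exp(-C[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale to match the column and row marginals
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

# Uniform marginals over two points; moving mass across costs 1, staying costs 0.
P = sinkhorn([0.5, 0.5], [0.5, 0.5], [[0.0, 1.0], [1.0, 0.0]])
```

The entropic regularization `reg` trades off fidelity to the unregularized optimal transport plan against convergence speed; as `reg` shrinks, the plan concentrates on the low-cost (here diagonal) entries.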
arXiv Detail & Related papers (2024-06-10T16:14:50Z)
- Task Groupings Regularization: Data-Free Meta-Learning with Heterogeneous Pre-trained Models [83.02797560769285]
Data-Free Meta-Learning (DFML) aims to derive knowledge from a collection of pre-trained models without accessing their original data.
Current methods often overlook the heterogeneity among pre-trained models, which leads to performance degradation due to task conflicts.
arXiv Detail & Related papers (2024-05-26T13:11:55Z)
- The Common Stability Mechanism behind most Self-Supervised Learning Approaches [64.40701218561921]
We provide a framework to explain the stability mechanism of different self-supervised learning techniques.
We discuss the working mechanism of contrastive techniques like SimCLR, non-contrastive techniques like BYOL, SWAV, SimSiam, Barlow Twins, and DINO.
We formulate different hypotheses and test them using the ImageNet100 dataset.
arXiv Detail & Related papers (2024-02-22T20:36:24Z)
- A Pareto-optimal compositional energy-based model for sampling and optimization of protein sequences [55.25331349436895]
Deep generative models have emerged as a popular machine learning-based approach for inverse problems in the life sciences.
These problems often require sampling new designs that satisfy multiple properties of interest in addition to learning the data distribution.
arXiv Detail & Related papers (2022-10-19T19:04:45Z)
- A Variational Inference Approach to Inverse Problems with Gamma Hyperpriors [60.489902135153415]
This paper introduces a variational iterative alternating scheme for hierarchical inverse problems with gamma hyperpriors.
The proposed variational inference approach yields accurate reconstruction, provides meaningful uncertainty quantification, and is easy to implement.
arXiv Detail & Related papers (2021-11-26T06:33:29Z)
- Trustworthy Multimodal Regression with Mixture of Normal-inverse Gamma Distributions [91.63716984911278]
We introduce a novel Mixture of Normal-Inverse Gamma distributions (MoNIG) algorithm, which efficiently estimates uncertainty in principle for adaptive integration of different modalities and produces a trustworthy regression result.
Experimental results on both synthetic and different real-world data demonstrate the effectiveness and trustworthiness of our method on various multimodal regression tasks.
arXiv Detail & Related papers (2021-11-11T14:28:12Z)
- Invariance-based Multi-Clustering of Latent Space Embeddings for Equivariant Learning [12.770012299379099]
We present an approach to disentangle equivariance feature maps in a Lie group manifold by enforcing deep, group-invariant learning.
Our experiments show that this model effectively learns to disentangle the invariant and equivariant representations with significant improvements in the learning rate.
arXiv Detail & Related papers (2021-07-25T03:27:47Z)
- Towards Multimodal Response Generation with Exemplar Augmentation and Curriculum Optimization [73.45742420178196]
We propose a novel multimodal response generation framework with exemplar augmentation and curriculum optimization.
Our model achieves significant improvements compared to strong baselines in terms of diversity and relevance.
arXiv Detail & Related papers (2020-04-26T16:29:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.