Multi-Symmetry Ensembles: Improving Diversity and Generalization via Opposing Symmetries
- URL: http://arxiv.org/abs/2303.02484v2
- Date: Mon, 19 Jun 2023 18:59:43 GMT
- Title: Multi-Symmetry Ensembles: Improving Diversity and Generalization via Opposing Symmetries
- Authors: Charlotte Loh, Seungwook Han, Shivchander Sudalairaj, Rumen Dangovski, Kai Xu, Florian Wenzel, Marin Soljacic, Akash Srivastava
- Abstract summary: We present Multi-Symmetry Ensembles (MSE), a framework for constructing diverse ensembles by capturing the multiplicity of hypotheses along symmetry axes.
MSE effectively captures the multiplicity of conflicting hypotheses that is often required in large, diverse datasets like ImageNet.
As a result of their inherent diversity, MSE improves classification performance, uncertainty quantification, and generalization across a series of transfer tasks.
- Score: 14.219011458423363
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep ensembles (DE) have been successful in improving model performance by
learning diverse members via the stochasticity of random initialization. While
recent works have attempted to promote further diversity in DE via
hyperparameters or regularizing loss functions, these methods primarily still
rely on a stochastic approach to explore the hypothesis space. In this work, we
present Multi-Symmetry Ensembles (MSE), a framework for constructing diverse
ensembles by capturing the multiplicity of hypotheses along symmetry axes,
which explore the hypothesis space beyond stochastic perturbations of model
weights and hyperparameters. We leverage recent advances in contrastive
representation learning to create models that separately capture opposing
hypotheses of invariant and equivariant functional classes and present a simple
ensembling approach to efficiently combine appropriate hypotheses for a given
task. We show that MSE effectively captures the multiplicity of conflicting
hypotheses that is often required in large, diverse datasets like ImageNet. As
a result of their inherent diversity, MSE improves classification performance,
uncertainty quantification, and generalization across a series of transfer
tasks.
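
A minimal sketch of the ensembling step described in the abstract, assuming pre-trained member networks; the class name and the plain prediction-averaging rule are illustrative assumptions, not the paper's exact procedure:

import torch
import torch.nn as nn

class MultiSymmetryEnsemble(nn.Module):
    # Combines members trained under opposing symmetry priors (e.g., a
    # rotation-invariant and a rotation-equivariant backbone) by averaging
    # their predictive distributions, the standard deep-ensemble rule.
    def __init__(self, members):
        super().__init__()
        self.members = nn.ModuleList(members)  # one symmetry hypothesis each

    @torch.no_grad()
    def forward(self, x):
        probs = [m(x).softmax(dim=-1) for m in self.members]
        return torch.stack(probs, dim=0).mean(dim=0)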
Related papers
- Stability of Primal-Dual Gradient Flow Dynamics for Multi-Block Convex Optimization Problems [2.66854711376491]
The proposed dynamics are based on the proximal augmented Lagrangian.
We leverage various structural properties to establish global (exponential) convergence guarantees.
Our assumptions are much weaker than those required to prove (exponential) stability of various primal-dual dynamics.
arXiv Detail & Related papers (2024-08-28T17:43:18Z)
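
For reference, a textbook primal-dual gradient flow for an augmented Lagrangian has the form below; the paper's proximal variant additionally handles nonsmooth blocks through proximal operators:

% Primal-dual gradient flow for min f(x) subject to Ax = b.
\begin{aligned}
L_\mu(x,\lambda) &= f(x) + \lambda^\top (Ax - b) + \tfrac{\mu}{2}\,\|Ax - b\|^2,\\
\dot{x} &= -\nabla_x L_\mu(x,\lambda), \qquad
\dot{\lambda} = \nabla_\lambda L_\mu(x,\lambda) = Ax - b.
\end{aligned}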
- Multivariate Stochastic Dominance via Optimal Transport and Applications to Models Benchmarking [21.23500484100963]
We introduce a statistic that assesses almost stochastic dominance under the framework of Optimal Transport with a smooth cost.
We also propose a hypothesis testing framework as well as an efficient implementation using the Sinkhorn algorithm.
We showcase our method in comparing and benchmarking Large Language Models that are evaluated on multiple metrics.
arXiv Detail & Related papers (2024-06-10T16:14:50Z)
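
A minimal sketch of the Sinkhorn iteration mentioned above (standard entropic-OT scaling; the paper's dominance statistic and hypothesis test build on such a solver and are not reproduced here):

import numpy as np

def sinkhorn_plan(a, b, C, eps=0.1, n_iters=1000):
    # Transport plan minimizing <P, C> - eps * H(P) with marginals a, b.
    K = np.exp(-C / eps)           # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)          # rescale to match column marginal b
        u = a / (K @ v)            # rescale to match row marginal a
    return u[:, None] * K * v[None, :]

# e.g., the entropic OT cost between two models' score samples:
# cost = (sinkhorn_plan(a, b, C) * C).sum()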
- Task Groupings Regularization: Data-Free Meta-Learning with Heterogeneous Pre-trained Models [83.02797560769285]
Data-Free Meta-Learning (DFML) aims to derive knowledge from a collection of pre-trained models without accessing their original data.
Current methods often overlook the heterogeneity among pre-trained models, which leads to performance degradation due to task conflicts.
We propose Task Groupings Regularization, a novel approach that benefits from model heterogeneity by grouping and aligning conflicting tasks.
arXiv Detail & Related papers (2024-05-26T13:11:55Z)
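
The summary above leaves the grouping criterion unspecified; gradient cosine similarity is one common proxy for task conflict, sketched below as an assumption for illustration, not necessarily the paper's rule:

import torch.nn.functional as F

def group_tasks(grads, threshold=0.0):
    # Greedily group tasks whose flattened loss gradients do not conflict
    # (cosine similarity >= threshold); conflicting tasks start new groups.
    groups = []
    for i, g in enumerate(grads):
        for group in groups:
            if all(F.cosine_similarity(g, grads[j], dim=0) >= threshold
                   for j in group):
                group.append(i)
                break
        else:
            groups.append([i])
    return groups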
- The Common Stability Mechanism behind most Self-Supervised Learning Approaches [64.40701218561921]
We provide a framework to explain the stability mechanism of different self-supervised learning techniques.
We discuss the working mechanism of contrastive techniques like SimCLR, non-contrastive techniques like BYOL, SWAV, SimSiam, Barlow Twins, and DINO.
We formulate different hypotheses and test them using the Imagenet100 dataset.
arXiv Detail & Related papers (2024-02-22T20:36:24Z)
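
Since SimCLR is named above, a compact version of its InfoNCE objective (simplified to cross-view negatives only; the full SimCLR loss also uses within-view negatives):

import torch
import torch.nn.functional as F

def info_nce(z1, z2, tau=0.1):
    # z1, z2: embeddings of two augmented views of the same batch.
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.T / tau                  # pairwise cosine similarities
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)    # positives on the diagonal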
- Tasks Makyth Models: Machine Learning Assisted Surrogates for Tipping Points [0.0]
We present a machine learning (ML)-assisted framework for detecting tipping points in the emergent behavior of complex systems.
We construct reduced-order models for the emergent dynamics at different scales.
We contrast the uses of the different models and the effort involved in learning them.
arXiv Detail & Related papers (2023-09-25T17:58:23Z)
- A Pareto-optimal compositional energy-based model for sampling and optimization of protein sequences [55.25331349436895]
Deep generative models have emerged as a popular machine learning-based approach for inverse problems in the life sciences.
These problems often require sampling new designs that satisfy multiple properties of interest in addition to learning the data distribution.
arXiv Detail & Related papers (2022-10-19T19:04:45Z)
- A Variational Inference Approach to Inverse Problems with Gamma Hyperpriors [60.489902135153415]
This paper introduces a variational iterative alternating scheme for hierarchical inverse problems with gamma hyperpriors.
The proposed variational inference approach yields accurate reconstruction, provides meaningful uncertainty quantification, and is easy to implement.
arXiv Detail & Related papers (2021-11-26T06:33:29Z)
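
The conditionally Gaussian hierarchy such schemes target has the generic form below (a standard formulation; the paper's exact priors and update rules may differ):

% The scheme alternates a quadratic solve for x with closed-form
% updates for the variances theta.
\begin{aligned}
y &= Ax + \varepsilon, & \varepsilon &\sim \mathcal{N}(0, \Gamma),\\
x_i \mid \theta_i &\sim \mathcal{N}(0, \theta_i), & \theta_i &\sim \mathrm{Gamma}(\alpha, \beta).
\end{aligned}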
- Trustworthy Multimodal Regression with Mixture of Normal-inverse Gamma Distributions [91.63716984911278]
We introduce a novel Mixture of Normal-Inverse Gamma distributions (MoNIG) algorithm, which efficiently estimates uncertainty in a principled way for adaptive integration of different modalities and produces trustworthy regression results.
Experimental results on both synthetic and different real-world data demonstrate the effectiveness and trustworthiness of our method on various multimodal regression tasks.
arXiv Detail & Related papers (2021-11-11T14:28:12Z)
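
A hedged sketch of fusing two Normal-Inverse-Gamma predictions in the spirit of the summary above; this is one plausible NIG summation rule, not necessarily the paper's exact operator:

def nig_fuse(p1, p2):
    # Each prediction is NIG(gamma, nu, alpha, beta) from one modality.
    g1, n1, a1, b1 = p1
    g2, n2, a2, b2 = p2
    n = n1 + n2
    g = (n1 * g1 + n2 * g2) / n            # evidence-weighted mean
    a = a1 + a2 + 0.5
    b = b1 + b2 + 0.5 * (n1 * (g1 - g) ** 2 + n2 * (g2 - g) ** 2)
    return g, n, a, b

# Predicted mean: g; aleatoric uncertainty: b / (a - 1);
# epistemic uncertainty: b / (n * (a - 1)).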
- Invariance-based Multi-Clustering of Latent Space Embeddings for Equivariant Learning [12.770012299379099]
We present an approach to disentangle equivariance feature maps in a Lie group manifold by enforcing deep, group-invariant learning.
Our experiments show that this model effectively learns to disentangle the invariant and equivariant representations with significant improvements in the learning rate.
arXiv Detail & Related papers (2021-07-25T03:27:47Z)
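
Since this entry, like the main paper, contrasts invariant and equivariant representations, a toy illustration of the two penalties; the encoder f, input action g, and embedding action rho are placeholders:

def invariance_loss(f, x, g):
    # Invariance: the embedding should not change when the input is
    # transformed by the group action g.
    return ((f(g(x)) - f(x)) ** 2).mean()

def equivariance_loss(f, x, g, rho):
    # Equivariance: the embedding should transform predictably, via the
    # matching action rho on the embedding space.
    return ((f(g(x)) - rho(f(x))) ** 2).mean()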
- Towards Multimodal Response Generation with Exemplar Augmentation and Curriculum Optimization [73.45742420178196]
We propose a novel multimodal response generation framework with exemplar augmentation and curriculum optimization.
Our model achieves significant improvements compared to strong baselines in terms of diversity and relevance.
arXiv Detail & Related papers (2020-04-26T16:29:06Z)
- Lifted Hybrid Variational Inference [31.441922284854893]
We investigate two approximate lifted variational approaches that are applicable to hybrid domains.
We demonstrate that the proposed variational methods are both scalable and can take advantage of approximate model symmetries.
We present a sufficient condition for the Bethe approximation to yield a non-trivial estimate over the marginal polytope.
arXiv Detail & Related papers (2020-01-08T22:29:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.