Compositional Risk Minimization
- URL: http://arxiv.org/abs/2410.06303v2
- Date: Mon, 10 Feb 2025 23:16:02 GMT
- Title: Compositional Risk Minimization
- Authors: Divyat Mahajan, Mohammad Pezeshki, Charles Arnal, Ioannis Mitliagkas, Kartik Ahuja, Pascal Vincent,
- Abstract summary: We tackle a challenging form of distribution shift, termed compositional shift, where some attribute combinations are completely absent at training but present in the test distribution.
This shift tests the model's ability to generalize compositionally to novel attribute combinations in discriminative tasks.
We derive a simple alternative to empirical risk minimization termed compositional risk minimization.
- Score: 21.81773851522387
- License:
- Abstract: Compositional generalization is a crucial step towards developing data-efficient intelligent machines that generalize in human-like ways. In this work, we tackle a challenging form of distribution shift, termed compositional shift, where some attribute combinations are completely absent at training but present in the test distribution. This shift tests the model's ability to generalize compositionally to novel attribute combinations in discriminative tasks. We model the data with flexible additive energy distributions, where each energy term represents an attribute, and derive a simple alternative to empirical risk minimization termed compositional risk minimization (CRM). We first train an additive energy classifier to predict the multiple attributes and then adjust this classifier to tackle compositional shifts. We provide an extensive theoretical analysis of CRM, where we show that our proposal extrapolates to special affine hulls of seen attribute combinations. Empirical evaluations on benchmark datasets confirms the improved robustness of CRM compared to other methods from the literature designed to tackle various forms of subpopulation shifts.
Related papers
- Bridging the inference gap in Mutimodal Variational Autoencoders [6.246098300155483]
Multimodal Variational Autoencoders offer versatile and scalable methods for generating unobserved modalities from observed ones.
Recent models using mixturesof-experts aggregation suffer from theoretically grounded limitations that restrict their generation quality on complex datasets.
We propose a novel interpretable model able to learn both joint and conditional distributions without introducing mixture aggregation.
arXiv Detail & Related papers (2025-02-06T10:43:55Z) - MetaRM: Shifted Distributions Alignment via Meta-Learning [52.94381279744458]
Reinforcement Learning from Human Feedback (RLHF) in language model alignment is critically dependent on the capability of the reward model (RM)
We introduce MetaRM, a method leveraging meta-learning to align the RM with the shifted environment distribution.
Extensive experiments demonstrate that MetaRM significantly improves the RM's distinguishing ability in iterative RLHF optimization.
arXiv Detail & Related papers (2024-05-01T10:43:55Z) - Correlation-Decoupled Knowledge Distillation for Multimodal Sentiment Analysis with Incomplete Modalities [16.69453837626083]
We propose a Correlation-decoupled Knowledge Distillation (CorrKD) framework for the Multimodal Sentiment Analysis (MSA) task under uncertain missing modalities.
We present a sample-level contrastive distillation mechanism that transfers comprehensive knowledge containing cross-sample correlations to reconstruct missing semantics.
We design a response-disentangled consistency distillation strategy to optimize the sentiment decision boundaries of the student network.
arXiv Detail & Related papers (2024-04-25T09:35:09Z) - Coupled generator decomposition for fusion of electro- and magnetoencephalography data [1.7102695043811291]
Data fusion modeling can identify common features across diverse data sources while accounting for source-specific variability.
We introduce the concept of a textitcoupled generator decomposition and demonstrate how it generalizes sparse principal component analysis for data fusion.
arXiv Detail & Related papers (2024-03-02T12:09:16Z) - Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL [57.745700271150454]
We study the sample complexity of reinforcement learning in Mean-Field Games (MFGs) with model-based function approximation.
We introduce the Partial Model-Based Eluder Dimension (P-MBED), a more effective notion to characterize the model class complexity.
arXiv Detail & Related papers (2024-02-08T14:54:47Z) - Modeling the Q-Diversity in a Min-max Play Game for Robust Optimization [61.39201891894024]
Group distributionally robust optimization (group DRO) can minimize the worst-case loss over pre-defined groups.
We reformulate the group DRO framework by proposing Q-Diversity.
Characterized by an interactive training mode, Q-Diversity relaxes the group identification from annotation into direct parameterization.
arXiv Detail & Related papers (2023-05-20T07:02:27Z) - ER: Equivariance Regularizer for Knowledge Graph Completion [107.51609402963072]
We propose a new regularizer, namely, Equivariance Regularizer (ER)
ER can enhance the generalization ability of the model by employing the semantic equivariance between the head and tail entities.
The experimental results indicate a clear and substantial improvement over the state-of-the-art relation prediction methods.
arXiv Detail & Related papers (2022-06-24T08:18:05Z) - CAFE: Learning to Condense Dataset by Aligning Features [72.99394941348757]
We propose a novel scheme to Condense dataset by Aligning FEatures (CAFE)
At the heart of our approach is an effective strategy to align features from the real and synthetic data across various scales.
We validate the proposed CAFE across various datasets, and demonstrate that it generally outperforms the state of the art.
arXiv Detail & Related papers (2022-03-03T05:58:49Z) - Directly Training Joint Energy-Based Models for Conditional Synthesis
and Calibrated Prediction of Multi-Attribute Data [9.389098132764431]
We show that architectures for multi-attribute prediction can be reinterpreted as energy-based models.
We propose a simple extension which expands the capabilities of EBMs to generate accurate conditional samples.
We find our models are capable of both accurate, calibrated predictions and high-quality conditional synthesis of novel attribute combinations.
arXiv Detail & Related papers (2021-07-19T22:19:41Z) - Heterogeneous Risk Minimization [25.5458915855661]
Invariant learning methods for out-of-distribution generalization have been proposed by leveraging multiple training environments to find invariant relationships.
Modern datasets are assembled by merging data from multiple sources without explicit source labels.
We propose Heterogeneous Risk Minimization (HRM) framework to achieve joint learning of latent heterogeneity among the data and invariant relationship.
arXiv Detail & Related papers (2021-05-09T02:51:36Z) - Model-based Multi-agent Policy Optimization with Adaptive Opponent-wise
Rollouts [52.844741540236285]
This paper investigates the model-based methods in multi-agent reinforcement learning (MARL)
We propose a novel decentralized model-based MARL method, named Adaptive Opponent-wise Rollout Policy (AORPO)
arXiv Detail & Related papers (2021-05-07T16:20:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.