Mix and Mask Actor-Critic Methods
- URL: http://arxiv.org/abs/2106.13037v1
- Date: Thu, 24 Jun 2021 14:12:45 GMT
- Title: Mix and Mask Actor-Critic Methods
- Authors: Dom Huh
- Abstract summary: Shared feature spaces for actor-critic methods aim to capture generalized latent representations to be used by the policy and value function.
We present a novel feature-sharing framework to address these difficulties by introducing the mix and mask mechanisms and the distributional scalarization technique.
From our experimental results, we demonstrate significant performance improvements compared to alternative methods using separate networks and networks with a shared backbone.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Shared feature spaces for actor-critic methods aim to capture generalized latent representations to be used by the policy and value function, in the hope of more stable and sample-efficient optimization. However, such a paradigm presents a number of challenges in practice, as the parameters generating a shared representation must learn from two distinct objectives, resulting in competing updates and learning perturbations. In this paper, we present a novel feature-sharing framework that addresses these difficulties by introducing the mix and mask mechanisms and the distributional scalarization technique. These mechanisms behave dynamically, variably coupling and decoupling connected latent features between the policy and value function, while distributional scalarization standardizes the two objectives from a probabilistic standpoint. Our experimental results demonstrate significant performance improvements compared to alternative methods using separate networks and networks with a shared backbone.
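The abstract describes the mechanisms only at a high level, so the following is a minimal, hypothetical PyTorch sketch of one way to read them: a shared encoder whose latent features are gated by learnable per-head masks, so individual features can be coupled (visible to both the policy and value heads) or decoupled (routed to only one) as the mask logits move during training. The module names, the sigmoid gating, and the crude loss standardization below are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: the mask parameterization, gating, and the loss
# standardization are assumptions, not the paper's actual method.
import torch
import torch.nn as nn


class MixMaskActorCritic(nn.Module):
    def __init__(self, obs_dim: int, act_dim: int, latent_dim: int = 64):
        super().__init__()
        # Shared encoder: its parameters receive gradients from both objectives.
        self.encoder = nn.Sequential(
            nn.Linear(obs_dim, latent_dim), nn.Tanh(),
            nn.Linear(latent_dim, latent_dim), nn.Tanh(),
        )
        # Learnable per-feature mask logits, one set per head. A sigmoid turns
        # them into soft gates in [0, 1]; a feature gated on for both heads is
        # "coupled", while one gated on for a single head is "decoupled".
        self.policy_mask_logits = nn.Parameter(torch.zeros(latent_dim))
        self.value_mask_logits = nn.Parameter(torch.zeros(latent_dim))
        self.policy_head = nn.Linear(latent_dim, act_dim)
        self.value_head = nn.Linear(latent_dim, 1)

    def forward(self, obs: torch.Tensor):
        z = self.encoder(obs)
        pi_feat = z * torch.sigmoid(self.policy_mask_logits)  # masked view for the actor
        v_feat = z * torch.sigmoid(self.value_mask_logits)    # masked view for the critic
        return self.policy_head(pi_feat), self.value_head(v_feat)


def standardized_sum(policy_loss: torch.Tensor, value_loss: torch.Tensor, eps: float = 1e-8):
    # Crude stand-in for "distributional scalarization": rescale the two
    # objectives to a comparable magnitude before summing. The paper's
    # probabilistic standardization is not reproduced here.
    return (policy_loss / (policy_loss.detach().abs() + eps)
            + value_loss / (value_loss.detach().abs() + eps))
```

In this reading, the masks route gradient traffic from the two objectives through overlapping or disjoint subsets of the shared latent space, which is one way to soften the competing-update problem the abstract describes.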
Related papers
- Kaleidoscope: Learnable Masks for Heterogeneous Multi-agent Reinforcement Learning [14.01772209044574]
We introduce Kaleidoscope, a novel adaptive partial parameter-sharing scheme.
It promotes diversity among policy networks by encouraging discrepancy among its learnable masks, without sacrificing the efficiency of parameter sharing (a toy sketch of the mask idea follows this entry).
We extend Kaleidoscope to critic ensembles in the context of actor-critic algorithms, which could help improve value estimations.
arXiv Detail & Related papers (2024-10-11T05:22:54Z)
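Under stated assumptions, here is a toy sketch of the mask-discrepancy idea from the Kaleidoscope entry above: each agent specializes a shared weight tensor with its own learnable soft mask, and a regularizer rewards pairwise distance between masks so policies stay diverse. The class name, sigmoid masks, and L1 penalty are hypothetical, not Kaleidoscope's actual scheme.

```python
# Illustrative only; Kaleidoscope's real masking and regularization may differ.
import torch
import torch.nn as nn


class MaskedSharedLayer(nn.Module):
    """One shared weight matrix, specialized per agent by a learnable mask."""

    def __init__(self, n_agents: int, in_dim: int, out_dim: int):
        super().__init__()
        self.shared_weight = nn.Parameter(torch.randn(out_dim, in_dim) * 0.01)
        # One mask-logit tensor per agent, the same shape as the shared weight.
        self.mask_logits = nn.Parameter(torch.zeros(n_agents, out_dim, in_dim))

    def forward(self, x: torch.Tensor, agent_id: int) -> torch.Tensor:
        mask = torch.sigmoid(self.mask_logits[agent_id])  # soft gate in [0, 1]
        return x @ (self.shared_weight * mask).T

    def diversity_penalty(self) -> torch.Tensor:
        # Encourage discrepancy among agents' masks: the negative mean pairwise
        # L1 distance, added to the loss so that minimizing it spreads the masks.
        masks = torch.sigmoid(self.mask_logits).flatten(1)  # (n_agents, out*in)
        dists = torch.cdist(masks, masks, p=1)
        n = masks.shape[0]
        return -dists.sum() / (n * (n - 1))  # assumes n_agents >= 2
```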
- CHASE: Learning Convex Hull Adaptive Shift for Skeleton-based Multi-Entity Action Recognition [10.045163723630159]
CHASE operates as a sample-adaptive normalization method to mitigate inter-entity distribution discrepancies.
Our approach seamlessly adapts to single-entity backbones and boosts their performance in multi-entity scenarios.
arXiv Detail & Related papers (2024-10-09T17:55:43Z)
- Interpretable Target-Feature Aggregation for Multi-Task Learning based on Bias-Variance Analysis [53.38518232934096]
Multi-task learning (MTL) is a powerful machine learning paradigm designed to leverage shared knowledge across tasks to improve generalization and performance.
We propose an MTL approach at the intersection between task clustering and feature transformation based on a two-phase iterative aggregation of targets and features.
In both phases, a key aspect is preserving the interpretability of the reduced targets and features through aggregation with the mean, which is motivated by applications to Earth science.
arXiv Detail & Related papers (2024-06-12T08:30:16Z)
- Unifying Feature and Cost Aggregation with Transformers for Semantic and Visual Correspondence [51.54175067684008]
This paper introduces a Transformer-based integrative feature and cost aggregation network designed for dense matching tasks.
We first show that feature aggregation and cost aggregation exhibit distinct characteristics and reveal the potential for substantial benefits stemming from the judicious use of both aggregation processes.
Our framework is evaluated on standard benchmarks for semantic matching, and also applied to geometric matching, where we show that our approach achieves significant improvements compared to existing methods.
arXiv Detail & Related papers (2024-03-17T07:02:55Z)
- Implicit Variational Inference for High-Dimensional Posteriors [7.924706533725115]
In variational inference, the benefits of Bayesian models rely on accurately capturing the true posterior distribution.
We propose using neural samplers that specify implicit distributions, which are well-suited for approximating complex multimodal and correlated posteriors.
Our approach introduces novel bounds for approximate inference using implicit distributions by locally linearising the neural sampler.
arXiv Detail & Related papers (2023-10-10T14:06:56Z)
- Leveraging Diffusion Disentangled Representations to Mitigate Shortcuts in Underspecified Visual Tasks [92.32670915472099]
We propose an ensemble diversification framework exploiting the generation of synthetic counterfactuals using Diffusion Probabilistic Models (DPMs).
We show that diffusion-guided diversification can lead models to avert attention from shortcut cues, achieving ensemble diversity performance comparable to previous methods requiring additional data collection.
arXiv Detail & Related papers (2023-10-03T17:37:52Z)
- Rethinking Clustering-Based Pseudo-Labeling for Unsupervised Meta-Learning [146.11600461034746]
CACTUs, a method for unsupervised meta-learning, is a clustering-based approach with pseudo-labeling.
This approach is model-agnostic and can be combined with supervised algorithms to learn from unlabeled data.
We prove that the core reason for its limitations is the lack of a clustering-friendly property in the embedding space.
arXiv Detail & Related papers (2022-09-27T19:04:36Z)
- Multi-Modal Mutual Information Maximization: A Novel Approach for Unsupervised Deep Cross-Modal Hashing [73.29587731448345]
We propose a novel method, dubbed Cross-Modal Info-Max Hashing (CMIMH).
We learn informative representations that can preserve both intra- and inter-modal similarities.
The proposed method consistently outperforms other state-of-the-art cross-modal retrieval methods.
arXiv Detail & Related papers (2021-12-13T08:58:03Z)
- Mixing Consistent Deep Clustering [3.5786621294068373]
Good latent representations produce semantically mixed outputs when decoding linear interpolations of two latent representations (a sketch of the mixing idea follows this entry).
We propose the Mixing Consistent Deep Clustering method, which encourages such mixed representations to appear realistic.
We show that the proposed method can be added to existing autoencoders to further improve clustering performance.
arXiv Detail & Related papers (2020-11-03T19:47:06Z)
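One plausible reading of the mixing idea in the entry above, with the caveat that the re-encoding form of the loss is an assumption and not necessarily the paper's exact objective: decode a linear interpolation of two latent codes, re-encode the result, and penalize its distance from the interpolated code so that interpolations stay consistent.

```python
# Illustrative sketch; the paper's exact mixing-consistency loss may differ.
import torch
import torch.nn.functional as F


def mixing_consistency_loss(encoder, decoder, x1, x2):
    """Encourage decoded linear interpolations of latents to be self-consistent."""
    z1, z2 = encoder(x1), encoder(x2)
    alpha = torch.rand(z1.shape[0], 1, device=z1.device)  # per-sample mix ratio
    z_mix = alpha * z1 + (1.0 - alpha) * z2               # linear interpolation
    x_mix = decoder(z_mix)                                # decoded mixture
    # Consistency: re-encoding the decoded mixture should recover z_mix.
    return F.mse_loss(encoder(x_mix), z_mix)
```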
- Learning Diverse Representations for Fast Adaptation to Distribution Shift [78.83747601814669]
We present a method for learning multiple models, incorporating an objective that pressures each to learn a distinct way to solve the task (a generic sketch follows this entry).
We demonstrate our framework's ability to facilitate rapid adaptation to distribution shift.
arXiv Detail & Related papers (2020-06-12T12:23:50Z)
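A generic sketch of the diversity pressure described in the entry above, under assumptions: every model in an ensemble fits the task while a repulsion term pushes their predictions apart on auxiliary inputs, so each learns a distinct solution. The regression loss and the specific repulsion term are illustrative, not the paper's objective.

```python
# Illustrative only; the paper's diversity objective may differ.
import torch
import torch.nn.functional as F


def diverse_ensemble_loss(models, x_task, y_task, x_aux, beta: float = 0.1):
    """Fit each model to the task while pushing predictions apart on x_aux."""
    task_loss = sum(F.mse_loss(m(x_task), y_task) for m in models)
    preds = torch.stack([m(x_aux) for m in models]).flatten(1)  # (n_models, B*out)
    # Repulsion: negative mean pairwise distance between model predictions.
    dists = torch.cdist(preds, preds)
    n = len(models)
    repulsion = -dists.sum() / (n * (n - 1))  # assumes at least two models
    return task_loss + beta * repulsion
```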
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.