MaDi: Learning to Mask Distractions for Generalization in Visual Deep
Reinforcement Learning
- URL: http://arxiv.org/abs/2312.15339v1
- Date: Sat, 23 Dec 2023 20:11:05 GMT
- Title: MaDi: Learning to Mask Distractions for Generalization in Visual Deep
Reinforcement Learning
- Authors: Bram Grooten, Tristan Tomilin, Gautham Vasan, Matthew E. Taylor, A.
Rupam Mahmood, Meng Fang, Mykola Pechenizkiy, Decebal Constantin Mocanu
- Abstract summary: We introduce MaDi, a novel algorithm that learns to mask distractions using only the reward signal.
In MaDi, the conventional actor-critic structure of deep reinforcement learning agents is complemented by a small third sibling, the Masker.
Our algorithm improves the agent's focus with useful masks, while its efficient Masker network only adds 0.2% more parameters to the original structure.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The visual world provides an abundance of information, but many input pixels
received by agents often contain distracting stimuli. Autonomous agents need
the ability to distinguish useful information from task-irrelevant perceptions,
enabling them to generalize to unseen environments with new distractions.
Existing works approach this problem using data augmentation or large auxiliary
networks with additional loss functions. We introduce MaDi, a novel algorithm
that learns to mask distractions using only the reward signal. In MaDi, the
conventional actor-critic structure of deep reinforcement learning agents is
complemented by a small third sibling, the Masker. This lightweight neural
network generates a mask to determine what the actor and critic will receive,
such that they can focus on learning the task. The masks are created
dynamically, depending on the current input. We run experiments on the DeepMind
Control Generalization Benchmark, the Distracting Control Suite, and a real UR5
Robotic Arm. Our algorithm improves the agent's focus with useful masks, while
its efficient Masker network only adds 0.2% more parameters to the original
structure, in contrast to the large auxiliary networks of previous work. MaDi
consistently achieves generalization results better than or competitive with
state-of-the-art methods.
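The masking mechanism described above can be sketched in a few lines. The following is a minimal, hypothetical PyTorch illustration of the idea, not the paper's exact architecture: layer sizes, activations, and the `Masker` class name are assumptions for illustration. A small convolutional network predicts a per-pixel soft mask in [0, 1], and the masked observation is what the actor and critic would receive.

```python
import torch
import torch.nn as nn

class Masker(nn.Module):
    """Lightweight network predicting a soft mask over input pixels.

    Hypothetical sketch of MaDi's third 'sibling' network; the exact
    architecture in the paper may differ.
    """

    def __init__(self, in_channels: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(8, 1, kernel_size=3, padding=1),
            nn.Sigmoid(),  # mask values in [0, 1]
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        mask = self.net(obs)  # shape (B, 1, H, W): one value per pixel
        return obs * mask     # masked observation, fed to actor and critic

masker = Masker()
obs = torch.rand(4, 3, 84, 84)  # a batch of RGB frames
masked_obs = masker(obs)
```

Because the Masker is trained end-to-end through the actor and critic losses, no auxiliary objective is needed: pixels whose suppression hurts the reward-driven losses keep mask values near 1, while distracting pixels can be driven toward 0.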
Related papers
- ColorMAE: Exploring data-independent masking strategies in Masked AutoEncoders [53.3185750528969]
Masked AutoEncoders (MAE) have emerged as a robust self-supervised framework.
We introduce a data-independent method, termed ColorMAE, which generates different binary mask patterns by filtering random noise.
We demonstrate our strategy's superiority in downstream tasks compared to random masking.
arXiv Detail & Related papers (2024-07-17T22:04:00Z)
- Downstream Task Guided Masking Learning in Masked Autoencoders Using Multi-Level Optimization [42.82742477950748]
Masked Autoencoder (MAE) is a notable method for self-supervised pretraining in visual representation learning.
We introduce the Multi-level Optimized Mask Autoencoder (MLO-MAE), a novel framework that learns an optimal masking strategy during pretraining.
Our experimental findings highlight MLO-MAE's significant advancements in visual representation learning.
arXiv Detail & Related papers (2024-02-28T07:37:26Z)
- Masking Improves Contrastive Self-Supervised Learning for ConvNets, and Saliency Tells You Where [63.61248884015162]
We aim to alleviate the burden of incorporating the masking operation into the contrastive learning framework for convolutional neural networks.
We propose to explicitly incorporate a saliency constraint so that masked regions are more evenly distributed between the foreground and background.
arXiv Detail & Related papers (2023-09-22T09:58:38Z)
- Improving Masked Autoencoders by Learning Where to Mask [65.89510231743692]
Masked image modeling is a promising self-supervised learning method for visual data.
We present AutoMAE, a framework that uses Gumbel-Softmax to interlink an adversarially-trained mask generator and a mask-guided image modeling process.
In our experiments, AutoMAE is shown to provide effective pretraining models on standard self-supervised benchmarks and downstream tasks.
arXiv Detail & Related papers (2023-03-12T05:28:55Z)
- Masked Autoencoders are Robust Data Augmentors [90.34825840657774]
Regularization techniques like image augmentation are necessary for deep neural networks to generalize well.
We propose a novel perspective of augmentation to regularize the training process.
We show that utilizing such model-based nonlinear transformation as data augmentation can improve high-level recognition tasks.
arXiv Detail & Related papers (2022-06-10T02:41:48Z)
- The Devil is in the Frequency: Geminated Gestalt Autoencoder for Self-Supervised Visual Pre-Training [13.087987450384036]
We present a new Masked Image Modeling (MIM), termed Geminated Autoencoder (Ge$2$-AE) for visual pre-training.
Specifically, we equip our model with geminated decoders in charge of reconstructing image contents from both pixel and frequency space.
arXiv Detail & Related papers (2022-04-18T09:22:55Z)
- Mask or Non-Mask? Robust Face Mask Detector via Triplet-Consistency Representation Learning [23.062034116854875]
In the absence of vaccines or medicines to stop COVID-19, one of the effective methods to slow the spread of the coronavirus is to wear a face mask.
To mandate the use of face masks or coverings in public areas, additional human resources are required, which is tedious and attention-intensive.
We propose a face mask detection framework that uses the context attention module to enable the effective attention of the feed-forward convolution neural network.
arXiv Detail & Related papers (2021-10-01T16:44:06Z)
- OLED: One-Class Learned Encoder-Decoder Network with Adversarial Context Masking for Novelty Detection [1.933681537640272]
Novelty detection is the task of recognizing samples that do not belong to the distribution of the target class.
Deep autoencoders have been widely used as a base of many unsupervised novelty detection methods.
We have designed a framework consisting of two competing networks, a Mask Module and a Reconstructor.
arXiv Detail & Related papers (2021-03-27T17:59:40Z)
- Ternary Feature Masks: zero-forgetting for task-incremental learning [68.34518408920661]
We propose a continual learning approach for the task-aware regime that avoids any forgetting.
By using ternary masks we can upgrade a model to new tasks, reusing knowledge from previous tasks while not forgetting anything about them.
Our method outperforms current state-of-the-art while reducing memory overhead in comparison to weight-based approaches.
arXiv Detail & Related papers (2020-01-23T18:08:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.