For SALE: State-Action Representation Learning for Deep Reinforcement
Learning
- URL: http://arxiv.org/abs/2306.02451v2
- Date: Sun, 5 Nov 2023 16:31:52 GMT
- Authors: Scott Fujimoto, Wei-Di Chang, Edward J. Smith, Shixiang Shane Gu,
Doina Precup, David Meger
- Abstract summary: SALE is a novel approach for learning embeddings that model the nuanced interaction between state and action.
We integrate SALE and an adaptation of checkpoints for RL into TD3 to form the TD7 algorithm.
On OpenAI gym benchmark tasks, TD7 has an average performance gain of 276.7% and 50.7% over TD3 at 300k and 5M time steps, respectively.
- Score: 60.42044715596703
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the field of reinforcement learning (RL), representation learning is a
proven tool for complex image-based tasks, but is often overlooked for
environments with low-level states, such as physical control problems. This
paper introduces SALE, a novel approach for learning embeddings that model the
nuanced interaction between state and action, enabling effective representation
learning from low-level states. We extensively study the design space of these
embeddings and highlight important design considerations. We integrate SALE and
an adaptation of checkpoints for RL into TD3 to form the TD7 algorithm, which
significantly outperforms existing continuous control algorithms. On OpenAI gym
benchmark tasks, TD7 has an average performance gain of 276.7% and 50.7% over
TD3 at 300k and 5M time steps, respectively, and works in both the online and
offline settings.
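The abstract describes SALE only at a high level: a pair of encoders producing a state embedding and a joint state-action embedding, trained so that the interaction between state and action is captured. The sketch below is one plausible instantiation of that idea under a dynamics-prediction objective (the encoder names `f`, `g` and the linear layers are illustrative placeholders, not the paper's actual architecture):

```python
import numpy as np

rng = np.random.default_rng(0)
state_dim, action_dim, embed_dim = 4, 2, 8

# Placeholder linear encoders standing in for the learned networks:
# f maps a state to an embedding, g maps (state embedding, action) to a
# joint state-action embedding.
W_f = rng.normal(scale=0.1, size=(state_dim, embed_dim))
W_g = rng.normal(scale=0.1, size=(embed_dim + action_dim, embed_dim))

def f(s):
    """State embedding z_s = f(s)."""
    return np.tanh(s @ W_f)

def g(z_s, a):
    """Joint state-action embedding z_sa = g(z_s, a)."""
    return np.tanh(np.concatenate([z_s, a], axis=-1) @ W_g)

def sale_loss(s, a, s_next):
    """Train g(f(s), a) to predict the next-state embedding f(s').

    In a full implementation the target f(s') would be treated as a
    stop-gradient (frozen) target; here everything is plain numpy.
    """
    z_sa = g(f(s), a)
    target = f(s_next)
    return float(np.mean((z_sa - target) ** 2))

# A fake batch of transitions (s, a, s') for illustration.
s = rng.normal(size=(32, state_dim))
a = rng.normal(size=(32, action_dim))
s_next = s + 0.1 * rng.normal(size=(32, state_dim))
loss = sale_loss(s, a, s_next)
```

The joint embedding `z_sa` is what would be fed, alongside the raw state and action, into the value and policy networks of a TD3-style agent.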
Related papers
- The Surprising Effectiveness of Test-Time Training for Abstract Reasoning [64.36534512742736]
We investigate the effectiveness of test-time training (TTT) as a mechanism for improving models' reasoning capabilities.
TTT significantly improves performance on ARC tasks, achieving up to 6x improvement in accuracy compared to base fine-tuned models.
Our findings suggest that explicit symbolic search is not the only path to improved abstract reasoning in neural language models.
arXiv Detail & Related papers (2024-11-11T18:59:45Z)
- MENTOR: Mixture-of-Experts Network with Task-Oriented Perturbation for Visual Reinforcement Learning [17.437573206368494]
Visual deep reinforcement learning (RL) enables robots to acquire skills from visual input for unstructured tasks.
Current algorithms suffer from low sample efficiency, limiting their practical applicability.
We present MENTOR, a method that improves both the architecture and optimization of RL agents.
arXiv Detail & Related papers (2024-10-19T04:31:54Z)
- DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning [61.10299147201369]
This paper introduces a novel autonomous RL approach, called DigiRL, for training in-the-wild device control agents.
We build a scalable and parallelizable Android learning environment equipped with a VLM-based evaluator.
We demonstrate the effectiveness of DigiRL using the Android-in-the-Wild dataset, where our 1.3B VLM trained with RL achieves a 49.5% absolute improvement.
arXiv Detail & Related papers (2024-06-14T17:49:55Z)
- ActiveAnno3D -- An Active Learning Framework for Multi-Modal 3D Object Detection [15.22941019659832]
We propose ActiveAnno3D, an active learning framework to select data samples for labeling.
We perform experiments and ablation studies with BEVFusion and PV-RCNN on the nuScenes and TUM Traffic Intersection datasets.
We integrate our active learning framework into the proAnno labeling tool to enable AI-assisted data selection and labeling.
arXiv Detail & Related papers (2024-02-05T17:52:58Z)
- TACO: Temporal Latent Action-Driven Contrastive Loss for Visual Reinforcement Learning [73.53576440536682]
We introduce TACO: Temporal Action-driven Contrastive Learning, a powerful temporal contrastive learning approach.
TACO simultaneously learns a state and an action representation by optimizing the mutual information between representations of current states paired with action sequences and representations of the corresponding future states.
For online RL, TACO achieves 40% performance boost after one million environment interaction steps.
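Mutual-information objectives of this kind are typically optimized via a contrastive InfoNCE bound, where matching (current, future) pairs in a batch are positives and all other pairings are negatives. A generic sketch (the batch construction here is illustrative, not TACO's exact pairing scheme):

```python
import numpy as np

rng = np.random.default_rng(1)

def infonce(anchors, positives, temperature=0.1):
    """Generic InfoNCE loss: row i of `positives` is the positive for row i
    of `anchors`; every other row in the batch acts as a negative."""
    # Cosine-similarity logits between all anchor/positive pairs.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature
    # Cross-entropy with the diagonal (matching pair) as the correct class.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

# Fake embeddings: representation of the current (state, action sequence)
# and a correlated representation of the corresponding future state.
cur = rng.normal(size=(16, 32))
fut = cur + 0.05 * rng.normal(size=(16, 32))
loss = infonce(cur, fut)
```

Minimizing this loss pushes each current-state/action representation toward its own future-state representation and away from the other futures in the batch.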
arXiv Detail & Related papers (2023-06-22T22:21:53Z)
- Contextualized Spatio-Temporal Contrastive Learning with Self-Supervision [106.77639982059014]
We present the ConST-CL framework to effectively learn spatio-temporally fine-grained representations.
We first design a region-based self-supervised task which requires the model to learn to transform instance representations from one view to another guided by context features.
We then introduce a simple design that effectively reconciles the simultaneous learning of both holistic and local representations.
arXiv Detail & Related papers (2021-12-09T19:13:41Z)
- Activation to Saliency: Forming High-Quality Labels for Unsupervised Salient Object Detection [54.92703325989853]
We propose a two-stage Activation-to-Saliency (A2S) framework that effectively generates high-quality saliency cues.
No human annotations are involved in our framework during the whole training process.
Our framework achieves significant performance compared with existing USOD methods.
arXiv Detail & Related papers (2021-12-07T11:54:06Z)
- SEED: Self-supervised Distillation For Visual Representation [34.63488756535054]
We propose a new learning paradigm, named SElf-SupErvised Distillation (SEED), to transfer the representational knowledge of a larger network (as Teacher) into a smaller architecture (as Student) in a self-supervised fashion.
We show that SEED dramatically boosts the performance of small networks on downstream tasks.
arXiv Detail & Related papers (2021-01-12T20:04:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.