Mind the GAP! The Challenges of Scale in Pixel-based Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2505.17749v1
- Date: Fri, 23 May 2025 11:15:43 GMT
- Title: Mind the GAP! The Challenges of Scale in Pixel-based Deep Reinforcement Learning
- Authors: Ghada Sokar, Pablo Samuel Castro
- Abstract summary: We identify the connection between the output of the encoder and the ensuing dense layers as the main underlying factor limiting scaling capabilities. We present global average pooling as a simple yet effective way of targeting the bottleneck, thereby avoiding the complexity of earlier approaches.
- Score: 20.101971938856153
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Scaling deep reinforcement learning in pixel-based environments presents a significant challenge, often resulting in diminished performance. While recent works have proposed algorithmic and architectural approaches to address this, the underlying cause of the performance drop remains unclear. In this paper, we identify the connection between the output of the encoder (a stack of convolutional layers) and the ensuing dense layers as the main underlying factor limiting scaling capabilities; we denote this connection as the bottleneck, and we demonstrate that previous approaches implicitly target this bottleneck. As a result of our analyses, we present global average pooling as a simple yet effective way of targeting the bottleneck, thereby avoiding the complexity of earlier approaches.
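The following sketch (PyTorch; the architecture and sizes are illustrative assumptions, not the authors' code) contrasts the conventional flatten connection with global average pooling at the encoder/dense interface, showing how GAP shrinks the bottleneck:

```python
import torch
import torch.nn as nn

class PixelEncoder(nn.Module):
    def __init__(self, use_gap: bool, hidden_dim: int = 512, num_actions: int = 18):
        super().__init__()
        # A small Atari-style convolutional stack (illustrative sizes).
        self.conv = nn.Sequential(
            nn.Conv2d(4, 32, 8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, 3, stride=1), nn.ReLU(),
        )
        self.use_gap = use_gap
        with torch.no_grad():
            feat = self.conv(torch.zeros(1, 4, 84, 84))  # (1, 64, 7, 7)
        # The "bottleneck": flattening feeds 64*7*7 = 3136 inputs to the first
        # dense layer; GAP collapses the spatial dims, leaving only 64.
        in_dim = feat.shape[1] if use_gap else feat.numel()
        self.head = nn.Sequential(
            nn.Linear(in_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, num_actions),
        )

    def forward(self, x):
        feat = self.conv(x)
        if self.use_gap:
            feat = feat.mean(dim=(2, 3))   # global average pooling
        else:
            feat = feat.flatten(1)         # conventional flatten
        return self.head(feat)

q_flat = PixelEncoder(use_gap=False)
q_gap = PixelEncoder(use_gap=True)
print(sum(p.numel() for p in q_flat.parameters()))  # first dense layer dominates
print(sum(p.numel() for p in q_gap.parameters()))   # far fewer parameters at the bottleneck
```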
Related papers
- Fast Point Cloud Geometry Compression with Context-based Residual Coding and INR-based Refinement [19.575833741231953]
We use the KNN method to determine the neighborhoods of raw surface points.
A conditional probability model is adaptive to local geometry, leading to significant rate reduction.
We incorporate an implicit neural representation into the refinement layer, allowing the decoder to sample points on the underlying surface at arbitrary densities.
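A minimal sketch of the KNN neighborhood step described above (SciPy; the point cloud and k are placeholders, not the authors' pipeline):

```python
import numpy as np
from scipy.spatial import cKDTree

points = np.random.rand(2048, 3).astype(np.float32)  # placeholder point cloud
k = 16
tree = cKDTree(points)
# Query k+1 nearest neighbors per point (the first hit is the point itself).
dists, idx = tree.query(points, k=k + 1)
neighborhoods = points[idx[:, 1:]]  # (2048, 16, 3) local neighborhoods
```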
arXiv Detail & Related papers (2024-08-06T05:24:06Z)
- Simple Ingredients for Offline Reinforcement Learning [86.1988266277766]
Offline reinforcement learning algorithms have proven effective on datasets highly connected to the target downstream task.
We show that existing methods struggle with diverse data: their performance considerably deteriorates as data collected for related but different tasks is simply added to the offline buffer.
We show that scale, more than algorithmic considerations, is the key factor influencing performance.
arXiv Detail & Related papers (2024-03-19T18:57:53Z)
- Joint Learning for Scattered Point Cloud Understanding with Hierarchical Self-Distillation [34.26170741722835]
We propose an end-to-end architecture that compensates for and identifies partial point clouds on the fly.
Hierarchical self-distillation (HSD) can be applied to any hierarchy-based point cloud method.
arXiv Detail & Related papers (2023-12-28T08:51:04Z)
- Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z)
- Deep Augmentation: Dropout as Augmentation for Self-Supervised Learning [19.495587566796278]
Deep Augmentation is a method that applies dropout or PCA transformations to targeted layers in neural networks.
We show that uniformly applying dropout across layers does not consistently improve performance.
We also show that a stop-gradient operation is critical for ensuring dropout functions effectively as an augmentation.
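A hedged sketch of the described idea (PyTorch; the network and loss are illustrative assumptions, not the paper's code): dropout at a targeted layer acts as the augmentation, and the comparison branch is held behind a stop-gradient:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 64))
drop = nn.Dropout(p=0.3)  # the targeted-layer "augmentation"

x = torch.randn(32, 128)
h = encoder[0:2](x)           # features up to the targeted layer
z_aug = encoder[2](drop(h))   # augmented branch: dropout applied
with torch.no_grad():
    z_tgt = encoder[2](h)     # target branch: stop-gradient, no dropout
loss = -F.cosine_similarity(z_aug, z_tgt, dim=-1).mean()
loss.backward()
```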
arXiv Detail & Related papers (2023-03-25T19:03:57Z)
- Stabilizing Off-Policy Deep Reinforcement Learning from Pixels [9.998078491879145]
Off-policy reinforcement learning from pixel observations is notoriously unstable.
We show that these instabilities arise from performing temporal-difference learning with a convolutional encoder and low-magnitude rewards.
We propose A-LIX, a method providing adaptive regularization to the encoder's gradients that explicitly prevents the occurrence of catastrophic self-overfitting.
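A-LIX's exact adaptive operator is not reproduced here; the following sketch (PyTorch; all specifics are assumptions) uses a fixed average-pooling blur of the backward signal purely to illustrate where regularization of the encoder's gradients intervenes:

```python
import torch
import torch.nn.functional as F

class SmoothGrad(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)  # identity in the forward pass

    @staticmethod
    def backward(ctx, grad_out):
        # Spatially blur the gradient reaching the convolutional encoder
        # (a fixed blur, standing in for the paper's adaptive regularizer).
        return F.avg_pool2d(grad_out, 3, stride=1, padding=1)

feat = torch.randn(8, 64, 7, 7, requires_grad=True)  # placeholder conv features
loss = SmoothGrad.apply(feat).sum()
loss.backward()  # feat.grad has passed through the smoothing operator
```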
arXiv Detail & Related papers (2022-07-03T08:52:40Z)
- Unsupervised Monocular Depth Learning with Integrated Intrinsics and Spatio-Temporal Constraints [61.46323213702369]
This work presents an unsupervised learning framework that is able to predict at-scale depth maps and egomotion.
Our results demonstrate strong performance when compared to the current state-of-the-art on multiple sequences of the KITTI driving dataset.
arXiv Detail & Related papers (2020-11-02T22:26:58Z)
- LoCo: Local Contrastive Representation Learning [93.98029899866866]
We show that by overlapping local blocks stacked on top of each other, we effectively increase the decoder depth and allow upper blocks to implicitly send feedback to lower blocks.
This simple design closes the performance gap between local learning and end-to-end contrastive learning algorithms for the first time.
arXiv Detail & Related papers (2020-08-04T05:41:29Z)
- Differentiable Causal Discovery from Interventional Data [141.41931444927184]
We propose a theoretically-grounded method based on neural networks that can leverage interventional data.
We show that our approach compares favorably to the state of the art in a variety of settings.
arXiv Detail & Related papers (2020-07-03T15:19:17Z)
- Untangling tradeoffs between recurrence and self-attention in neural networks [81.30894993852813]
We present a formal analysis of how self-attention affects gradient propagation in recurrent networks.
We prove that it mitigates the problem of vanishing gradients when trying to capture long-term dependencies.
We propose a relevancy screening mechanism that allows for a scalable use of sparse self-attention with recurrence.
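An illustrative sketch (PyTorch; not the paper's exact mechanism, and all sizes are assumptions) of screening past hidden states by a relevance score so each recurrent step attends sparsely over only the top-k of them:

```python
import torch
import torch.nn.functional as F

d, k = 32, 4
gru = torch.nn.GRUCell(d, d)
x = torch.randn(20, d)   # a length-20 input sequence (placeholder)
h = torch.zeros(d)
memory = []              # pool of past hidden states
for t in range(x.shape[0]):
    h = gru(x[t].unsqueeze(0), h.unsqueeze(0)).squeeze(0)
    if memory:
        mem = torch.stack(memory)             # (t, d) past states
        scores = mem @ h                      # relevance of each past state
        top = scores.topk(min(k, len(memory))).indices
        attn = F.softmax(scores[top], dim=0)  # sparse attention over top-k
        h = h + attn @ mem[top]               # blend in the screened context
    memory.append(h.detach())
```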
arXiv Detail & Related papers (2020-06-16T19:24:25Z)
- Action Recognition with Deep Multiple Aggregation Networks [14.696233190562936]
We introduce a novel hierarchical pooling design that captures different levels of temporal granularity in action recognition.
Our design principle is coarse-to-fine and achieved using a tree-structured network.
Besides being principled and well grounded, the proposed hierarchical pooling is also video-length and resolution agnostic.
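A minimal sketch of coarse-to-fine temporal pooling (PyTorch; an illustration of the design principle above, not the paper's tree-structured network). Adaptive pooling makes the descriptor independent of video length:

```python
import torch
import torch.nn.functional as F

def hierarchical_pool(frames: torch.Tensor, levels=(1, 2, 4)) -> torch.Tensor:
    """frames: (T, d) per-frame features -> fixed-size video descriptor."""
    x = frames.t().unsqueeze(0)  # (1, d, T) for 1-D pooling
    pooled = [F.adaptive_avg_pool1d(x, n).flatten() for n in levels]
    # Coarse (1 segment) to fine (4 segments), concatenated: (1+2+4)*d dims.
    return torch.cat(pooled)

video = torch.randn(57, 128)     # any length works
desc = hierarchical_pool(video)  # shape: (7 * 128,) regardless of T
```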
arXiv Detail & Related papers (2020-06-08T11:37:38Z)
- Deep hierarchical pooling design for cross-granularity action recognition [14.696233190562936]
We introduce a novel hierarchical aggregation design that captures different levels of temporal granularity in action recognition.
Learning the combination of operations in this network -- which best fits a given ground-truth -- is obtained by solving a constrained minimization problem.
Besides being principled and well grounded, the proposed hierarchical pooling is also video-length agnostic and resilient to misalignments in actions.
arXiv Detail & Related papers (2020-06-08T11:03:54Z)