Fast and Data Efficient Reinforcement Learning from Pixels via
Non-Parametric Value Approximation
- URL: http://arxiv.org/abs/2203.03078v1
- Date: Mon, 7 Mar 2022 00:31:31 GMT
- Title: Fast and Data Efficient Reinforcement Learning from Pixels via
Non-Parametric Value Approximation
- Authors: Alexander Long, Alan Blair, Herke van Hoof
- Abstract summary: We present Nonparametric Approximation of Inter-Trace returns (NAIT), a Reinforcement Learning algorithm for discrete action, pixel-based environments.
We empirically evaluate NAIT on both the 26 and 57 game variants of ATARI100k where, despite its simplicity, it achieves competitive performance in the online setting with greater than 100x speedup in wall-time.
- Score: 90.78178803486746
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present Nonparametric Approximation of Inter-Trace returns (NAIT), a
Reinforcement Learning algorithm for discrete action, pixel-based environments
that is both highly sample and computation efficient. NAIT is a lazy-learning
approach with an update that is equivalent to episodic Monte-Carlo on episode
completion, but that allows the stable incorporation of rewards while an
episode is ongoing. We make use of a fixed domain-agnostic representation,
simple distance based exploration and a proximity graph-based lookup to
facilitate extremely fast execution. We empirically evaluate NAIT on both the
26 and 57 game variants of ATARI100k where, despite its simplicity, it achieves
competitive performance in the online setting with greater than 100x speedup in
wall-time.
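The ingredients named in the abstract (a lazy learner, Monte-Carlo returns written at episode completion, distance-based exploration, nearest-neighbour lookup) can be illustrated with a small sketch. This is not the authors' implementation: the class `KNNValueTable`, its methods, the `novelty_threshold` rule and the choice of `k` are hypothetical names and assumptions introduced here. State embeddings are assumed to be supplied by some fixed encoder, a brute-force search stands in for the proximity-graph lookup, and the in-episode reward incorporation described in the abstract is not modelled.

```python
import math


def discounted_returns(rewards, gamma=0.99):
    """Monte-Carlo return G_t = r_t + gamma * G_{t+1} for each step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in range(len(rewards) - 1, -1, -1):  # sweep backwards from the final step
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


class KNNValueTable:
    """Hypothetical lazy learner: stores (embedding, return) pairs per action.

    No parameters are trained; all work happens at lookup time.
    """

    def __init__(self, num_actions, k=3):
        self.k = k
        self.memory = [[] for _ in range(num_actions)]

    def add_episode(self, embeddings, actions, rewards, gamma=0.99):
        """Write a finished episode into memory using its Monte-Carlo returns."""
        returns = discounted_returns(rewards, gamma)
        for z, a, g in zip(embeddings, actions, returns):
            self.memory[a].append((tuple(z), g))

    def value(self, z, action):
        """Mean return of the k nearest stored embeddings (brute-force search,
        standing in for a proximity-graph lookup)."""
        entries = self.memory[action]
        if not entries:
            return 0.0
        nearest = sorted(entries, key=lambda e: math.dist(e[0], z))[: self.k]
        return sum(g for _, g in nearest) / len(nearest)

    def novelty(self, z, action):
        """Distance to the closest stored embedding; infinite for an untried action."""
        entries = self.memory[action]
        if not entries:
            return math.inf
        return min(math.dist(e[0], z) for e in entries)

    def act(self, z, novelty_threshold=1.0):
        """Assumed exploration rule: take the most novel action if it is far enough
        from everything in memory, otherwise act greedily on the k-NN values."""
        actions = range(len(self.memory))
        novelties = [self.novelty(z, a) for a in actions]
        most_novel = max(actions, key=novelties.__getitem__)
        if novelties[most_novel] > novelty_threshold:
            return most_novel
        values = [self.value(z, a) for a in actions]
        return max(actions, key=values.__getitem__)
```

A typical loop would call `act` on each embedded frame, collect the episode's embeddings, actions and rewards, and pass them to `add_episode` once the episode ends. Because nothing is trained, the per-step cost is dominated by the neighbour search, which is why an approximate index is attractive at larger memory sizes.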
Related papers
- Temporal Correlation Meets Embedding: Towards a 2nd Generation of JDE-based Real-Time Multi-Object Tracking [52.04679257903805]
Joint Detection and Embedding (JDE) trackers have demonstrated excellent performance in Multi-Object Tracking (MOT) tasks.
Our tracker, named TCBTrack, achieves state-of-the-art performance on multiple public benchmarks.
arXiv Detail & Related papers (2024-07-19T07:48:45Z)
- Exploring Dynamic Transformer for Efficient Object Tracking [58.120191254379854]
We propose DyTrack, a dynamic transformer framework for efficient tracking.
DyTrack automatically learns to configure proper reasoning routes for various inputs, gaining better utilization of the available computational budget.
Experiments on multiple benchmarks demonstrate that DyTrack achieves promising speed-precision trade-offs with only a single model.
arXiv Detail & Related papers (2024-03-26T12:31:58Z)
- Diffusion for Natural Image Matting [93.86689168212241]
We present DiffMatte, a solution designed to overcome the challenges of image matting.
First, DiffMatte decouples the decoder from the intricately coupled matting network design, involving only one lightweight decoder in the iterations of the diffusion process.
Second, we employ a self-aligned training strategy with uniform time intervals, ensuring a consistent noise sampling between training and inference across the entire time domain.
arXiv Detail & Related papers (2023-12-10T15:28:56Z)
- Unsupervised Visual Representation Learning via Mutual Information Regularized Assignment [31.00769817116771]
We propose a pseudo-labeling algorithm for unsupervised representation learning inspired by information.
MIRA achieves state-of-the-art performance on various downstream tasks, including the linear/k-NN evaluation and transfer learning.
arXiv Detail & Related papers (2022-11-04T06:49:42Z)
- OFedQIT: Communication-Efficient Online Federated Learning via Quantization and Intermittent Transmission [7.6058140480517356]
Online federated learning (OFL) is a promising framework to collaboratively learn a sequence of non-linear functions (or models) from distributed streaming data.
We propose a communication-efficient OFL algorithm (named OFedQIT) by means of quantization and intermittent transmission.
Our analysis reveals that OFedQIT successfully addresses the drawbacks of OFedAvg while maintaining superior learning accuracy.
arXiv Detail & Related papers (2022-05-13T07:46:43Z)
- Accelerating Training and Inference of Graph Neural Networks with Fast Sampling and Pipelining [58.10436813430554]
Mini-batch training of graph neural networks (GNNs) requires a lot of computation and data movement.
We argue in favor of performing mini-batch training with neighborhood sampling in a distributed multi-GPU environment.
We present a sequence of improvements to mitigate these bottlenecks, including a performance-engineered neighborhood sampler.
We also conduct an empirical analysis that supports the use of sampling for inference, showing that test accuracies are not materially compromised.
arXiv Detail & Related papers (2021-10-16T02:41:35Z)
- Pyramid Correlation based Deep Hough Voting for Visual Object Tracking [16.080776515556686]
We introduce a voting-based, classification-only tracking algorithm named Pyramid Correlation based Deep Hough Voting (PCDHV for short).
Specifically, we construct a Pyramid Correlation module to equip the embedded feature with fine-grained local structures and global spatial contexts.
The Deep Hough Voting module then takes over, integrating long-range dependencies of pixels to perceive corners.
arXiv Detail & Related papers (2021-10-15T10:37:00Z)
- Topology-Guided Sampling for Fast and Accurate Community Detection [1.0609815608017064]
We present an approach based on topology-guided sampling for accelerating block partitioning.
We also introduce a degree-based thresholding scheme that improves the efficacy of our approach at the expense of speedup.
Our results show that our approach can lead to a speedup of up to 15X over block partitioning without sampling.
arXiv Detail & Related papers (2021-08-15T03:20:10Z)
- Sequential Place Learning: Heuristic-Free High-Performance Long-Term Place Recognition [24.70946979449572]
We develop a learning-based CNN+LSTM architecture, trainable via backpropagation through time, for viewpoint- and appearance-invariant place recognition.
Our model outperforms 15 classical methods while setting new state-of-the-art performance standards.
In addition, we show that SPL can be up to 70x faster to deploy than classical methods on a 729 km route.
arXiv Detail & Related papers (2021-03-02T22:57:43Z)
- Real-Time High-Performance Semantic Image Segmentation of Urban Street Scenes [98.65457534223539]
We propose a real-time high-performance DCNN-based method for robust semantic segmentation of urban street scenes.
The proposed method achieves 73.6% and 68.0% mean Intersection over Union (mIoU) at inference speeds of 51.0 fps and 39.3 fps, respectively.
arXiv Detail & Related papers (2020-03-11T08:45:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.