Learning mirror maps in policy mirror descent
- URL: http://arxiv.org/abs/2402.05187v2
- Date: Fri, 7 Jun 2024 16:08:42 GMT
- Title: Learning mirror maps in policy mirror descent
- Authors: Carlo Alfano, Sebastian Towers, Silvia Sapora, Chris Lu, Patrick Rebeschini,
- Abstract summary: Policy Mirror Descent (PMD) is a popular framework in reinforcement learning.
Despite its popularity, the exploration of PMD's full potential is limited.
We show that it is possible to learn a mirror map that outperforms the negative entropy in more complex environments.
- Score: 12.792602427704391
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Policy Mirror Descent (PMD) is a popular framework in reinforcement learning, serving as a unifying perspective that encompasses numerous algorithms. These algorithms are derived through the selection of a mirror map and enjoy finite-time convergence guarantees. Despite its popularity, the exploration of PMD's full potential is limited, with the majority of research focusing on a particular mirror map -- namely, the negative entropy -- which gives rise to the renowned Natural Policy Gradient (NPG) method. It remains uncertain from existing theoretical studies whether the choice of mirror map significantly influences PMD's efficacy. In our work, we conduct empirical investigations to show that the conventional mirror map choice (NPG) often yields less-than-optimal outcomes across several standard benchmark environments. Using evolutionary strategies, we identify more efficient mirror maps that enhance the performance of PMD. We first focus on a tabular environment, i.e. Grid-World, where we relate existing theoretical bounds with the performance of PMD for a few standard mirror maps and the learned one. We then show that it is possible to learn a mirror map that outperforms the negative entropy in more complex environments, such as the MinAtar suite. Our results suggest that mirror maps generalize well across various environments, raising questions about how to best match a mirror map to an environment's structure and characteristics.
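For readers unfamiliar with how the choice of mirror map changes the update, the following is a minimal tabular sketch, not the authors' implementation. It assumes the standard PMD update with exact Q-values and a strictly positive current policy, and it contrasts two textbook mirror maps: the negative entropy (which yields the multiplicative, NPG-style update) and the squared Euclidean norm (which yields a projected update). The function names and array shapes are hypothetical, chosen for illustration only; the learned mirror maps and the evolutionary search described in the abstract are not reproduced here.

```python
import numpy as np


def pmd_step_negative_entropy(policy, q_values, step_size):
    """One tabular PMD step with the negative-entropy mirror map.

    policy, q_values: arrays of shape (num_states, num_actions).
    Assumes every entry of `policy` is strictly positive.
    The update is pi_new(a|s) proportional to pi(a|s) * exp(step_size * Q(s, a)).
    """
    logits = np.log(policy) + step_size * q_values
    logits -= logits.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    new_policy = np.exp(logits)
    return new_policy / new_policy.sum(axis=1, keepdims=True)


def project_onto_simplex(v):
    """Euclidean projection of a 1-D vector onto the probability simplex."""
    sorted_v = np.sort(v)[::-1]                      # sort in descending order
    cumulative = np.cumsum(sorted_v) - 1.0
    indices = np.arange(1, v.size + 1)
    rho = np.nonzero(sorted_v - cumulative / indices > 0)[0][-1]
    threshold = cumulative[rho] / (rho + 1)
    return np.maximum(v - threshold, 0.0)


def pmd_step_squared_l2(policy, q_values, step_size):
    """One tabular PMD step with the squared Euclidean norm as mirror map.

    This reduces to a gradient-style step followed by projection onto the simplex,
    applied independently to each state's action distribution.
    """
    return np.apply_along_axis(project_onto_simplex, 1, policy + step_size * q_values)
```

Starting from the same uniform policy and the same Q-values, the negative-entropy step keeps every action probability positive, whereas the projected step can set some probabilities exactly to zero. This is one simple way in which the mirror map shapes the trajectory of the policy.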
Related papers
- MirrorGaussian: Reflecting 3D Gaussians for Reconstructing Mirror Reflections [58.003014868772254]
MirrorGaussian is the first method for mirror scene reconstruction with real-time rendering based on 3D Gaussian Splatting.
We introduce an intuitive dual-rendering strategy that enables differentiable rasterization of both the real-world 3D Gaussians and the mirrored counterpart.
Our approach significantly outperforms existing methods, achieving state-of-the-art results.
arXiv Detail & Related papers (2024-05-20T09:58:03Z) - Efficient Mirror Detection via Multi-level Heterogeneous Learning [39.091162729266294]
HetNet is a highly efficient mirror detection network.
HetNet follows an effective architecture that obtains specific information at different stages to detect mirrors.
Compared to the state-of-the-art method, HetNet runs 664% faster and draws an average performance gain of 8.9% on MAE, 3.1% on IoU, and 2.0% on F-measure.
arXiv Detail & Related papers (2022-11-28T18:51:11Z) - Symmetry-Aware Transformer-based Mirror Detection [85.47570468668955]
We propose a dual-path Symmetry-Aware Transformer-based mirror detection Network (SATNet).
SATNet includes two novel modules: the Symmetry-Aware Attention Module (SAAM) and the Contrast and Fusion Decoder Module (CFDM).
Experimental results show that SATNet outperforms both RGB and RGB-D mirror detection methods on all available mirror detection datasets.
arXiv Detail & Related papers (2022-07-13T16:40:01Z) - Mirror-Yolo: A Novel Attention Focus, Instance Segmentation and Mirror Detection Model [6.048747739825864]
YOLOv4 achieves phenomenal results in terms of object detection accuracy and speed, but it still fails in detecting mirrors.
We propose Mirror-YOLO, which targets mirror detection and contains a novel attention focus mechanism for feature acquisition.
arXiv Detail & Related papers (2022-02-17T08:03:48Z) - Unpaired Image Super-Resolution with Optimal Transport Maps [128.1189695209663]
Real-world image super-resolution (SR) tasks often do not have paired datasets, which limits the application of supervised techniques.
We propose an algorithm for unpaired SR which learns an unbiased OT map for the perceptual transport cost.
Our algorithm provides nearly state-of-the-art performance on the large-scale unpaired AIM-19 dataset.
arXiv Detail & Related papers (2022-02-02T16:21:20Z) - CAMERAS: Enhanced Resolution And Sanity preserving Class Activation
Mapping for image saliency [61.40511574314069]
Backpropagation image saliency aims at explaining model predictions by estimating model-centric importance of individual pixels in the input.
We propose CAMERAS, a technique to compute high-fidelity backpropagation saliency maps without requiring any external priors.
arXiv Detail & Related papers (2021-06-20T08:20:56Z) - Efficient LiDAR Odometry for Autonomous Driving [16.22522474028277]
LiDAR odometry plays an important role in self-localization and mapping for autonomous navigation.
Recent spherical range image-based methods enjoy the merits of fast nearest neighbor search by spherical mapping.
We propose a novel efficient LiDAR odometry approach by taking advantage of both non-ground spherical range image and bird's-eye-view map for ground points.
arXiv Detail & Related papers (2021-04-22T06:05:09Z) - Two-Stage Single Image Reflection Removal with Reflection-Aware Guidance [78.34235841168031]
We present a novel two-stage network with reflection-aware guidance (RAGNet) for single image reflection removal (SIRR).
RAG can be used (i) to mitigate the effect of reflection from the observation, and (ii) to generate a mask in partial convolution for mitigating the effect of deviating from the linear combination hypothesis.
Experiments on five commonly used datasets demonstrate the quantitative and qualitative superiority of our RAGNet in comparison to the state-of-the-art SIRR methods.
arXiv Detail & Related papers (2020-12-02T03:14:57Z) - Adaptive confidence thresholding for monocular depth estimation [83.06265443599521]
We propose a new approach to leverage pseudo ground truth depth maps of stereo images generated from self-supervised stereo matching methods.
The confidence map of the pseudo ground truth depth map is estimated to mitigate performance degeneration by inaccurate pseudo depth maps.
Experimental results demonstrate superior performance to state-of-the-art monocular depth estimation methods.
arXiv Detail & Related papers (2020-09-27T13:26:16Z) - DeepFactors: Real-Time Probabilistic Dense Monocular SLAM [29.033778410908877]
We present a SLAM system that unifies methods in a probabilistic framework while still maintaining real-time performance.
This is achieved through the use of a learned compact depth map representation and reformulating three different types of errors.
We evaluate our system on trajectory estimation and depth reconstruction on real-world sequences and present various examples of estimated dense geometry.
arXiv Detail & Related papers (2020-01-14T21:08:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.