RLSAC: Reinforcement Learning enhanced Sample Consensus for End-to-End
Robust Estimation
- URL: http://arxiv.org/abs/2308.05318v1
- Date: Thu, 10 Aug 2023 03:14:19 GMT
- Title: RLSAC: Reinforcement Learning enhanced Sample Consensus for End-to-End
Robust Estimation
- Authors: Chang Nie, Guangming Wang, Zhe Liu, Luca Cavalli, Marc Pollefeys,
Hesheng Wang
- Abstract summary: We propose RLSAC, a novel Reinforcement Learning enhanced SAmple Consensus framework for end-to-end robust estimation.
RLSAC employs a graph neural network to utilize both data and memory features to guide exploring directions for sampling the next minimum set.
Our experimental results demonstrate that RLSAC can learn from features to gradually explore a better hypothesis.
- Score: 74.47709320443998
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Robust estimation is a crucial and still challenging task, which involves
estimating model parameters in noisy environments. Although conventional
sampling consensus-based algorithms sample several times to achieve robustness,
these algorithms cannot use data features and historical information
effectively. In this paper, we propose RLSAC, a novel Reinforcement Learning
enhanced SAmple Consensus framework for end-to-end robust estimation. RLSAC
employs a graph neural network to utilize both data and memory features to
guide exploring directions for sampling the next minimum set. The feedback of
downstream tasks serves as the reward for unsupervised training. Therefore,
RLSAC can avoid differentiating to learn the features and the feedback of
downstream tasks for end-to-end robust estimation. In addition, RLSAC
integrates a state transition module that encodes both data and memory
features. Our experimental results demonstrate that RLSAC can learn from
features to gradually explore a better hypothesis. Through analysis, it is
apparent that RLSAC can be easily transferred to other sampling consensus-based
robust estimation tasks. To the best of our knowledge, RLSAC is also the first
method that uses reinforcement learning to sample consensus for end-to-end
robust estimation. We release our codes at https://github.com/IRMVLab/RLSAC.
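The abstract describes an RL-guided sampling loop: a learned policy scores data points from current data features plus a memory of past hypotheses, samples the next minimal set, fits a hypothesis, and receives downstream feedback (here, the inlier ratio) as its reward. The following minimal sketch illustrates that control flow for 2D line fitting. It is not the authors' released implementation (see the repository above); the score_points heuristic stands in for the paper's graph neural network policy, and all function names are illustrative assumptions.

```python
# Minimal sketch of an RL-guided sample-consensus loop in the spirit of RLSAC,
# applied to 2D line fitting with NumPy only. The learned policy is replaced by
# a hand-written scoring heuristic over "memory features" (past inlier counts).
import numpy as np

rng = np.random.default_rng(0)

def fit_line(pts):
    """Fit a line ax + by + c = 0 (with a^2 + b^2 = 1) to a 2-point minimal set."""
    (x1, y1), (x2, y2) = pts
    a, b = y2 - y1, x1 - x2
    norm = np.hypot(a, b) + 1e-12
    a, b = a / norm, b / norm
    return a, b, -(a * x1 + b * y1)

def residuals(model, pts):
    a, b, c = model
    return np.abs(pts @ np.array([a, b]) + c)

def score_points(pts, memory):
    """Stand-in for the learned graph-network policy: favour points that were
    often inliers of previous hypotheses (a crude use of memory features)."""
    logits = memory["inlier_counts"] + 1e-3 * rng.standard_normal(len(pts))
    p = np.exp(logits - logits.max())
    return p / p.sum()

def rl_sample_consensus(pts, steps=50, thr=0.05):
    memory = {"inlier_counts": np.zeros(len(pts))}
    best_model, best_reward = None, -np.inf
    for _ in range(steps):
        probs = score_points(pts, memory)                 # policy over points
        idx = rng.choice(len(pts), size=2, replace=False, p=probs)
        model = fit_line(pts[idx])                        # hypothesis from minimal set
        inliers = residuals(model, pts) < thr
        reward = inliers.mean()                           # downstream feedback as reward
        memory["inlier_counts"] += inliers                # state transition: update memory
        if reward > best_reward:
            best_model, best_reward = model, reward
    return best_model, best_reward

# Toy data: noisy line y = 0.5 x + 0.1 plus uniform outliers.
x = rng.uniform(-1, 1, 200)
y = 0.5 * x + 0.1 + 0.01 * rng.standard_normal(200)
outliers = rng.uniform(-1, 1, (80, 2))
data = np.vstack([np.column_stack([x, y]), outliers])

model, inlier_ratio = rl_sample_consensus(data)
print("best hypothesis:", model, "inlier ratio:", round(inlier_ratio, 3))
```

In this toy version the hand-written policy simply reinforces points that keep landing in the consensus set; in RLSAC that role is played by a trained graph neural network, which is why the method needs no differentiable solver for end-to-end training.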
Related papers
- Efficient Preference-based Reinforcement Learning via Aligned Experience Estimation [37.36913210031282]
Preference-based reinforcement learning (PbRL) has shown impressive capabilities in training agents without reward engineering.
We propose SEER, an efficient PbRL method that integrates label smoothing and policy regularization techniques.
arXiv Detail & Related papers (2024-05-29T01:49:20Z)
- Take the Bull by the Horns: Hard Sample-Reweighted Continual Training Improves LLM Generalization [165.98557106089777]
A key challenge is to enhance the capabilities of large language models (LLMs) amid a looming shortage of high-quality training data.
Our study starts from an empirical strategy for the light continual training of LLMs using their original pre-training data sets.
We then formalize this strategy into a principled framework of Instance-Reweighted Distributionally Robust Optimization.
arXiv Detail & Related papers (2024-02-22T04:10:57Z)
- Reinforcement Replaces Supervision: Query focused Summarization using Deep Reinforcement Learning [43.123290672073814]
We deal with systems that generate summaries from document(s) based on a query.
Motivated by the insight that Reinforcement Learning (RL) provides a generalization to Supervised Learning (SL) for Natural Language Generation, we use an RL-based approach for this task.
We develop multiple Policy Gradient networks, trained on various reward signals: ROUGE, BLEU, and Semantic Similarity.
arXiv Detail & Related papers (2023-11-29T10:38:16Z)
- Soft Random Sampling: A Theoretical and Empirical Analysis [59.719035355483875]
Soft random sampling (SRS) is a simple yet effective approach for efficient training of deep neural networks when dealing with massive data.
It selects a subset uniformly at random with replacement from each data set in each epoch (a minimal sketch follows this list).
It is shown to be a powerful and competitive strategy with significant performance gains at real-world industrial scale.
arXiv Detail & Related papers (2023-11-21T17:03:21Z)
- Provable Reward-Agnostic Preference-Based Reinforcement Learning [61.39541986848391]
Preference-based Reinforcement Learning (PbRL) is a paradigm in which an RL agent learns to optimize a task using pair-wise preference-based feedback over trajectories.
We propose a theoretical reward-agnostic PbRL framework where exploratory trajectories that enable accurate learning of hidden reward functions are acquired.
arXiv Detail & Related papers (2023-05-29T15:00:09Z)
- Generalized Differentiable RANSAC [95.95627475224231]
$\nabla$-RANSAC is a differentiable RANSAC that allows learning the entire randomized robust estimation pipeline.
$\nabla$-RANSAC is superior to the state-of-the-art in terms of accuracy while running at a similar speed to its less accurate alternatives.
arXiv Detail & Related papers (2022-12-26T15:13:13Z)
- Deep Reinforcement Learning-based UAV Navigation and Control: A Soft Actor-Critic with Hindsight Experience Replay Approach [0.9137554315375919]
We propose SACHER (soft actor-critic (SAC) with hindsight experience replay (HER)) as a class of deep reinforcement learning (DRL) algorithms.
We show that SACHER achieves the desired optimal outcomes faster and more accurately than SAC, since HER improves the sample efficiency of SAC.
We apply SACHER to the navigation and control problem of unmanned aerial vehicles (UAVs), where SACHER generates the optimal navigation path.
arXiv Detail & Related papers (2021-06-02T08:30:14Z)
- Predictive Information Accelerates Learning in RL [50.52439807008805]
We train Soft Actor-Critic (SAC) agents from pixels with an auxiliary task that learns a compressed representation of the predictive information of the RL environment dynamics.
We show that PI-SAC agents can substantially improve sample efficiency over challenging baselines on tasks from the DM Control suite of continuous control environments.
arXiv Detail & Related papers (2020-07-24T08:14:41Z)
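As promised in the Soft Random Sampling entry above, here is a minimal sketch of the sampling rule that entry describes: each epoch trains on a subset drawn uniformly at random with replacement from the full data set. The soft_random_sample helper, the sampling_ratio parameter, and the toy loop are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of Soft Random Sampling: per epoch, draw indices uniformly at
# random WITH replacement, so some examples repeat and others are skipped.
import numpy as np

rng = np.random.default_rng(0)

def soft_random_sample(n_examples: int, sampling_ratio: float) -> np.ndarray:
    """Return indices of a with-replacement sample covering ratio * n examples."""
    n_draw = max(1, int(sampling_ratio * n_examples))
    return rng.integers(0, n_examples, size=n_draw)

# Toy usage: iterate epochs over a dummy data set of 10 examples.
data = np.arange(10)
for epoch in range(3):
    idx = soft_random_sample(len(data), sampling_ratio=0.5)
    batch = data[idx]  # the "soft" subset seen in this epoch
    print(f"epoch {epoch}: sampled indices {idx.tolist()}")
```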