Double Q-Learning for Citizen Relocation During Natural Hazards
- URL: http://arxiv.org/abs/2209.03800v2
- Date: Mon, 12 Sep 2022 16:24:24 GMT
- Title: Double Q-Learning for Citizen Relocation During Natural Hazards
- Authors: Alysson Ribeiro da Silva
- Abstract summary: Reinforcement learning approaches can be used to deploy a solution in which an autonomous robot can save the life of a citizen by relocating the citizen itself, without the need to wait for a rescue team composed of humans.
In this research a solution for citizen relocation based on Partially Observable Markov Decision Processes is adopted.
The performance of the solution was measured as the success rate of the citizen relocation procedure, where the results show a performance above 100% for easy scenarios and near 50% for hard ones.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Natural disasters can cause substantial negative socio-economic impacts
around the world, due to mortality, relocation rates, and reconstruction
decisions. Robotics has been successfully applied to identify and rescue
victims during the occurrence of a natural hazard. However, little effort has
been made to deploy solutions in which an autonomous robot can save the life of
a citizen by relocating the citizen itself, without the need to wait for a
rescue team composed of humans. Reinforcement learning approaches can be used
to deploy such a solution; however, one of the best-known algorithms for doing
so, Q-learning, suffers from biased (overestimated) value estimates produced by
its learning routine. In this research, a solution for citizen relocation based
on Partially
Observable Markov Decision Processes is adopted, and the capability of Double
Q-learning to relocate citizens during a natural hazard is evaluated in a
proposed grid-world-based hazard simulation engine. The performance of the
solution was measured as the success rate of the citizen relocation procedure;
the results show a performance above 100% for easy scenarios and near 50% for
hard ones.
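For illustration, below is a minimal tabular Double Q-learning sketch on a toy grid world. This is a sketch under assumed settings: the 5x5 grid, reward scheme, and hyperparameters are illustrative and are not the paper's hazard simulation engine; it only demonstrates the two-estimator update that counters Q-learning's maximisation bias.

```python
import numpy as np

# Minimal tabular Double Q-learning sketch on a toy grid world.
# The environment, reward scheme, and hyperparameters are illustrative
# assumptions, not the paper's hazard simulation engine.

N_STATES = 5 * 5        # 5x5 grid, flattened
N_ACTIONS = 4           # up, down, left, right
ALPHA, GAMMA, EPSILON = 0.1, 0.99, 0.1

rng = np.random.default_rng(0)
Q_a = np.zeros((N_STATES, N_ACTIONS))
Q_b = np.zeros((N_STATES, N_ACTIONS))

def step(state, action):
    """Toy transition: move on the grid; reaching the bottom-right cell
    counts as a successful 'relocation' (terminal, reward +1)."""
    row, col = divmod(state, 5)
    drow, dcol = [(-1, 0), (1, 0), (0, -1), (0, 1)][action]
    row = min(max(row + drow, 0), 4)
    col = min(max(col + dcol, 0), 4)
    next_state = row * 5 + col
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def act(state):
    """Epsilon-greedy policy over the sum of both estimators."""
    if rng.random() < EPSILON:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(Q_a[state] + Q_b[state]))

for episode in range(500):
    state, done = 0, False
    while not done:
        action = act(state)
        next_state, reward, done = step(state, action)
        # Randomly pick which estimator to update; the other one evaluates
        # the greedy action, which removes plain Q-learning's maximisation bias.
        if rng.random() < 0.5:
            best = int(np.argmax(Q_a[next_state]))
            target = reward + (0.0 if done else GAMMA * Q_b[next_state, best])
            Q_a[state, action] += ALPHA * (target - Q_a[state, action])
        else:
            best = int(np.argmax(Q_b[next_state]))
            target = reward + (0.0 if done else GAMMA * Q_a[next_state, best])
            Q_b[state, action] += ALPHA * (target - Q_b[state, action])
        state = next_state
```

Using one estimator to select the greedy action and the other to evaluate it decorrelates selection from evaluation, which is the core idea behind choosing Double Q-learning over standard Q-learning.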
Related papers
- MEReQ: Max-Ent Residual-Q Inverse RL for Sample-Efficient Alignment from Intervention [81.56607128684723]
We introduce MEReQ (Maximum-Entropy Residual-Q Inverse Reinforcement Learning), designed for sample-efficient alignment from human intervention.
MEReQ infers a residual reward function that captures the discrepancy between the human expert's and the prior policy's underlying reward functions.
It then employs Residual Q-Learning (RQL) to align the policy with human preferences using this residual reward function.
arXiv Detail & Related papers (2024-06-24T01:51:09Z) - RACER: Epistemic Risk-Sensitive RL Enables Fast Driving with Fewer Crashes [57.319845580050924]
We propose a reinforcement learning framework that combines risk-sensitive control with an adaptive action space curriculum.
We show that our algorithm is capable of learning high-speed policies for a real-world off-road driving task.
arXiv Detail & Related papers (2024-05-07T23:32:36Z) - Belief Aided Navigation using Bayesian Reinforcement Learning for Avoiding Humans in Blind Spots [0.0]
This study introduces a novel algorithm, BNBRL+, predicated on the partially observable Markov decision process framework to assess risks in unobservable areas.
It integrates the dynamics between the robot, humans, and inferred beliefs to determine the navigation paths and embeds social norms within the reward function.
The model's ability to navigate effectively in spaces with limited visibility and avoid obstacles dynamically can significantly improve the safety and reliability of autonomous vehicles.
arXiv Detail & Related papers (2024-03-15T08:50:39Z) - A GP-based Robust Motion Planning Framework for Agile Autonomous Robot Navigation and Recovery in Unknown Environments [6.859965454961918]
We propose a model for proactively detecting the risk of future motion planning failure.
When the risk exceeds a certain threshold, a recovery behavior is triggered.
Our framework is capable of both predicting planner failures and recovering the robot to states where planner success is likely.
arXiv Detail & Related papers (2024-02-02T18:27:21Z) - DiAReL: Reinforcement Learning with Disturbance Awareness for Robust Sim2Real Policy Transfer in Robot Control [0.0]
Delayed Markov decision processes fulfill the Markov property by augmenting the state space of agents with a finite time window of recently committed actions.
We introduce a disturbance-augmented Markov decision process in delayed settings as a novel representation to incorporate disturbance estimation in training on-policy reinforcement learning algorithms.
arXiv Detail & Related papers (2023-06-15T10:11:38Z) - You Only Live Once: Single-Life Reinforcement Learning [124.1738675154651]
In many real-world situations, the goal might not be to learn a policy that can do the task repeatedly, but simply to perform a new task successfully once in a single trial.
We formalize this problem setting, where an agent must complete a task within a single episode without interventions.
We propose an algorithm, $Q$-weighted adversarial learning (QWALE), which employs a distribution matching strategy.
arXiv Detail & Related papers (2022-10-17T09:00:11Z) - Don't Start From Scratch: Leveraging Prior Data to Automate Robotic Reinforcement Learning [70.70104870417784]
Reinforcement learning (RL) algorithms hold the promise of enabling autonomous skill acquisition for robotic systems.
In practice, real-world robotic RL typically requires time-consuming data collection and frequent human intervention to reset the environment.
In this work, we study how these challenges can be tackled by effective utilization of diverse offline datasets collected from previously seen tasks.
arXiv Detail & Related papers (2022-07-11T08:31:22Z) - Persistent Reinforcement Learning via Subgoal Curricula [114.83989499740193]
Value-accelerated Persistent Reinforcement Learning (VaPRL) generates a curriculum of initial states.
VaPRL reduces the interventions required by three orders of magnitude compared to episodic reinforcement learning.
arXiv Detail & Related papers (2021-07-27T16:39:45Z) - Zero-Shot Reinforcement Learning on Graphs for Autonomous Exploration Under Uncertainty [6.42522897323111]
We present a framework for self-learning a high-performance exploration policy in a single simulation environment.
We propose a novel approach that uses graph neural networks in conjunction with deep reinforcement learning.
arXiv Detail & Related papers (2021-05-11T02:42:17Z) - Text Analytics for Resilience-Enabled Extreme Events Reconnaissance [7.54569938687922]
The study focuses on (1) automated data (news and social media) collection hosted by the Pacific Earthquake Engineering Research (PEER) Center server, (2) automatic generation of reconnaissance reports, and (3) use of social media to extract post-hazard information such as the recovery time.
arXiv Detail & Related papers (2020-11-26T01:43:29Z) - Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings [129.80279257258098]
Reinforcement learning (RL) in real-world safety-critical target settings like urban driving is hazardous.
We propose a "safety-critical adaptation" task setting: an agent first trains in non-safety-critical "source" environments.
We propose a solution approach, CARL, that builds on the intuition that prior experience in diverse environments equips an agent to estimate risk.
arXiv Detail & Related papers (2020-08-15T01:40:59Z)