Real Evaluations Tractability using Continuous Goal-Directed Actions in
Smart City Applications
- URL: http://arxiv.org/abs/2402.00678v1
- Date: Thu, 1 Feb 2024 15:38:21 GMT
- Title: Real Evaluations Tractability using Continuous Goal-Directed Actions in
Smart City Applications
- Authors: Raul Fernandez-Fernandez, Juan G. Victores, David Estevez, and Carlos
Balaguer
- Abstract summary: Continuous Goal-Directed Actions (CGDA) encodes actions as changes of any feature that can be extracted from the environment.
Current strategies involve performing evaluations in simulation and transferring the final joint trajectory to the actual robot.
Two different approaches to reducing the number of EA evaluations are proposed and compared.
- Score: 3.1158660854608824
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: One of the most important challenges of Smart City Applications is to adapt
the system to interact with non-expert users. Robot imitation frameworks aim to
simplify robot programming and reduce its time cost by allowing users to
program directly through demonstrations. In classical frameworks, actions are modeled
using joint or Cartesian space trajectories. Other features, such as visual
ones, are not always well represented with these pure geometrical approaches.
Continuous Goal-Directed Actions (CGDA) is an alternative to these methods, as
it encodes actions as changes of any feature that can be extracted from the
environment. As a consequence of this, the robot joint trajectories for
execution must be fully computed to comply with this feature-agnostic encoding.
This is achieved using Evolutionary Algorithms (EA), which usually require too
many evaluations to perform this evolution step on the actual robot. Current
strategies involve performing evaluations in simulation and then transferring the
final joint trajectory to the actual robot. Smart City applications involve
working in highly dynamic and complex environments, where having a precise
model is not always achievable. Our goal is to study the tractability of
performing these evaluations directly in a real-world scenario. Two different
approaches to reducing the number of EA evaluations are proposed and
compared. In the first approach, Particle Swarm Optimization (PSO)-based
methods have been studied and compared within CGDA: naive PSO, Fitness
Inheritance PSO (FI-PSO), and Adaptive Fuzzy Fitness Granulation with PSO
(AFFG-PSO). The second approach introduces geometrical and velocity
constraints within CGDA. The effects of both approaches were analyzed
and compared on the wax and paint actions, two commonly studied CGDA use cases.
Results show a substantial reduction in the number of
evaluations.
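The evaluation-saving idea behind Fitness Inheritance PSO can be illustrated with a minimal sketch. This is not the paper's implementation: the toy sphere objective (standing in for a costly real-robot evaluation), the swarm parameters, and the averaging inheritance rule are all assumptions chosen for demonstration. The key point is that a fraction of particles inherit an estimated fitness instead of triggering a real evaluation.

```python
# Illustrative sketch only: Fitness Inheritance PSO (FI-PSO) on a toy sphere
# function. All parameters and the inheritance rule are assumptions for
# demonstration, not the method from the paper.
import random

def sphere(x):
    """Toy objective standing in for a costly real-robot evaluation."""
    return sum(v * v for v in x)

def fi_pso(dim=3, n_particles=10, iters=30, p_inherit=0.5, seed=0):
    rng = random.Random(seed)
    evals = 0
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    fit = []
    for p in pos:                          # initial swarm: always evaluated
        fit.append(sphere(p)); evals += 1
    best = list(fit)
    best_pos = [list(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best[i])   # global best index
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.4 * r1 * (best_pos[i][d] - pos[i][d])
                             + 1.4 * r2 * (best_pos[g][d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            evaluated = rng.random() >= p_inherit
            if evaluated:
                fit[i] = sphere(pos[i]); evals += 1
            else:
                # Inherit an estimate from personal and global bests,
                # skipping a costly real evaluation.
                fit[i] = 0.5 * (best[i] + best[g])
            # Only real evaluations may update the bests, so the result
            # returned below is a genuinely evaluated point.
            if evaluated and fit[i] < best[i]:
                best[i] = fit[i]; best_pos[i] = list(pos[i])
                if best[i] < best[g]:
                    g = i
    return best[g], evals

best_f, n_evals = fi_pso()
full_cost = 10 + 10 * 30  # evaluations a naive PSO would need
print(f"best fitness: {best_f:.4f}, real evaluations: {n_evals} of {full_cost}")
```

With roughly half the particles inheriting per iteration, the real-evaluation count drops to about half of the naive PSO budget, which is the tractability gain the paper pursues for on-robot evaluation.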
Related papers
- Autonomous Vehicle Controllers From End-to-End Differentiable Simulation [60.05963742334746]
We propose a differentiable simulator and design an analytic policy gradients (APG) approach to training AV controllers.
Our proposed framework brings the differentiable simulator into an end-to-end training loop, where gradients of environment dynamics serve as a useful prior to help the agent learn a more grounded policy.
We find significant improvements in performance and robustness to noise in the dynamics, as well as overall more intuitive human-like handling.
arXiv Detail & Related papers (2024-09-12T11:50:06Z) - Bidirectional Decoding: Improving Action Chunking via Closed-Loop Resampling [51.38330727868982]
Bidirectional Decoding (BID) is a test-time inference algorithm that bridges action chunking with closed-loop operations.
We show that BID boosts the performance of two state-of-the-art generative policies across seven simulation benchmarks and two real-world tasks.
arXiv Detail & Related papers (2024-08-30T15:39:34Z) - REBEL: A Regularization-Based Solution for Reward Overoptimization in Robotic Reinforcement Learning from Human Feedback [61.54791065013767]
A misalignment between the reward function and user intentions, values, or social norms can be catastrophic in the real world.
Current methods to mitigate this misalignment work by learning reward functions from human preferences.
We propose a novel concept of reward regularization within the robotic RLHF framework.
arXiv Detail & Related papers (2023-12-22T04:56:37Z) - Real-time Trajectory-based Social Group Detection [22.86110112028644]
We propose a simple and efficient framework for social group detection.
Our approach explores the impact of motion trajectory on social grouping and utilizes a novel, reliable, and fast data-driven method.
Our experiments on the popular JRDB-Act dataset reveal noticeable improvements in performance, with relative improvements ranging from 2% to 11%.
arXiv Detail & Related papers (2023-04-12T08:01:43Z) - Re-Evaluating LiDAR Scene Flow for Autonomous Driving [80.37947791534985]
Popular benchmarks for self-supervised LiDAR scene flow have unrealistic rates of dynamic motion, unrealistic correspondences, and unrealistic sampling patterns.
We evaluate a suite of top methods on a suite of real-world datasets.
We show that despite the emphasis placed on learning, most performance gains are caused by pre- and post-processing steps.
arXiv Detail & Related papers (2023-04-04T22:45:50Z) - Obstacle Avoidance for Robotic Manipulator in Joint Space via Improved
Proximal Policy Optimization [6.067589886362815]
In this paper, we train a deep neural network via an improved Proximal Policy Optimization (PPO) algorithm to map from task space to joint space for a 6-DoF manipulator.
Since training such a task on a real robot is time-consuming and strenuous, we develop a simulation environment to train the model.
Experimental results showed that using our method, the robot was capable of tracking a single target or reaching multiple targets in unstructured environments.
arXiv Detail & Related papers (2022-10-03T10:21:57Z) - Evolving Pareto-Optimal Actor-Critic Algorithms for Generalizability and
Stability [67.8426046908398]
Generalizability and stability are two key objectives for operating reinforcement learning (RL) agents in the real world.
This paper presents MetaPG, an evolutionary method for automated design of actor-critic loss functions.
arXiv Detail & Related papers (2022-04-08T20:46:16Z) - Benchmarking Deep Reinforcement Learning Algorithms for Vision-based
Robotics [11.225021326001778]
This paper presents a benchmarking study of some of the state-of-the-art reinforcement learning algorithms used for solving two vision-based robotics problems.
The performance of these algorithms is compared in two PyBullet simulation environments, KukaDiverseObjectEnv and RacecarZEDGymEnv.
arXiv Detail & Related papers (2022-01-11T22:45:25Z) - Off Environment Evaluation Using Convex Risk Minimization [0.0]
We propose a convex risk minimization algorithm to estimate the model mismatch between the simulator and the target domain.
We show that this estimator can be used along with the simulator to evaluate the performance of an RL agent in the target domain.
arXiv Detail & Related papers (2021-12-21T21:31:54Z) - Nonprehensile Riemannian Motion Predictive Control [57.295751294224765]
We introduce a novel Real-to-Sim reward analysis technique to reliably imagine and predict the outcome of taking possible actions for a real robotic platform.
We produce a closed-loop controller to reactively push objects in a continuous action space.
We observe that RMPC is robust in cluttered as well as occluded environments and outperforms the baselines.
arXiv Detail & Related papers (2021-11-15T18:50:04Z) - Robotic Grasp Manipulation Using Evolutionary Computing and Deep
Reinforcement Learning [0.0]
Humans know almost immediately how to manipulate objects for grasping, thanks to years of learning.
In this paper we take up the challenge of developing learning-based pose estimation by decomposing the problem into both position and orientation learning.
Based on our proposed architectures and algorithms, the robot is capable of grasping all rigid body objects having regular shapes.
arXiv Detail & Related papers (2020-01-15T17:23:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.