Integrated Benchmarking and Design for Reproducible and Accessible
Evaluation of Robotic Agents
- URL: http://arxiv.org/abs/2009.04362v1
- Date: Wed, 9 Sep 2020 15:31:29 GMT
- Title: Integrated Benchmarking and Design for Reproducible and Accessible
Evaluation of Robotic Agents
- Authors: Jacopo Tani and Andrea F. Daniele and Gianmarco Bernasconi and Amaury
Camus and Aleksandar Petrov and Anthony Courchesne and Bhairav Mehta and
Rohit Suri and Tomasz Zaluska and Matthew R. Walter and Emilio Frazzoli and
Liam Paull and Andrea Censi
- Abstract summary: We describe a new concept for reproducible robotics research that integrates development and benchmarking.
One of the central components of this setup is the Duckietown Autolab, a standardized setup that is itself relatively low-cost and reproducible.
We validate the system by analyzing the repeatability of experiments conducted using the infrastructure and show that there is low variance across different robot hardware and across different remote labs.
- Score: 61.36681529571202
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As robotics matures and increases in complexity, it is more necessary than
ever that robot autonomy research be reproducible. Compared to other sciences,
there are specific challenges to benchmarking autonomy, such as the complexity
of the software stacks, the variability of the hardware and the reliance on
data-driven techniques, amongst others. In this paper, we describe a new
concept for reproducible robotics research that integrates development and
benchmarking, so that reproducibility is obtained "by design" from the
beginning of the research/development processes. We first provide the overall
conceptual objectives to achieve this goal and then a concrete instance that we
have built: the DUCKIENet. One of the central components of this setup is the
Duckietown Autolab, a remotely accessible standardized setup that is itself
also relatively low-cost and reproducible. When evaluating agents, careful
definition of interfaces allows users to choose among local versus remote
evaluation using simulation, logs, or remote automated hardware setups. We
validate the system by analyzing the repeatability of experiments conducted
using the infrastructure and show that there is low variance across different
robot hardware and across different remote labs.
Related papers
- Learning Object Properties Using Robot Proprioception via Differentiable Robot-Object Interaction [52.12746368727368]
Differentiable simulation has become a powerful tool for system identification.
Our approach calibrates object properties by using information from the robot, without relying on data from the object itself.
We demonstrate the effectiveness of our method on a low-cost robotic platform.
arXiv Detail & Related papers (2024-10-04T20:48:38Z)
- Tiny Robotics Dataset and Benchmark for Continual Object Detection [6.4036245876073234]
This work introduces a novel benchmark to evaluate the continual learning capabilities of object detection systems in tiny robotic platforms.
Our contributions include: (i) Tiny Robotics Object Detection (TiROD), a comprehensive dataset collected using a small mobile robot, designed to test the adaptability of object detectors across various domains and classes; (ii) an evaluation of state-of-the-art real-time object detectors combined with different continual learning strategies on this dataset; and (iii) the public release of the data and code needed to replicate the results, to foster continued progress in this field.
arXiv Detail & Related papers (2024-09-24T16:21:27Z)
- Generalized Robot Learning Framework [10.03174544844559]
We present a low-cost robot learning framework that is both easily reproducible and transferable to various robots and environments.
We demonstrate that deployable imitation learning can be successfully applied even to industrial-grade robots.
arXiv Detail & Related papers (2024-09-18T15:34:31Z)
- RoboScript: Code Generation for Free-Form Manipulation Tasks across Real and Simulation [77.41969287400977]
This paper presents RoboScript, a platform for a deployable robot manipulation pipeline powered by code generation.
We also present a code generation benchmark for robot manipulation tasks specified in free-form natural language.
We demonstrate the adaptability of our code generation framework across multiple robot embodiments, including the Franka and UR5 robot arms.
arXiv Detail & Related papers (2024-02-22T15:12:00Z)
- Toward General-Purpose Robots via Foundation Models: A Survey and Meta-Analysis [82.59451639072073]
General-purpose robots operate seamlessly in any environment, with any object, and utilize various skills to complete diverse tasks.
As a community, we have been constraining most robotic systems by designing them for specific tasks, training them on specific datasets, and deploying them within specific environments.
Motivated by the impressive open-set performance and content generation capabilities of web-scale, large-capacity pre-trained models, we devote this survey to exploring how foundation models can be applied to general-purpose robotics.
arXiv Detail & Related papers (2023-12-14T10:02:55Z)
- SCENEREPLICA: Benchmarking Real-World Robot Manipulation by Creating Replicable Scenes [5.80109297939618]
We present a new reproducible benchmark for evaluating robot manipulation in the real world, specifically focusing on pick-and-place.
Our benchmark uses the YCB objects, a commonly used dataset in the robotics community, to ensure that our results are comparable to other studies.
arXiv Detail & Related papers (2023-06-27T16:59:15Z)
- PACT: Perception-Action Causal Transformer for Autoregressive Robotics Pre-Training [25.50131893785007]
This work introduces a paradigm for pre-training a general purpose representation that can serve as a starting point for multiple tasks on a given robot.
We present the Perception-Action Causal Transformer (PACT), a generative transformer-based architecture that aims to build representations directly from robot data in a self-supervised fashion.
We show that finetuning small task-specific networks on top of the larger pretrained model results in significantly better performance compared to training a single model from scratch for all tasks simultaneously.
arXiv Detail & Related papers (2022-09-22T16:20:17Z)
- SIERRA: A Modular Framework for Research Automation [5.220940151628734]
We present SIERRA, a novel framework for accelerating research developments and improving results.
SIERRA makes it easy to quickly specify the independent variable(s) for an experiment, generate experimental inputs, automatically run the experiment, and process the results to generate deliverables such as graphs and videos.
It employs a deeply modular approach that allows easy customization and extension of automation for the needs of individual researchers.
arXiv Detail & Related papers (2022-03-03T23:45:46Z)
- MetaGraspNet: A Large-Scale Benchmark Dataset for Vision-driven Robotic Grasping via Physics-based Metaverse Synthesis [78.26022688167133]
We present a large-scale benchmark dataset for vision-driven robotic grasping via physics-based metaverse synthesis.
The proposed dataset contains 100,000 images and 25 different object types.
We also propose a new layout-weighted performance metric alongside the dataset for evaluating object detection and segmentation performance.
arXiv Detail & Related papers (2021-12-29T17:23:24Z)
- Scalable Multi-Task Imitation Learning with Autonomous Improvement [159.9406205002599]
We build an imitation learning system that can continuously improve through autonomous data collection.
We leverage the robot's own trials as demonstrations for tasks other than the one that the robot actually attempted.
In contrast to prior imitation learning approaches, our method can autonomously collect data with sparse supervision for continuous improvement.
arXiv Detail & Related papers (2020-02-25T18:56:42Z)