SIERRA: A Modular Framework for Research Automation and Reproducibility
- URL: http://arxiv.org/abs/2208.07805v1
- Date: Tue, 16 Aug 2022 15:36:34 GMT
- Title: SIERRA: A Modular Framework for Research Automation and Reproducibility
- Authors: John Harwell, Maria Gini
- Abstract summary: We present SIERRA, a novel framework for accelerating research development and improving the reproducibility of results.
SIERRA accelerates research by automating the process of generating executable experiments from queries over independent variables.
It employs a modular architecture enabling easy customization and extension for the needs of individual researchers.
- Score: 6.1678491628787455
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Modern intelligent systems researchers form hypotheses about system behavior
and then run experiments using one or more independent variables to test their
hypotheses. We present SIERRA, a novel framework structured around that idea
for accelerating research development and improving reproducibility of results.
SIERRA accelerates research by automating the process of generating executable
experiments from queries over independent variable(s), executing those experiments,
and processing the results to generate deliverables such as graphs and videos.
It shifts the paradigm for testing hypotheses from procedural ("Do these steps
to answer the query") to declarative ("Here is the query to test--GO!"),
reducing the burden on researchers. It employs a modular architecture enabling
easy customization and extension for the needs of individual researchers,
thereby eliminating manual configuration and processing via throw-away scripts.
SIERRA improves reproducibility of research by providing automation independent
of the execution environment (HPC hardware, real robots, etc.) and targeted
platform (arbitrary simulator or real robots). This enables exact experiment
replication, up to the limit of the execution environment and platform, as well
as making it easy for researchers to test hypotheses in different computational
environments.
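To make the procedural-to-declarative shift concrete, here is a minimal sketch of the idea (hypothetical code, not SIERRA's actual API): a declarative query over the independent variable(s) is expanded into the cross product of concrete, executable experiment configurations, which a platform plugin could then execute in any environment.

```python
from itertools import product

# Hypothetical illustration of declarative experiment generation;
# names and structure are illustrative, not SIERRA's real API.

def generate_experiments(query: dict) -> list[dict]:
    """Expand a query over independent variables into the cross
    product of concrete, executable experiment configurations."""
    names = list(query)
    return [dict(zip(names, values)) for values in product(*query.values())]

# Declarative: state WHAT to test, not HOW to run it.
query = {
    "population_size": [16, 32, 64],   # independent variable 1
    "arena_dimension": [10, 20],       # independent variable 2
}

for experiment in generate_experiments(query):
    # A platform plugin (simulator, real robots, HPC, ...) would
    # consume each configuration here; we just print it.
    print(experiment)
```

In this style the researcher only declares the variables to test; expansion, execution, and result processing are the framework's job.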
Related papers
- MLR-Copilot: Autonomous Machine Learning Research based on Large Language Models Agents [10.86017322488788]
We present a new systematic framework, autonomous Machine Learning Research with large language models (MLR-Copilot).
It is designed to enhance machine learning research productivity through the automatic generation and implementation of research ideas using Large Language Model (LLM) agents.
We evaluate our framework on five machine learning research tasks, and the experimental results show its potential to facilitate research progress and innovation.
arXiv Detail & Related papers (2024-08-26T05:55:48Z)
- Automatic benchmarking of large multimodal models via iterative experiment programming [71.78089106671581]
We present APEx, the first framework for automatic benchmarking of LMMs.
Given a research question expressed in natural language, APEx leverages a large language model (LLM) and a library of pre-specified tools to generate a set of experiments for the model at hand.
APEx progressively compiles its findings into a scientific report, and that report drives the testing procedure: based on the current status of the investigation, APEx chooses which experiments to perform and whether the results are sufficient to draw conclusions.
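The iterative loop this entry describes can be sketched as follows; the "LLM" and "tool library" are replaced by toy stubs so the loop itself runs, and none of these names come from the paper.

```python
import random

# Hypothetical sketch of APEx-style iterative experiment programming.
# All functions are placeholder stubs, not the paper's actual API.

def llm_choose_experiment(question, report):
    # Stub: a real system would prompt an LLM with the report so far.
    return f"experiment-{len(report) + 1}"

def run_experiment(experiment):
    # Stub: a real system would dispatch to a library of evaluation tools.
    return random.random()

def is_conclusive(report):
    # Stub: a real system would ask the LLM whether evidence suffices.
    return len(report) >= 3

def investigate(question, max_rounds=10):
    report = []  # the growing report drives which experiment comes next
    for _ in range(max_rounds):
        experiment = llm_choose_experiment(question, report)
        report.append({"experiment": experiment, "result": run_experiment(experiment)})
        if is_conclusive(report):
            break
    return report

print(investigate("Does the model handle negation?"))
```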
arXiv Detail & Related papers (2024-06-18T06:43:46Z)
- System for systematic literature review using multiple AI agents: Concept and an empirical evaluation [5.194208843843004]
We introduce a novel multi-AI agent model designed to fully automate the process of conducting Systematic Literature Reviews.
The model operates through a user-friendly interface where researchers input their topic.
It generates a search string used to retrieve relevant academic papers.
The model then autonomously summarizes the abstracts of these papers.
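The pipeline this entry describes (topic in; search string; retrieval; abstract summaries out) can be sketched as a chain of stub agents. Every name below is a hypothetical stand-in, not the paper's implementation.

```python
# Hypothetical sketch of the multi-agent SLR pipeline described above:
# topic -> search string -> retrieved papers -> abstract summaries.
# All agents are toy stubs.

def search_string_agent(topic: str) -> str:
    # Stub: a real agent would prompt an LLM to build a Boolean query.
    return f'("{topic}") AND ("systematic review" OR "survey")'

def retrieval_agent(search_string: str) -> list[str]:
    # Stub: a real agent would query an academic database API.
    return [f"abstract of paper {i} matching {search_string!r}" for i in range(3)]

def summarization_agent(abstract: str) -> str:
    # Stub: a real agent would call an LLM to summarize.
    return abstract[:40] + "..."

def run_slr(topic: str) -> list[str]:
    query = search_string_agent(topic)
    return [summarization_agent(a) for a in retrieval_agent(query)]

print(run_slr("research automation"))
```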
arXiv Detail & Related papers (2024-03-13T10:27:52Z)
- MLXP: A Framework for Conducting Replicable Experiments in Python [63.37350735954699]
We propose MLXP, an open-source, simple, and lightweight experiment management tool based on Python.
It streamlines the experimental process with minimal practitioner overhead while ensuring a high level of reproducibility.
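As a rough illustration of what such a lightweight experiment manager automates (per-run directories, config capture, seeding, result logging), here is a generic sketch; it is not MLXP's actual interface.

```python
import json, pathlib, random, time

# Generic sketch of lightweight experiment management: each run gets
# its own directory with the config and results saved alongside it.
# This is an illustration of the concept, not MLXP's real API.

def launch(config: dict, runs_root: str = "runs") -> dict:
    run_dir = pathlib.Path(runs_root) / time.strftime("%Y%m%d-%H%M%S")
    run_dir.mkdir(parents=True, exist_ok=True)
    (run_dir / "config.json").write_text(json.dumps(config, indent=2))

    random.seed(config["seed"])            # reproducibility: fix the seed
    result = {"score": random.random()}    # stand-in for the real experiment

    (run_dir / "result.json").write_text(json.dumps(result, indent=2))
    return result

print(launch({"seed": 0, "lr": 1e-3}))
```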
arXiv Detail & Related papers (2024-02-21T14:22:20Z)
- A Backend Platform for Supporting the Reproducibility of Computational Experiments [2.1485350418225244]
It is challenging to recreate the same environment using the same frameworks, code, data sources, programming languages, dependencies, and so on.
In this work, we propose an Integrated Development Environment allowing the sharing, configuration, packaging, and execution of experiments.
In an evaluation on 25 experiments extracted from published papers, we were able to successfully reproduce 20 (80%) of them, achieving the results reported in those works with minimal effort.
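What packaging an experiment for later reproduction can involve is illustrated by the following sketch, which records the interpreter version and frozen dependencies alongside an entry point; the manifest format is invented for illustration, not the platform's actual one.

```python
import json, platform, subprocess, sys

# Illustrative sketch of capturing an experiment's environment so it
# can be recreated later. The manifest format is invented.

def package_experiment(entry_point: str, manifest_path: str = "experiment.json") -> dict:
    frozen = subprocess.run(
        [sys.executable, "-m", "pip", "freeze"],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()
    manifest = {
        "entry_point": entry_point,           # how to run the experiment
        "python": platform.python_version(),  # interpreter to recreate
        "dependencies": frozen,               # exact package versions
    }
    with open(manifest_path, "w") as f:
        json.dump(manifest, f, indent=2)
    return manifest

package_experiment("python train.py --config config.yaml")
```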
arXiv Detail & Related papers (2023-06-29T10:29:11Z)
- SIERRA: A Modular Framework for Research Automation [5.220940151628734]
We present SIERRA, a novel framework for accelerating research development and improving the reproducibility of results.
SIERRA makes it easy to quickly specify the independent variable(s) for an experiment, generate experimental inputs, automatically run the experiment, and process the results to generate deliverables such as graphs and videos.
It employs a deeply modular approach that allows easy customization and extension of automation for the needs of individual researchers.
arXiv Detail & Related papers (2022-03-03T23:45:46Z)
- A User's Guide to Calibrating Robotics Simulators [54.85241102329546]
This paper proposes a set of benchmarks and a framework for studying various algorithms aimed at transferring models and policies learnt in simulation to the real world.
We conduct experiments on a wide range of well-known simulated environments to characterize and offer insights into the performance of different algorithms.
Our analysis can be useful for practitioners working in this area and can help make informed choices about the behavior and main properties of sim-to-real algorithms.
arXiv Detail & Related papers (2020-11-17T22:24:26Z)
- Learning Discrete Energy-based Models via Auxiliary-variable Local Exploration [130.89746032163106]
We propose ALOE, a new algorithm for learning conditional and unconditional EBMs for discrete structured data.
We show that the energy function and sampler can be trained efficiently via a new variational form of power iteration.
We present an energy-model-guided fuzzer for software testing that achieves performance comparable to well-engineered fuzzing engines like libFuzzer.
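As a rough illustration of the discrete-EBM sampling setting this entry refers to, here is a generic single-bit-flip Metropolis sampler with local exploration; it illustrates the problem setup only, not ALOE's variational power iteration.

```python
import math, random

# Generic sketch: sampling from a discrete (binary-vector) energy-based
# model via local exploration with single-bit-flip Metropolis moves.
# A toy illustration, not ALOE's actual learning algorithm.

def energy(x: list[int]) -> float:
    # Toy energy: prefers vectors whose bits sum to half the length.
    return (sum(x) - len(x) / 2) ** 2

def sample(n_bits: int = 8, steps: int = 1000) -> list[int]:
    x = [random.randint(0, 1) for _ in range(n_bits)]
    for _ in range(steps):
        i = random.randrange(n_bits)
        y = x.copy()
        y[i] ^= 1  # local move: flip one bit
        # Accept with Metropolis probability exp(E(x) - E(y)).
        if random.random() < math.exp(min(0.0, energy(x) - energy(y))):
            x = y
    return x

print(sample())
```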
arXiv Detail & Related papers (2020-11-10T19:31:29Z)
- Rearrangement: A Challenge for Embodied AI [229.8891614821016]
We describe a framework for research and evaluation in Embodied AI.
Our proposal is based on a canonical task: Rearrangement.
We present experimental testbeds of rearrangement scenarios in four different simulation environments.
arXiv Detail & Related papers (2020-11-03T19:42:32Z)
- Integrated Benchmarking and Design for Reproducible and Accessible Evaluation of Robotic Agents [61.36681529571202]
We describe a new concept for reproducible robotics research that integrates development and benchmarking.
One of the central components of this setup is the Duckietown Autolab, a standardized setup that is itself relatively low-cost and reproducible.
We validate the system by analyzing the repeatability of experiments conducted using the infrastructure and show that there is low variance across different robot hardware and across different remote labs.
arXiv Detail & Related papers (2020-09-09T15:31:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.