Lifelong Learning Metrics
- URL: http://arxiv.org/abs/2201.08278v1
- Date: Thu, 20 Jan 2022 16:29:14 GMT
- Title: Lifelong Learning Metrics
- Authors: Alexander New and Megan Baker and Eric Nguyen and Gautam Vallabha
- Abstract summary: The DARPA Lifelong Learning Machines (L2M) program seeks to yield advances in artificial intelligence (AI) systems.
This document outlines a formalism for constructing and characterizing the performance of agents performing lifelong learning scenarios.
- Score: 63.8376359764052
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The DARPA Lifelong Learning Machines (L2M) program seeks to yield advances in
artificial intelligence (AI) systems so that they are capable of learning (and
improving) continuously, leveraging data on one task to improve performance on
another, and doing so in a computationally sustainable way. Performers on this
program developed systems capable of performing a diverse range of functions,
including autonomous driving, real-time strategy, and drone simulation. These
systems featured a diverse range of characteristics (e.g., task structure,
lifetime duration), and an immediate challenge faced by the program's testing
and evaluation team was measuring system performance across these different
settings. This document, developed in close collaboration with DARPA and the
program performers, outlines a formalism for constructing and characterizing
the performance of agents performing lifelong learning scenarios.
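The formalism itself is not reproduced in the abstract. As a rough, illustrative sketch of the kind of quantities such a framework standardizes, the snippet below computes two metrics common in continual-learning evaluation, performance maintenance (forgetting) and forward transfer, from a matrix of per-task scores logged over a lifetime. The matrix layout, function names, and toy numbers are assumptions for illustration, not the L2M program's exact definitions.

```python
import numpy as np

def performance_maintenance(evals: np.ndarray) -> float:
    """Mean change in each task's score between the evaluation right
    after that task was trained and the final evaluation of the
    lifetime (negative values indicate forgetting).

    evals[i, j] = score on task j, evaluated after training block i;
    training block i trains task i, so rows and columns both index tasks.
    """
    num_tasks = evals.shape[0]
    # The last task is excluded: its post-training and final scores coincide.
    drops = [evals[-1, j] - evals[j, j] for j in range(num_tasks - 1)]
    return float(np.mean(drops))

def forward_transfer(evals: np.ndarray, scratch: np.ndarray) -> float:
    """Mean score on each task *before* it is trained, relative to an
    assumed single-task ("from scratch") baseline score for that task."""
    num_tasks = evals.shape[0]
    gains = [evals[j - 1, j] - scratch[j] for j in range(1, num_tasks)]
    return float(np.mean(gains))

# Toy lifetime over 3 tasks; row i holds evaluations after training block i.
scores = np.array([
    [0.80, 0.20, 0.10],  # after training task 0
    [0.75, 0.85, 0.30],  # after training task 1
    [0.70, 0.80, 0.90],  # after training task 2
])
scratch_baselines = np.array([0.80, 0.15, 0.25])  # assumed single-task scores

print(f"performance maintenance: {performance_maintenance(scores):+.3f}")
print(f"forward transfer:        {forward_transfer(scores, scratch_baselines):+.3f}")
```

Under these toy numbers, maintenance is -0.075 (mild forgetting of earlier tasks) and forward transfer is +0.050 (earlier training slightly helps untrained tasks); the same bookkeeping extends to related quantities such as backward transfer or sample efficiency.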
Related papers
- A Comparison of Prompt Engineering Techniques for Task Planning and Execution in Service Robotics [16.064583670720587]
We compare prompt engineering techniques and combinations thereof within the application of high-level task planning and execution in service robotics.
We define a diverse set of tasks and a simple set of functionalities in simulation, and measure task completion accuracy and execution time for several state-of-the-art models.
arXiv Detail & Related papers (2024-10-30T13:22:55Z)
- Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for Autonomous Real-World Reinforcement Learning [58.3994826169858]
We introduce RoboFuME, a reset-free fine-tuning system for robotic reinforcement learning.
Our insights are to utilize offline reinforcement learning techniques to ensure efficient online fine-tuning of a pre-trained policy.
Our method can incorporate data from an existing robot dataset and improve on a target task within as little as 3 hours of autonomous real-world experience.
arXiv Detail & Related papers (2023-10-23T17:50:08Z)
- Empowering Private Tutoring by Chaining Large Language Models [87.76985829144834]
This work explores the development of a full-fledged intelligent tutoring system powered by state-of-the-art large language models (LLMs).
The system is divided into three interconnected core processes: interaction, reflection, and reaction.
Each process is implemented by chaining LLM-powered tools along with dynamically updated memory modules.
arXiv Detail & Related papers (2023-09-15T02:42:03Z)
- Reinforcement Learning in Robotic Motion Planning by Combined Experience-based Planning and Self-Imitation Learning [7.919213739992465]
High-quality and representative data is essential for both Imitation Learning (IL)- and Reinforcement Learning (RL)-based motion planning tasks.
We propose self-imitation learning by planning plus (SILP+) algorithm, which embeds experience-based planning into the learning architecture.
Experimental results show that SILP+ achieves better training efficiency and a higher, more stable success rate in complex motion planning tasks.
arXiv Detail & Related papers (2023-06-11T19:47:46Z)
- A Domain-Agnostic Approach for Characterization of Lifelong Learning Systems [128.63953314853327]
"Lifelong Learning" systems are capable of 1) Continuous Learning, 2) Transfer and Adaptation, and 3) Scalability.
We show that this suite of metrics can inform the development of varied and complex Lifelong Learning systems.
arXiv Detail & Related papers (2023-01-18T21:58:54Z)
- MT-Opt: Continuous Multi-Task Robotic Reinforcement Learning at Scale [103.7609761511652]
We show how a large-scale collective robotic learning system can acquire a repertoire of behaviors simultaneously.
New tasks can be continuously instantiated from previously learned tasks.
We train and evaluate our system on a set of 12 real-world tasks with data collected from 7 robots.
arXiv Detail & Related papers (2021-04-16T16:38:02Z)
- Guiding Robot Exploration in Reinforcement Learning via Automated Planning [6.075903612065429]
Reinforcement learning (RL) enables an agent to learn from trial-and-error experiences toward achieving long-term goals, while automated planning aims to compute plans for accomplishing tasks using action knowledge.
We develop Guided Dyna-Q (GDQ) to enable RL agents to reason with action knowledge to avoid exploring less-relevant states.
arXiv Detail & Related papers (2020-04-23T21:03:30Z)
- Scalable Multi-Task Imitation Learning with Autonomous Improvement [159.9406205002599]
We build an imitation learning system that can continuously improve through autonomous data collection.
We leverage the robot's own trials as demonstrations for tasks other than the one that the robot actually attempted.
In contrast to prior imitation learning approaches, our method can autonomously collect data with sparse supervision for continuous improvement.
arXiv Detail & Related papers (2020-02-25T18:56:42Z)
- Leveraging Rationales to Improve Human Task Performance [15.785125079811902]
Given that a computational system's performance exceeds that of its human user, can explainable AI capabilities be leveraged to improve the human's performance?
We introduce the Rationale-Generating Algorithm, an automated technique for generating rationales for utility-based computational methods.
Results show that our approach produces rationales that lead to statistically significant improvement in human task performance.
arXiv Detail & Related papers (2020-02-11T04:51:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.