Differential Assessment of Black-Box AI Agents
- URL: http://arxiv.org/abs/2203.13236v1
- Date: Thu, 24 Mar 2022 17:48:58 GMT
- Title: Differential Assessment of Black-Box AI Agents
- Authors: Rashmeet Kaur Nayyar, Pulkit Verma, Siddharth Srivastava
- Abstract summary: We propose a novel approach to differentially assess black-box AI agents that have drifted from their previously known models.
We leverage sparse observations of the drifted agent's current behavior and knowledge of its initial model to generate an active querying policy.
Empirical evaluation shows that our approach is much more efficient than re-learning the agent model from scratch.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Much of the research on learning symbolic models of AI agents focuses on
agents with stationary models. This assumption fails to hold in settings where
the agent's capabilities may change as a result of learning, adaptation, or
other post-deployment modifications. Efficient assessment of agents in such
settings is critical for learning the true capabilities of an AI system and for
ensuring its safe usage. In this work, we propose a novel approach to
differentially assess black-box AI agents that have drifted from their
previously known models. As a starting point, we consider the fully observable
and deterministic setting. We leverage sparse observations of the drifted
agent's current behavior and knowledge of its initial model to generate an
active querying policy that selectively queries the agent and computes an
updated model of its functionality. Empirical evaluation shows that our
approach is much more efficient than re-learning the agent model from scratch.
We also show that the cost of differential assessment using our method is
proportional to the amount of drift in the agent's functionality.
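
To make the idea of differential assessment concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a hypothetical STRIPS-like model format (preconditions, add effects, delete effects per action), a hypothetical `query_agent(state, action)` interface that returns the next state, and a deliberately crude rule for re-learning an action from a fixed set of probe states. It only illustrates the general pattern described in the abstract: use sparse observations to flag actions that contradict the known initial model, then query the agent about those actions alone.

```python
from typing import Callable, Dict, FrozenSet, List, Set, Tuple

State = FrozenSet[str]
# Hypothetical model format: action -> (preconditions, add effects, delete effects)
Model = Dict[str, Tuple[Set[str], Set[str], Set[str]]]
Observation = Tuple[State, str, State]  # (state before, action, state after)


def predict(model: Model, state: State, action: str) -> State:
    """Next state under the model; the state is unchanged if preconditions fail."""
    pre, add, delete = model[action]
    if not pre <= state:
        return state
    return frozenset((state - delete) | add)


def flag_drifted_actions(model: Model, observations: List[Observation]) -> Set[str]:
    """Actions with at least one observed transition that contradicts the model."""
    return {a for (s, a, s2) in observations if predict(model, s, a) != s2}


def differential_assess(
    model: Model,
    observations: List[Observation],
    query_agent: Callable[[State, str], State],
    probe_states: List[State],
) -> Tuple[Model, int]:
    """Re-learn only the flagged actions; return the updated model and query count."""
    updated = dict(model)
    queries = 0
    for action in sorted(flag_drifted_actions(model, observations)):
        add: Set[str] = set()
        delete: Set[str] = set()
        changed: List[State] = []
        for state in probe_states:
            nxt = query_agent(state, action)
            queries += 1
            if nxt != state:
                add |= nxt - state
                delete |= state - nxt
                changed.append(state)
        if changed:
            # Crude assumption: preconditions are the facts shared by every
            # probe state in which the action changed something.
            pre = set(frozenset.intersection(*changed))
            updated[action] = (pre, add, delete)
    return updated, queries


# Toy example: "pick" has drifted and now also adds the fact "tired".
old_model: Model = {
    "pick": ({"hand_empty"}, {"holding"}, {"hand_empty"}),
    "drop": ({"holding"}, {"hand_empty"}, {"holding"}),
}
true_model: Model = {
    "pick": ({"hand_empty"}, {"holding", "tired"}, {"hand_empty"}),
    "drop": ({"holding"}, {"hand_empty"}, {"holding"}),
}


def drifted_agent(state: State, action: str) -> State:
    """Stand-in for the black-box agent, simulated here from true_model."""
    return predict(true_model, state, action)


observations: List[Observation] = [
    (frozenset({"hand_empty"}), "pick", frozenset({"holding", "tired"})),
    (frozenset({"holding"}), "drop", frozenset({"hand_empty"})),
]
probe_states: List[State] = [frozenset({"hand_empty"}), frozenset({"holding"})]

updated, n_queries = differential_assess(
    old_model, observations, drifted_agent, probe_states
)
print(updated["pick"], n_queries)
```

In this toy setting only `pick` is flagged, so the agent is queried twice, whereas probing both actions in both states would take four queries. The number of queries grows with the number of flagged actions, which loosely mirrors the abstract's claim that assessment cost is proportional to the amount of drift.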