Asking the Right Questions: Learning Interpretable Action Models Through
Query Answering
- URL: http://arxiv.org/abs/1912.12613v6
- Date: Fri, 9 Apr 2021 16:17:14 GMT
- Authors: Pulkit Verma, Shashank Rao Marpally, Siddharth Srivastava
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper develops a new approach for estimating an interpretable,
relational model of a black-box autonomous agent that can plan and act. Our
main contributions are a new paradigm for estimating such models using a
minimal query interface with the agent, and a hierarchical querying algorithm
that generates an interrogation policy for estimating the agent's internal
model in a vocabulary provided by the user. Empirical evaluation of our
approach shows that despite the intractable search space of possible agent
models, our approach allows correct and scalable estimation of interpretable
agent models for a wide class of black-box autonomous agents. Our results also
show that this approach can use predicate classifiers to learn interpretable
models of planning agents that represent states as images.
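The querying paradigm can be illustrated with a toy, single-action domain: the learner poses queries to the black-box agent ("does this action apply in this state, and what results?") and prunes every candidate (precondition, effect) model whose prediction disagrees with the agent's answer. The sketch below is illustrative only; the agent, vocabulary, and exhaustive pruning loop are assumptions for exposition, not the paper's hierarchical querying algorithm.

```python
from itertools import combinations, product

# Toy black-box agent for a single action "unlock": internally it
# requires "has_key" and adds "door_open". The learner never sees
# this model directly; it can only query the agent.
TRUE_PRE = frozenset({"has_key"})
TRUE_EFF = frozenset({"door_open"})

def agent_answer(state):
    """Query interface: does "unlock" apply in `state`, and if so,
    which state results? Returns None when it is inapplicable."""
    if not TRUE_PRE <= state:
        return None
    return state | TRUE_EFF

# Candidate models: every (precondition, effect) pair over a small
# user-provided vocabulary of predicates.
PREDICATES = ("has_key", "door_open")

def all_subsets(xs):
    return [frozenset(c) for r in range(len(xs) + 1)
            for c in combinations(xs, r)]

candidates = list(product(all_subsets(PREDICATES), all_subsets(PREDICATES)))

# Interrogation loop: pose a query for each state and discard every
# candidate whose predicted answer disagrees with the agent's.
for state in all_subsets(PREDICATES):
    answer = agent_answer(state)
    candidates = [
        (pre, eff) for (pre, eff) in candidates
        if ((state | eff) if pre <= state else None) == answer
    ]

# Only models behaviorally equivalent to the agent's true model remain.
print(sorted((sorted(p), sorted(e)) for p, e in candidates))
```

Note that more than one candidate can survive: an effect that is already guaranteed by the precondition is behaviorally indistinguishable, which is why the output is a small equivalence class rather than a single model.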
Related papers
- Selecting Interpretability Techniques for Healthcare Machine Learning models
Healthcare increasingly pursues interpretable algorithms to assist professionals in a range of decision scenarios.
We overview a selection of eight such algorithms, both post-hoc and model-based, that can be used for these purposes.
(arXiv, 2024-06-14)
- Meaning Representations from Trajectories in Autoregressive Models
We propose to extract meaning representations from autoregressive language models by considering the distribution of all possible trajectories extending an input text.
This strategy is prompt-free, does not require fine-tuning, and is applicable to any pre-trained autoregressive model.
We empirically show that the representations obtained from large models align well with human annotations, outperform other zero-shot and prompt-free methods on semantic similarity tasks, and can be used to solve more complex entailment and containment tasks that standard embeddings cannot handle.
(arXiv, 2023-10-23)
- Pseudointelligence: A Unifying Framework for Language Model Evaluation
We propose a complexity-theoretic framework of model evaluation cast as a dynamic interaction between a model and a learned evaluator.
We demonstrate that this framework can be used to reason about two case studies in language model evaluation, as well as analyze existing evaluation methods.
(arXiv, 2023-10-18)
- How to Estimate Model Transferability of Pre-Trained Speech Models?
We propose a "score-based assessment" framework for estimating the transferability of pre-trained speech models (PSMs).
We leverage two representation theories, Bayesian likelihood estimation and optimal transport, to generate rank scores for the PSM candidates.
Our framework efficiently computes transferability scores without actual fine-tuning of candidate models or layers.
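As an illustrative sketch only (the model names and feature values below are invented, and the paper's actual scores combine Bayesian likelihood estimation with optimal transport over speech representations), the core idea of ranking candidates without fine-tuning can be shown with an empirical 1-D Wasserstein distance between each model's features and the target domain's:

```python
def wasserstein_1d(xs, ys):
    # For equal-length 1-D samples, the empirical W1 distance is the
    # mean absolute difference between the sorted samples.
    xs, ys = sorted(xs), sorted(ys)
    return sum(abs(a - b) for a, b in zip(xs, ys)) / len(xs)

# Hypothetical per-example scalar features from three candidate models
# on the same probe set, plus features from the target-domain data.
target = [0.1, 0.4, 0.35, 0.8, 0.55]
candidates = {
    "psm_a": [0.12, 0.38, 0.33, 0.79, 0.5],
    "psm_b": [0.9, 0.85, 0.7, 0.95, 0.8],
    "psm_c": [0.2, 0.5, 0.4, 0.6, 0.45],
}

# Lower transport cost -> representations closer to the target domain,
# so no fine-tuning of any candidate is needed to produce a ranking.
ranking = sorted(candidates, key=lambda m: wasserstein_1d(candidates[m], target))
print(ranking)
```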
(arXiv, 2023-06-01)
- Differential Assessment of Black-Box AI Agents
We propose a novel approach to differentially assess black-box AI agents that have drifted from their previously known models.
We leverage sparse observations of the drifted agent's current behavior and knowledge of its initial model to generate an active querying policy.
Empirical evaluation shows that our approach is much more efficient than re-learning the agent model from scratch.
(arXiv, 2022-03-24)
- Evaluating Bayesian Model Visualisations
Probabilistic models inform an increasingly broad range of business and policy decisions ultimately made by people.
Recent algorithmic, computational, and software framework development progress facilitate the proliferation of Bayesian probabilistic models.
While they can empower decision makers to explore complex queries and to perform what-if-style conditioning in theory, suitable visualisations and interactive tools are needed to maximise users' comprehension and rational decision making under uncertainty.
arXiv Detail & Related papers (2022-01-10T19:15:39Z) - Explain, Edit, and Understand: Rethinking User Study Design for
Evaluating Model Explanations [97.91630330328815]
We conduct a crowdsourcing study, where participants interact with deception detection models that have been trained to distinguish between genuine and fake hotel reviews.
We observe that for a linear bag-of-words model, participants with access to the feature coefficients during training are able to cause a larger reduction in model confidence in the testing phase when compared to the no-explanation control.
(arXiv, 2021-12-17)
- Validation and Inference of Agent Based Models
Agent Based Modelling (ABM) is a computational framework for simulating the behaviours and interactions of autonomous agents.
Recent research in Approximate Bayesian Computation (ABC) has yielded increasingly efficient algorithms for calculating the approximate likelihood.
These are investigated and compared using a pedestrian model in the Hamilton CBD.
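As a hedged illustration of the ABC idea (using a toy random-walk agent model rather than the paper's pedestrian model; all parameters below are invented), rejection sampling keeps only parameter draws whose simulated summary statistic falls near the observed one:

```python
import random

random.seed(0)

# Toy agent-based model: agents each step right with probability p,
# otherwise left; we observe only the mean final position.
def simulate(p, n_agents=100, n_steps=30):
    total = 0
    for _ in range(n_agents):
        pos = 0
        for _ in range(n_steps):
            pos += 1 if random.random() < p else -1
        total += pos
    return total / n_agents  # summary statistic

# "Observed" data generated with a hidden true parameter.
TRUE_P = 0.7
observed = simulate(TRUE_P)

# ABC rejection sampling: draw p from the prior, run the simulator,
# and accept draws whose summary lands within `eps` of the observation.
def abc_rejection(observed, n_draws=1000, eps=1.5):
    accepted = []
    for _ in range(n_draws):
        p = random.random()  # Uniform(0, 1) prior
        if abs(simulate(p) - observed) <= eps:
            accepted.append(p)
    return accepted

posterior = abc_rejection(observed)
estimate = sum(posterior) / len(posterior)
print(f"approximate posterior mean of p: {estimate:.2f}")
```

The accepted draws approximate the posterior over the model parameter without ever computing the (intractable) likelihood of the agent-based simulation.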
(arXiv, 2021-07-08)
- On the model-based stochastic value gradient for continuous reinforcement learning
We show that simple model-based agents can outperform state-of-the-art model-free agents in terms of both sample-efficiency and final reward.
Our findings suggest that model-based policy evaluation deserves closer attention.
(arXiv, 2020-08-28)
- Agent Modelling under Partial Observability for Deep Reinforcement Learning
Existing methods for agent modelling assume knowledge of the local observations and chosen actions of the modelled agents during execution.
We learn to extract representations about the modelled agents conditioned only on the local observations of the controlled agent.
The representations are used to augment the controlled agent's decision policy which is trained via deep reinforcement learning.
(arXiv, 2020-06-16)
- Plausible Counterfactuals: Auditing Deep Learning Classifiers with Realistic Adversarial Examples
The black-box nature of Deep Learning models has posed unanswered questions about what they learn from data.
A Generative Adversarial Network (GAN) and multi-objective optimization are used to furnish a plausible attack against the audited model.
Its utility is showcased within a human face classification task, unveiling the enormous potential of the proposed framework.
(arXiv, 2020-03-25)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.