Pretrained Embeddings as a Behavior Specification Mechanism
- URL: http://arxiv.org/abs/2503.02012v2
- Date: Thu, 06 Mar 2025 14:32:23 GMT
- Title: Pretrained Embeddings as a Behavior Specification Mechanism
- Authors: Parv Kapoor, Abigail Hammer, Ashish Kapoor, Karen Leung, Eunsuk Kang,
- Abstract summary: We introduce embeddings as a first-class construct in a specification language.<n>We show that embedding-based specifications can be used to steer a system towards desirable behaviors.
- Score: 17.86560073144763
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We propose an approach to formally specifying the behavioral properties of systems that rely on a perception model for interactions with the physical world. The key idea is to introduce embeddings -- mathematical representations of a real-world concept -- as a first-class construct in a specification language, where properties are expressed in terms of distances between a pair of ideal and observed embeddings. To realize this approach, we propose a new type of temporal logic called Embedding Temporal Logic (ETL), and describe how it can be used to express a wider range of properties about AI-enabled systems than previously possible. We demonstrate the applicability of ETL through a preliminary evaluation involving planning tasks in robots that are driven by foundation models; the results are promising, showing that embedding-based specifications can be used to steer a system towards desirable behaviors.
Related papers
- Sparks of Explainability: Recent Advancements in Explaining Large Vision Models [6.1642231492615345]
This thesis explores advanced approaches to improve explainability in computer vision by analyzing and modeling the features exploited by deep neural networks.<n>It evaluates attribution methods, notably saliency maps, by introducing a metric based on algorithmic stability and an approach utilizing Sobol indices.<n>Two hypotheses are examined: aligning models with human reasoning and adopting a conceptual explainability approach.
arXiv Detail & Related papers (2025-02-03T04:49:32Z) - ECATS: Explainable-by-design concept-based anomaly detection for time series [0.5956301166481089]
We propose ECATS, a concept-based neuro-symbolic architecture where concepts are represented as Signal Temporal Logic (STL) formulae.
We show that our model is able to achieve great classification performance while ensuring local interpretability.
arXiv Detail & Related papers (2024-05-17T08:12:53Z) - Towards a General Framework for Continual Learning with Pre-training [55.88910947643436]
We present a general framework for continual learning of sequentially arrived tasks with the use of pre-training.
We decompose its objective into three hierarchical components, including within-task prediction, task-identity inference, and task-adaptive prediction.
We propose an innovative approach to explicitly optimize these components with parameter-efficient fine-tuning (PEFT) techniques and representation statistics.
arXiv Detail & Related papers (2023-10-21T02:03:38Z) - Compositional Probabilistic and Causal Inference using Tractable Circuit
Models [20.07977560803858]
We introduce md-vtrees, a novel structural formulation of (marginal) determinism in structured decomposable PCs.
We derive the first polytime algorithms for causal inference queries such as backdoor adjustment on PCs.
arXiv Detail & Related papers (2023-04-17T13:48:16Z) - Planning for Learning Object Properties [117.27898922118946]
We formalize the problem of automatically training a neural network to recognize object properties as a symbolic planning problem.
We use planning techniques to produce a strategy for automating the training dataset creation and the learning process.
We provide an experimental evaluation in both a simulated and a real environment.
arXiv Detail & Related papers (2023-01-15T09:37:55Z) - MACE: An Efficient Model-Agnostic Framework for Counterfactual
Explanation [132.77005365032468]
We propose a novel framework of Model-Agnostic Counterfactual Explanation (MACE)
In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity.
Experiments on public datasets validate the effectiveness with better validity, sparsity and proximity.
arXiv Detail & Related papers (2022-05-31T04:57:06Z) - Counterfactual Explanations as Interventions in Latent Space [62.997667081978825]
Counterfactual explanations aim to provide to end users a set of features that need to be changed in order to achieve a desired outcome.
Current approaches rarely take into account the feasibility of actions needed to achieve the proposed explanations.
We present Counterfactual Explanations as Interventions in Latent Space (CEILS), a methodology to generate counterfactual explanations.
arXiv Detail & Related papers (2021-06-14T20:48:48Z) - A Diagnostic Study of Explainability Techniques for Text Classification [52.879658637466605]
We develop a list of diagnostic properties for evaluating existing explainability techniques.
We compare the saliency scores assigned by the explainability techniques with human annotations of salient input regions to find relations between a model's performance and the agreement of its rationales with human ones.
arXiv Detail & Related papers (2020-09-25T12:01:53Z) - Closed-Form Factorization of Latent Semantics in GANs [65.42778970898534]
A rich set of interpretable dimensions has been shown to emerge in the latent space of the Generative Adversarial Networks (GANs) trained for synthesizing images.
In this work, we examine the internal representation learned by GANs to reveal the underlying variation factors in an unsupervised manner.
We propose a closed-form factorization algorithm for latent semantic discovery by directly decomposing the pre-trained weights.
arXiv Detail & Related papers (2020-07-13T18:05:36Z) - Characterizing an Analogical Concept Memory for Architectures
Implementing the Common Model of Cognition [1.468003557277553]
We propose a new analogical concept memory for Soar that augments its current system of declarative long-term memories.
We demonstrate that the analogical learning methods implemented in the proposed memory can quickly learn a diverse types of novel concepts.
arXiv Detail & Related papers (2020-06-02T21:54:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.