Related papers: Multimodal Pretrained Models for Verifiable Sequential Decision-Making: Planning, Grounding, and Perception

Multimodal Pretrained Models for Verifiable Sequential Decision-Making: Planning, Grounding, and Perception

URL: http://arxiv.org/abs/2308.05295v2
Date: Mon, 17 Jun 2024 19:09:17 GMT
Title: Multimodal Pretrained Models for Verifiable Sequential Decision-Making: Planning, Grounding, and Perception
Authors: Yunhao Yang, Cyrus Neary, Ufuk Topcu,
Abstract summary: Recently developed pretrained models can encode rich world knowledge expressed in multiple modalities, such as text and images. We develop an algorithm that utilizes the knowledge from pretrained models to construct and verify controllers for sequential decision-making tasks. We demonstrate the algorithm's ability to construct, verify, and ground automaton-based controllers through a suite of real-world tasks.
Score: 20.12750360095627
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recently developed pretrained models can encode rich world knowledge expressed in multiple modalities, such as text and images. However, the outputs of these models cannot be integrated into algorithms to solve sequential decision-making tasks. We develop an algorithm that utilizes the knowledge from pretrained models to construct and verify controllers for sequential decision-making tasks, and to ground these controllers to task environments through visual observations with formal guarantees. In particular, the algorithm queries a pretrained model with a user-provided, text-based task description and uses the model's output to construct an automaton-based controller that encodes the model's task-relevant knowledge. It allows formal verification of whether the knowledge encoded in the controller is consistent with other independently available knowledge, which may include abstract information on the environment or user-provided specifications. Next, the algorithm leverages the vision and language capabilities of pretrained models to link the observations from the task environment to the text-based control logic from the controller (e.g., actions and conditions that trigger the actions). We propose a mechanism to provide probabilistic guarantees on whether the controller satisfies the user-provided specifications under perceptual uncertainties. We demonstrate the algorithm's ability to construct, verify, and ground automaton-based controllers through a suite of real-world tasks, including daily life and robot manipulation tasks.

Related papers

Hybrid Reasoning for Perception, Explanation, and Autonomous Action in Manufacturing [0.0]
CIPHER is a vision-language-action (VLA) model framework aiming to replicate human-like reasoning for industrial control.<n>It integrates a process expert, a regression model enabling quantitative characterization of system states.<n>It interprets visual or textual inputs from process monitoring, explains its decisions, and autonomously generates precise machine instructions.
arXiv Detail & Related papers (2025-06-10T05:37:33Z)
Inductive Learning of Robot Task Knowledge from Raw Data and Online Expert Feedback [3.10979520014442]
An increasing level of autonomy of robots poses challenges of trust and social acceptance, especially in human-robot interaction scenarios. This requires an interpretable implementation of robotic cognitive capabilities, possibly based on formal methods as logics for the definition of task specifications. We propose an offline algorithm based on inductive logic programming from noisy examples to extract task specifications.
arXiv Detail & Related papers (2025-01-13T17:25:46Z)
Learning Predictive Checklists with Probabilistic Logic Programming [17.360186431981592]
We propose a novel method for learning predictive checklists from diverse data modalities, such as images and time series. Our approach relies on probabilistic logic programming, a learning paradigm that enables matching the discrete nature of checklist with continuous-valued data. We demonstrate that our method outperforms various explainable machine learning techniques on prediction tasks involving image sequences, time series, and clinical notes.
arXiv Detail & Related papers (2024-11-25T09:07:19Z)
The Foundations of Computational Management: A Systematic Approach to Task Automation for the Integration of Artificial Intelligence into Existing Workflows [55.2480439325792]
This article introduces Computational Management, a systematic approach to task automation. The article offers three easy step-by-step procedures to begin the process of implementing AI within a workflow.
arXiv Detail & Related papers (2024-02-07T01:45:14Z)
Automated Process Planning Based on a Semantic Capability Model and SMT [50.76251195257306]
In research of manufacturing systems and autonomous robots, the term capability is used for a machine-interpretable specification of a system function. We present an approach that combines these two topics: starting from a semantic capability model, an AI planning problem is automatically generated.
arXiv Detail & Related papers (2023-12-14T10:37:34Z)
Code Models are Zero-shot Precondition Reasoners [83.8561159080672]
We use code representations to reason about action preconditions for sequential decision making tasks. We propose a precondition-aware action sampling strategy that ensures actions predicted by a policy are consistent with preconditions.
arXiv Detail & Related papers (2023-11-16T06:19:27Z)
Fine-Tuning Language Models Using Formal Methods Feedback [53.24085794087253]
We present a fully automated approach to fine-tune pre-trained language models for applications in autonomous systems. The method synthesizes automaton-based controllers from pre-trained models guided by natural language task descriptions. The results indicate an improvement in percentage of specifications satisfied by the controller from 60% to 90%.
arXiv Detail & Related papers (2023-10-27T16:24:24Z)
A General Framework for Verification and Control of Dynamical Models via Certificate Synthesis [54.959571890098786]
We provide a framework to encode system specifications and define corresponding certificates. We present an automated approach to formally synthesise controllers and certificates. Our approach contributes to the broad field of safe learning for control, exploiting the flexibility of neural networks.
arXiv Detail & Related papers (2023-09-12T09:37:26Z)
ControlVAE: Model-Based Learning of Generative Controllers for Physics-Based Characters [28.446959320429656]
We introduce ControlVAE, a model-based framework for learning generative motion control policies based on variational autoencoders (VAE) Our framework can learn a rich and flexible latent representation of skills and a skill-conditioned generative control policy from a diverse set of unorganized motion sequences. We demonstrate the effectiveness of ControlVAE using a diverse set of tasks, which allows realistic and interactive control of the simulated characters.
arXiv Detail & Related papers (2022-10-12T10:11:36Z)
Counterfactual Explanation and Causal Inference in Service of Robustness in Robot Control [15.104159722499366]
We propose an architecture for training generative models of counterfactual conditionals of the form, 'can we modify event A to cause B instead of C?' In contrast to conventional control design approaches, where robustness is quantified in terms of the ability to reject noise, we explore the space of counterfactuals that might cause a certain requirement to be violated.
arXiv Detail & Related papers (2020-09-18T14:22:47Z)
Goal-Aware Prediction: Learning to Model What Matters [105.43098326577434]
One of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model and that of the downstream planner or policy. We propose to direct prediction towards task relevant information, enabling the model to be aware of the current task and encouraging it to only model relevant quantities of the state space. We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
arXiv Detail & Related papers (2020-07-14T16:42:59Z)

This list is automatically generated from the titles and abstracts of the papers in this site.