PROMISE: A Framework for Developing Complex Conversational Interactions (Technical Report)
- URL: http://arxiv.org/abs/2312.03699v3
- Date: Mon, 8 Apr 2024 13:32:31 GMT
- Title: PROMISE: A Framework for Developing Complex Conversational Interactions (Technical Report)
- Authors: Wenyuan Wu, Jasmin Heierli, Max Meisterhans, Adrian Moser, Andri Färber, Mateusz Dolata, Elena Gavagnin, Alexandre de Spindler, Gerhard Schwabe
- Abstract summary: We present PROMISE, a framework that facilitates the development of complex language-based interactions with information systems.
We show the benefits of PROMISE in the context of application scenarios within health information systems and demonstrate its ability to handle complex interactions.
- Score: 33.7054351451505
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The advent of increasingly powerful language models has raised expectations for language-based interactions. However, controlling these models is a challenge, emphasizing the need to be able to investigate the feasibility and value of their application. We present PROMISE, a framework that facilitates the development of complex language-based interactions with information systems. Its use of state machine modeling concepts enables model-driven, dynamic prompt orchestration across hierarchically nested states and transitions. This improves the control of the behavior of language models and thus enables their effective and efficient use. In this technical report we show the benefits of PROMISE in the context of application scenarios within health information systems and demonstrate its ability to handle complex interactions. We also include code examples and present default user interfaces available as part of PROMISE.
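The abstract's idea of prompt orchestration across hierarchically nested states and transitions can be illustrated with a minimal sketch. This is not PROMISE's actual API: the names `State`, `Transition`, `Conversation`, and the stubbed `llm` callable are all hypothetical, and the keyword-based trigger is a stand-in assumption for whatever decision mechanism a real framework would use.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Hypothetical stand-in for a language model call: (prompt, utterance) -> reply.
LLM = Callable[[str, str], str]


@dataclass
class Transition:
    trigger: Callable[[str], bool]  # decides whether this transition fires
    target: "State"


@dataclass
class State:
    name: str
    prompt: str
    parent: Optional["State"] = None
    transitions: List[Transition] = field(default_factory=list)

    def composed_prompt(self) -> str:
        """Join prompts from the outermost enclosing state down to this one."""
        parts = []
        node = self
        while node is not None:
            parts.append(node.prompt)
            node = node.parent
        return "\n".join(reversed(parts))


class Conversation:
    def __init__(self, initial: State, llm: LLM):
        self.current = initial
        self.llm = llm

    def respond(self, utterance: str) -> str:
        # Check transitions on the current state first, then on enclosing states.
        node = self.current
        fired = False
        while node is not None and not fired:
            for transition in node.transitions:
                if transition.trigger(utterance):
                    self.current = transition.target
                    fired = True
                    break
            node = node.parent
        # The prompt sent to the model depends on the (possibly new) state.
        return self.llm(self.current.composed_prompt(), utterance)


# Illustrative states for a health-information dialogue (assumed, not from the paper).
root = State("assistant", "You are a health information assistant.")
ask = State("ask_symptoms", "Ask the user about their symptoms.", parent=root)
advise = State("advise", "Give general, non-diagnostic advice.", parent=root)
ask.transitions.append(Transition(lambda u: "done" in u.lower(), advise))

conversation = Conversation(ask, llm=lambda prompt, utterance: prompt)
```

Because the stubbed `llm` simply returns the prompt it receives, calling `conversation.respond(...)` shows which composed prompt would be sent to a model in each state. The outer state's prompt is shared by every nested state, so behaviour common to the whole interaction is written once.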
Related papers
- LangSuitE: Planning, Controlling and Interacting with Large Language Models in Embodied Text Environments [70.91258869156353]
We introduce LangSuitE, a versatile and simulation-free testbed featuring 6 representative embodied tasks in textual embodied worlds.
Compared with previous LLM-based testbeds, LangSuitE offers adaptability to diverse environments without multiple simulation engines.
We devise a novel chain-of-thought (CoT) schema, EmMem, which summarizes embodied states w.r.t. history information.
arXiv Detail & Related papers (2024-06-24T03:36:29Z)
- A Framework to Model ML Engineering Processes [1.9744907811058787]
Development of Machine Learning (ML) based systems is complex and requires multidisciplinary teams with diverse skill sets.
Current process modeling languages are not suitable for describing the development of such systems.
We introduce a framework for modeling ML-based software development processes, built around a domain-specific language.
arXiv Detail & Related papers (2024-04-29T09:17:36Z)
- LVLM-Interpret: An Interpretability Tool for Large Vision-Language Models [50.259006481656094]
We present a novel interactive application aimed towards understanding the internal mechanisms of large vision-language models.
Our interface is designed to enhance the interpretability of the image patches, which are instrumental in generating an answer.
We present a case study of how our application can aid in understanding failure mechanisms in a popular large multi-modal model: LLaVA.
arXiv Detail & Related papers (2024-04-03T23:57:34Z)
- Large Language User Interfaces: Voice Interactive User Interfaces powered by LLMs [5.06113628525842]
We present a framework that can serve as an intermediary between a user and their user interface (UI).
We employ a system that stands upon textual semantic mappings of UI components, in the form of annotations.
Our engine can classify the most appropriate application, extract relevant parameters, and subsequently execute precise predictions of the user's expected actions.
arXiv Detail & Related papers (2024-02-07T21:08:49Z)
- MEIA: Multimodal Embodied Perception and Interaction in Unknown Environments [82.67236400004826]
We introduce the Multimodal Embodied Interactive Agent (MEIA), capable of translating high-level tasks expressed in natural language into a sequence of executable actions.
The MEM module enables MEIA to generate executable action plans based on diverse requirements and the robot's capabilities.
arXiv Detail & Related papers (2024-02-01T02:43:20Z)
- Towards More Unified In-context Visual Understanding [74.55332581979292]
We present a new ICL framework for visual understanding with multi-modal output enabled.
First, we quantize and embed both text and visual prompt into a unified representational space.
Then a decoder-only sparse transformer architecture is employed to perform generative modeling on them.
arXiv Detail & Related papers (2023-12-05T06:02:21Z)
- Prompt-to-OS (P2OS): Revolutionizing Operating Systems and Human-Computer Interaction with Integrated AI Generative Models [10.892991111926573]
We present a paradigm for human-computer interaction that revolutionizes the traditional notion of an operating system.
Within this innovative framework, user requests issued to the machine are handled by an interconnected ecosystem of generative AI models.
This visionary concept raises significant challenges, including privacy, security, trustability, and the ethical use of generative models.
arXiv Detail & Related papers (2023-10-07T17:16:34Z)
- 'What are you referring to?' Evaluating the Ability of Multi-Modal Dialogue Models to Process Clarificational Exchanges [65.03196674816772]
Referential ambiguities arise in dialogue when a referring expression does not uniquely identify the intended referent for the addressee.
Addressees usually detect such ambiguities immediately and work with the speaker to repair them using meta-communicative Clarification Exchanges (CE): a Clarification Request (CR) and a response.
Here, we argue that the ability to generate and respond to CRs imposes specific constraints on the architecture and objective functions of multi-modal, visually grounded dialogue models.
arXiv Detail & Related papers (2023-07-28T13:44:33Z)
- Interactive Natural Language Processing [67.87925315773924]
Interactive Natural Language Processing (iNLP) has emerged as a novel paradigm within the field of NLP.
This paper offers a comprehensive survey of iNLP, starting by proposing a unified definition and framework of the concept.
arXiv Detail & Related papers (2023-05-22T17:18:29Z)
- Decoupled Context Processing for Context Augmented Language Modeling [33.89636308731306]
Language models can be augmented with a context retriever to incorporate knowledge from large external databases.
By leveraging retrieved context, the neural network does not have to memorize the massive amount of world knowledge within its internal parameters, leading to better efficiency, interpretability and modularity.
arXiv Detail & Related papers (2022-10-11T20:05:09Z)