Towards a DSL to Formalize Multimodal Requirements
- URL: http://arxiv.org/abs/2508.14631v1
- Date: Wed, 20 Aug 2025 11:40:33 GMT
- Title: Towards a DSL to Formalize Multimodal Requirements
- Authors: Marcos Gomez-Vazquez, Jordi Cabot,
- Abstract summary: Multimodal systems are becoming increasingly prevalent in software systems, enabled by the advancements in Machine Learning.<n>This triggers the need to easily define the requirements linked to these new types of user interactions, potentially involving more than one modality at the same time.<n>This remains an open challenge due to the lack of languages and methods adapted to the diverse nature of multimodal interactions, with the risk of implementing AI-enhanced systems that do not properly satisfy the user needs.
- Score: 1.961305559606562
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Multimodal systems, which process multiple input types such as text, audio, and images, are becoming increasingly prevalent in software systems, enabled by the huge advancements in Machine Learning. This triggers the need to easily define the requirements linked to these new types of user interactions, potentially involving more than one modality at the same time. This remains an open challenge due to the lack of languages and methods adapted to the diverse nature of multimodal interactions, with the risk of implementing AI-enhanced systems that do not properly satisfy the user needs. In this sense, this paper presents MERLAN, a Domain-Specific Language (DSL) to specify the requirements for these new types of multimodal interfaces. We present the metamodel for such language together with a textual syntax implemented as an ANTLR grammar. A prototype tool enabling requirements engineers to write such requirements and automatically generate a possible implementation of a system compliant with them on top of an agentic framework is also provided.
Related papers
- REprompt: Prompt Generation for Intelligent Software Development Guided by Requirements Engineering [43.0808976544794]
Large language models increasingly function as foundation models within coding agents.<n> prompts play a central role in agent-based intelligent software development.<n>We propose REprompt, a multi-agent prompt optimization framework guided by requirements engineering.
arXiv Detail & Related papers (2026-01-23T07:14:34Z) - Generative Interfaces for Language Models [70.25765232527762]
We propose a paradigm in which large language models (LLMs) respond to user queries by proactively generating user interfaces (UIs)<n>Our framework leverages structured interface-specific representations and iterative refinements to translate user queries into task-specific UIs.<n>Results show that generative interfaces consistently outperform conversational ones, with up to a 72% improvement in human preference.
arXiv Detail & Related papers (2025-08-26T17:43:20Z) - Guided Tensor Lifting [54.10411390218929]
Domain-specific languages (s) for machine learning are revolutionizing the speed and efficiency of machine learning workloads.<n>To take advantage of these capabilities, a user must first translate their legacy code from the language it is currently written in, into the new DSL.<n>Process of automatically lifting code into these DSLs has been identified by several recent works, which propose program synthesis as a solution.
arXiv Detail & Related papers (2025-04-28T12:00:10Z) - A New Paradigm of User-Centric Wireless Communication Driven by Large Language Models [53.16213723669751]
Next generation of wireless communications seeks to deeply integrate artificial intelligence with user-centric communication networks.<n>We propose a novel paradigm for wireless communication that innovatively incorporates the nature language to structured query language.<n>We present a prototype system in which a dynamic semantic representation network at the physical layer adapts its encoding depth to meet user requirements.
arXiv Detail & Related papers (2025-04-16T01:43:36Z) - A Framework to Model ML Engineering Processes [1.9744907811058787]
Development of Machine Learning (ML) based systems is complex and requires multidisciplinary teams with diverse skill sets.
Current process modeling languages are not suitable for describing the development of such systems.
We introduce a framework for modeling ML-based software development processes, built around a domain-specific language.
arXiv Detail & Related papers (2024-04-29T09:17:36Z) - Dialogue-based generation of self-driving simulation scenarios using
Large Language Models [14.86435467709869]
Simulation is an invaluable tool for developing and evaluating controllers for self-driving cars.
Current simulation frameworks are driven by highly-specialist domain specific languages.
There is often a gap between a concise English utterance and the executable code that captures the user's intent.
arXiv Detail & Related papers (2023-10-26T13:07:01Z) - TextBind: Multi-turn Interleaved Multimodal Instruction-following in the Wild [102.93338424976959]
We introduce TextBind, an almost annotation-free framework for empowering larger language models with the multi-turn interleaved instruction-following capabilities.
Our approach requires only image-caption pairs and generates multi-turn multimodal instruction-response conversations from a language model.
To accommodate interleaved image-text inputs and outputs, we devise MIM, a language model-centric architecture that seamlessly integrates image encoder and decoder models.
arXiv Detail & Related papers (2023-09-14T15:34:01Z) - PADL: Language-Directed Physics-Based Character Control [66.517142635815]
We present PADL, which allows users to issue natural language commands for specifying high-level tasks and low-level skills that a character should perform.
We show that our framework can be applied to effectively direct a simulated humanoid character to perform a diverse array of complex motor skills.
arXiv Detail & Related papers (2023-01-31T18:59:22Z) - Prompting Is Programming: A Query Language for Large Language Models [5.8010446129208155]
We present the novel idea of Language Model Programming (LMP)
LMP generalizes language model prompting from pure text prompts to an intuitive combination of text prompting and scripting.
We show that LMQL can capture a wide range of state-of-the-art prompting methods in an intuitive way.
arXiv Detail & Related papers (2022-12-12T18:09:09Z) - Technical Report on Neural Language Models and Few-Shot Learning for
Systematic Requirements Processing in MDSE [1.6286277560322266]
This paper is based on the analysis of an open-source set of automotive requirements.
We derive domain-specific language constructs helping us to avoid ambiguities in requirements and increase the level of formality.
arXiv Detail & Related papers (2022-11-16T18:06:25Z) - VX2TEXT: End-to-End Learning of Video-Based Text Generation From
Multimodal Inputs [103.99315770490163]
We present a framework for text generation from multimodal inputs consisting of video plus text, speech, or audio.
Experiments demonstrate that our approach based on a single architecture outperforms the state-of-the-art on three video-based text-generation tasks.
arXiv Detail & Related papers (2021-01-28T15:22:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.