Multi-Faceted Evaluation of Modeling Languages for Augmented Reality Applications -- The Case of ARWFML
- URL: http://arxiv.org/abs/2408.14137v1
- Date: Mon, 26 Aug 2024 09:34:36 GMT
- Title: Multi-Faceted Evaluation of Modeling Languages for Augmented Reality Applications -- The Case of ARWFML
- Authors: Fabian Muff, Hans-Georg Fill
- Abstract summary: The Augmented Reality Modeling Language (ARWFML) enables the model-based creation of augmented reality scenarios without programming knowledge.
This paper presents two further design iterations for refining the language based on multi-faceted evaluations.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The evaluation of modeling languages for augmented reality applications poses particular challenges due to the three-dimensional environment they target. The previously introduced Augmented Reality Workflow Modeling Language (ARWFML) enables the model-based creation of augmented reality scenarios without programming knowledge. Building upon the first design cycle of the language's specification, this paper presents two further design iterations for refining the language based on multi-faceted evaluations. These include a comparative evaluation of implementation options and workflow capabilities, the introduction of a 3D notation, and the development of a new 3D modeling environment. On this basis, a comprehensibility study of the language was conducted. Thereby, we show how modeling languages for augmented reality can be evolved towards a maturity level suitable for empirical evaluations.
Related papers
- M2AR: A Web-based Modeling Environment for the Augmented Reality Workflow Modeling Language
M2AR is a new web-based, two- and three-dimensional modeling environment that enables the modeling and execution of augmented reality applications without requiring programming knowledge.
The platform is based on a 3D JavaScript library and the mixed reality immersive web standard WebXR.
arXiv Detail & Related papers (2024-10-04T07:52:46Z)
- LangSuitE: Planning, Controlling and Interacting with Large Language Models in Embodied Text Environments
We introduce LangSuitE, a versatile and simulation-free testbed featuring 6 representative embodied tasks in textual embodied worlds.
Compared with previous LLM-based testbeds, LangSuitE offers adaptability to diverse environments without multiple simulation engines.
We devise a novel chain-of-thought (CoT) schema, EmMem, which summarizes embodied states with respect to history information.
arXiv Detail & Related papers (2024-06-24T03:36:29Z)
- Lessons from the Trenches on Reproducible Evaluation of Language Models
We draw on three years of experience in evaluating large language models to provide guidance and lessons for researchers.
We present the Language Model Evaluation Harness (lm-eval), an open-source library for independent, reproducible, and extensible evaluation of language models.
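For illustration, a minimal programmatic run with lm-eval might look like the sketch below, assuming the harness's v0.4+ Python API; the checkpoint and task names here are placeholders, not the paper's experimental setup:

```python
# Minimal sketch of running lm-eval programmatically (assumes lm-evaluation-harness v0.4+).
# The model checkpoint and task below are illustrative placeholders.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                      # Hugging Face backend
    model_args="pretrained=EleutherAI/pythia-160m",  # any HF causal LM checkpoint
    tasks=["hellaswag"],                             # any registered benchmark task
    num_fewshot=0,
)
print(results["results"]["hellaswag"])               # per-task metric dictionary
```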
arXiv Detail & Related papers (2024-05-23T16:50:49Z)
- GlórIA - A Generative and Open Large Language Model for Portuguese
We introduce GlórIA, a robust European Portuguese decoder LLM.
To pre-train GlórIA, we assembled a comprehensive PT-PT text corpus comprising 35 billion tokens from various sources.
Evaluation shows that GlórIA significantly outperforms existing open PT decoder models in language modeling.
arXiv Detail & Related papers (2024-02-20T12:36:40Z)
- Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models
Visually-conditioned language models (VLMs) have seen growing adoption in applications such as visual dialogue, scene understanding, and robotic task planning.
Despite the volume of new releases, key design decisions around image preprocessing, architecture, and optimization are under-explored.
arXiv Detail & Related papers (2024-02-12T18:21:14Z)
- L2CEval: Evaluating Language-to-Code Generation Capabilities of Large Language Models
We present L2CEval, a systematic evaluation of the language-to-code generation capabilities of large language models (LLMs).
We analyze the factors that potentially affect their performance, such as model size, pretraining data, instruction tuning, and different prompting methods.
In addition to assessing model performance, we measure confidence calibration for the models and conduct human evaluations of the output programs.
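For context, a common way to quantify confidence calibration is the expected calibration error (ECE); the sketch below shows the generic metric, which is not necessarily the exact protocol used in L2CEval:

```python
# Expected Calibration Error (ECE): a standard calibration metric.
# Generic sketch; not necessarily the exact protocol used in L2CEval.
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Bin predictions by confidence; average |accuracy - mean confidence|
    per bin, weighted by the fraction of samples in that bin."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - confidences[mask].mean())
    return ece

# Example: well-calibrated predictions yield a small ECE.
print(expected_calibration_error([0.9, 0.8, 0.6, 0.3], [1, 1, 1, 0]))
```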
arXiv Detail & Related papers (2023-09-29T17:57:00Z)
- ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented Visual Models
We build ELEVATER, the first benchmark to compare and evaluate pre-trained language-augmented visual models.
It consists of 20 image classification datasets and 35 object detection datasets, each of which is augmented with external knowledge.
We will release our toolkit and evaluation platforms for the research community.
arXiv Detail & Related papers (2022-04-19T10:23:42Z)
- From Natural Language to Simulations: Applying GPT-3 Codex to Automate Simulation Modeling of Logistics Systems
This work is the first attempt to apply Natural Language Processing to automate the development of simulation models of systems vitally important for logistics.
We demonstrated that the framework built on top of the fine-tuned GPT-3 Codex, a Transformer-based language model, could produce functionally valid simulations of queuing and inventory control systems given a verbal description.
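To give a flavor of the target output, a queuing model of the kind such a framework generates could look like the following SimPy sketch; this is an illustrative M/M/1 queue, not the paper's actual generated code:

```python
# Illustrative M/M/1 queuing simulation in SimPy -- the kind of model a
# text-to-simulation framework might generate; not the paper's actual output.
import random
import simpy

def customer(env, server, service_rate, waits):
    arrival = env.now
    with server.request() as req:          # join the queue
        yield req                          # wait until the server is free
        waits.append(env.now - arrival)    # record waiting time
        yield env.timeout(random.expovariate(service_rate))  # service time

def arrivals(env, server, arrival_rate, service_rate, waits):
    while True:
        yield env.timeout(random.expovariate(arrival_rate))  # Poisson arrivals
        env.process(customer(env, server, service_rate, waits))

waits = []
env = simpy.Environment()
server = simpy.Resource(env, capacity=1)   # single server
env.process(arrivals(env, server, arrival_rate=1.0, service_rate=1.5, waits=waits))
env.run(until=1000)
print(f"mean wait: {sum(waits) / len(waits):.2f}")
```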
arXiv Detail & Related papers (2022-02-24T14:01:50Z)
- Pre-Trained Language Models for Interactive Decision-Making
We describe a framework for imitation learning in which goals and observations are represented as a sequence of embeddings.
We demonstrate that this framework enables effective generalization across different environments.
For test tasks involving novel goals or novel scenes, initializing policies with language models improves task completion rates by 43.6%.
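One plausible reading of this setup, sketched with standard PyTorch and Hugging Face transformers components, is a policy whose backbone is initialized from a pre-trained LM and whose observations are projected into the LM's embedding space; the class and projection layers below are hypothetical, not the authors' code:

```python
# Hypothetical sketch of a policy initialized from a pre-trained LM, with
# observations mapped into the LM's embedding space; not the authors' code.
import torch
import torch.nn as nn
from transformers import GPT2Model

class LMInitializedPolicy(nn.Module):
    def __init__(self, obs_dim: int, num_actions: int):
        super().__init__()
        self.backbone = GPT2Model.from_pretrained("gpt2")   # pre-trained LM weights
        hidden = self.backbone.config.n_embd
        self.obs_proj = nn.Linear(obs_dim, hidden)          # observations -> embeddings
        self.action_head = nn.Linear(hidden, num_actions)   # action logits

    def forward(self, obs_seq: torch.Tensor) -> torch.Tensor:
        # obs_seq: (batch, seq_len, obs_dim) -- a trajectory of observation features
        embeds = self.obs_proj(obs_seq)
        states = self.backbone(inputs_embeds=embeds).last_hidden_state
        return self.action_head(states[:, -1])              # act on the latest step

policy = LMInitializedPolicy(obs_dim=32, num_actions=8)
logits = policy(torch.randn(1, 10, 32))                     # one 10-step trajectory
```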
arXiv Detail & Related papers (2022-02-03T18:55:52Z)