ARTiST: Automated Text Simplification for Task Guidance in Augmented
Reality
- URL: http://arxiv.org/abs/2402.18797v1
- Date: Thu, 29 Feb 2024 01:58:49 GMT
- Title: ARTiST: Automated Text Simplification for Task Guidance in Augmented
Reality
- Authors: Guande Wu, Jing Qian, Sonia Castelo, Shaoyu Chen, Joao Rulff, Claudio
Silva
- Abstract summary: ARTiST is an automatic text simplification system that uses a few-shot prompt and GPT-3 models to optimize the text length and semantic content for augmented reality.
Results from a 16-user empirical study showed that ARTiST lightens the cognitive load and improves performance significantly over both unmodified text and text modified via traditional methods.
- Score: 11.23591724305816
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text presented in augmented reality provides in-situ, real-time information
for users. However, this content can be challenging to apprehend quickly when
engaging in cognitively demanding AR tasks, especially when it is presented on
a head-mounted display. We propose ARTiST, an automatic text simplification
system that uses a few-shot prompt and GPT-3 models to specifically optimize
the text length and semantic content for augmented reality. Developed out of a
formative study that included seven users and three experts, our system
combines a customized error calibration model with a few-shot prompt to
integrate the syntactic, lexical, elaborative, and content simplification
techniques, and generate simplified AR text for head-worn displays. Results
from a 16-user empirical study showed that ARTiST lightens the cognitive load
and improves performance significantly over both unmodified text and text
modified via traditional methods. Our work constitutes a step towards
automating the optimization of batch text data for readability and performance
in augmented reality.
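The abstract describes ARTiST's core mechanism as few-shot prompting of GPT-3 to compress text length while preserving semantic content. As a rough illustration only, the Python sketch below shows the general few-shot prompting pattern; the instruction wording, the example pairs, and the complete() callable are hypothetical placeholders, and the paper's customized error-calibration model is not reproduced here.

    # Minimal sketch of the few-shot prompting pattern described in the abstract.
    # The instruction text, example pairs, and the `complete` callable are all
    # hypothetical placeholders; the paper's actual prompt and its customized
    # error-calibration model are not specified here.

    # Hypothetical (original, simplified) pairs illustrating length and semantic
    # compression for a head-worn display.
    FEW_SHOT_EXAMPLES = [
        ("Carefully rotate the valve counterclockwise until you feel resistance, "
         "then stop immediately.",
         "Turn valve left until it resists, then stop."),
        ("Verify that the indicator light on the control panel has turned green "
         "before proceeding to the next step.",
         "Wait for green light, then continue."),
    ]

    def build_prompt(text: str) -> str:
        """Assemble a few-shot prompt asking the model to shorten AR task text."""
        parts = ["Simplify the instruction for an AR head-mounted display. "
                 "Keep the key action and object; remove filler words.\n"]
        for original, simplified in FEW_SHOT_EXAMPLES:
            parts.append(f"Original: {original}\nSimplified: {simplified}\n")
        parts.append(f"Original: {text}\nSimplified:")
        return "\n".join(parts)

    def simplify(text: str, complete) -> str:
        """`complete` is any GPT-3-style text-completion callable."""
        return complete(build_prompt(text)).strip()

Per the abstract, the full system additionally runs a customized error calibration model alongside the few-shot prompt; that step is omitted from this sketch.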
Related papers
- An efficient text augmentation approach for contextualized Mandarin speech recognition [4.600045052545344]
Our study proposes to leverage extensive text-only datasets and contextualize pre-trained ASR models.
To contextualize a pre-trained CIF-based ASR, we construct a codebook using limited speech-text data.
Our experiments on diverse Mandarin test sets demonstrate that our TA approach significantly boosts recognition performance.
arXiv Detail & Related papers (2024-06-14T11:53:14Z) - StrucTexTv3: An Efficient Vision-Language Model for Text-rich Image Perception, Comprehension, and Beyond [68.0107158115377]
We have crafted an efficient vision-language model, StrucTexTv3, tailored to tackle various intelligent tasks for text-rich images.
We enhance the perception and comprehension abilities of StrucTexTv3 through instruction learning.
Our method achieved SOTA results in text-rich image perception tasks, and significantly improved performance in comprehension tasks.
arXiv Detail & Related papers (2024-05-31T16:55:04Z) - Semantic-aware Video Representation for Few-shot Action Recognition [1.6486717871944268]
We propose a simple yet effective Semantic-Aware Few-Shot Action Recognition (SAFSAR) model to address these issues.
We show that directly leveraging a 3D feature extractor, an effective feature-fusion scheme, and a simple cosine similarity for classification can yield better performance.
Experiments on five challenging few-shot action recognition benchmarks under various settings demonstrate that the proposed SAFSAR model significantly improves the state-of-the-art performance.
arXiv Detail & Related papers (2023-11-10T18:13:24Z) - ARTIST: ARTificial Intelligence for Simplified Text [5.095775294664102]
Text Simplification is a key Natural Language Processing task that aims to reduce the linguistic complexity of a text.
Recent advances in Generative Artificial Intelligence (AI) have enabled automatic text simplification at both the lexical and syntactic levels.
arXiv Detail & Related papers (2023-08-25T16:06:06Z) - TextFormer: A Query-based End-to-End Text Spotter with Mixed Supervision [61.186488081379]
We propose TextFormer, a query-based end-to-end text spotter with Transformer architecture.
TextFormer builds upon an image encoder and a text decoder to learn a joint semantic understanding for multi-task modeling.
It allows for mutual training and optimization of classification, segmentation, and recognition branches, resulting in deeper feature sharing.
arXiv Detail & Related papers (2023-06-06T03:37:41Z) - X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via Dynamic
Textual Guidance [70.08635216710967]
X-Mesh is a text-driven 3D stylization framework that incorporates a novel Text-guided Dynamic Attention Module.
We introduce a new standard text-mesh benchmark, MIT-30, and two automated metrics, which will enable future research to achieve fair and objective comparisons.
arXiv Detail & Related papers (2023-03-28T06:45:31Z) - Informative Text Generation from Knowledge Triples [56.939571343797304]
We propose a novel memory augmented generator that employs a memory network to memorize the useful knowledge learned during the training.
We derive a dataset from WebNLG for our new setting and conduct extensive experiments to investigate the effectiveness of our model.
arXiv Detail & Related papers (2022-09-26T14:35:57Z) - Vision-Language Pre-Training for Boosting Scene Text Detectors [57.08046351495244]
We specifically adapt vision-language joint learning for scene text detection.
We propose to learn contextualized, joint representations through vision-language pre-training.
The pre-trained model is able to produce more informative representations with richer semantics.
arXiv Detail & Related papers (2022-04-29T03:53:54Z) - Enhanced Modality Transition for Image Captioning [51.72997126838352]
We build a Modality Transition Module (MTM) to transfer visual features into semantic representations before forwarding them to the language model.
During the training phase, the modality transition network is optimised by the proposed modality loss.
Experiments have been conducted on the MS-COCO dataset demonstrating the effectiveness of the proposed framework.
arXiv Detail & Related papers (2021-02-23T07:20:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.