Accessible Visualization via Natural Language Descriptions: A Four-Level
Model of Semantic Content
- URL: http://arxiv.org/abs/2110.04406v1
- Date: Fri, 8 Oct 2021 23:37:25 GMT
- Title: Accessible Visualization via Natural Language Descriptions: A Four-Level
Model of Semantic Content
- Authors: Alan Lundgard and Arvind Satyanarayan
- Abstract summary: We introduce a conceptual model for the semantic content conveyed by natural language descriptions of visualizations.
We conduct a mixed-methods evaluation with 30 blind and 90 sighted readers, and find that these reader groups differ significantly on which semantic content they rank as most useful.
- Score: 6.434361163743876
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Natural language descriptions sometimes accompany visualizations to better
communicate and contextualize their insights, and to improve their
accessibility for readers with disabilities. However, it is difficult to
evaluate the usefulness of these descriptions, and how effectively they improve
access to meaningful information, because we have little understanding of the
semantic content they convey, and how different readers receive this content.
In response, we introduce a conceptual model for the semantic content conveyed
by natural language descriptions of visualizations. Developed through a
grounded theory analysis of 2,147 sentences, our model spans four levels of
semantic content: enumerating visualization construction properties (e.g.,
marks and encodings); reporting statistical concepts and relations (e.g.,
extrema and correlations); identifying perceptual and cognitive phenomena
(e.g., complex trends and patterns); and elucidating domain-specific insights
(e.g., social and political context). To demonstrate how our model can be
applied to evaluate the effectiveness of visualization descriptions, we conduct
a mixed-methods evaluation with 30 blind and 90 sighted readers, and find that
these reader groups differ significantly on which semantic content they rank as
most useful. Together, our model and findings suggest that access to meaningful
information is strongly reader-specific, and that research in automatic
visualization captioning should orient toward descriptions that more richly
communicate overall trends and statistics, sensitive to reader preferences. Our
work further opens a space of research on natural language as a data interface
coequal with visualization.
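To make the four-level model concrete, below is a minimal Python sketch that tags example description sentences with the level of semantic content they convey. The level names are shorthand labels for the abstract's four categories, and the example sentences are invented for illustration; they are not drawn from the paper's 2,147-sentence corpus.
```python
from enum import IntEnum

class SemanticLevel(IntEnum):
    """Four levels of semantic content, as described in the abstract (labels are shorthand)."""
    ELEMENTAL = 1    # visualization construction properties (e.g., marks and encodings)
    STATISTICAL = 2  # statistical concepts and relations (e.g., extrema and correlations)
    PERCEPTUAL = 3   # perceptual and cognitive phenomena (e.g., complex trends and patterns)
    CONTEXTUAL = 4   # domain-specific insights (e.g., social and political context)

# Hypothetical description sentences labeled by level; invented for illustration only.
examples = {
    "The chart is a line graph with year on the x-axis.": SemanticLevel.ELEMENTAL,
    "The unemployment rate peaked at 10% in 2009.": SemanticLevel.STATISTICAL,
    "The rate rises sharply and then declines steadily over the decade.": SemanticLevel.PERCEPTUAL,
    "The spike reflects the fallout of the 2008 financial crisis.": SemanticLevel.CONTEXTUAL,
}

for sentence, level in examples.items():
    print(f"Level {level.value} ({level.name.lower()}): {sentence}")
```
A sentence's level indicates what kind of information it makes accessible; the paper's evaluation asks which of these levels blind and sighted readers rank as most useful.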
Related papers
- Evaluating Attribute Comprehension in Large Vision-Language Models [18.513510568037624]
We evaluate the attribute comprehension ability of large vision-language models from two perspectives: attribute recognition and attribute hierarchy understanding.
Among our three main findings, we observe that large vision-language models possess good attribute recognition ability, but their hierarchical understanding ability is relatively limited.
We hope this work can help guide future progress in fine-grained visual understanding of large vision-language models.
arXiv Detail & Related papers (2024-08-25T17:42:05Z)
- Knowledge-Aware Prompt Tuning for Generalizable Vision-Language Models [64.24227572048075]
We propose a Knowledge-Aware Prompt Tuning (KAPT) framework for vision-language models.
Our approach takes inspiration from human intelligence in which external knowledge is usually incorporated into recognizing novel categories of objects.
arXiv Detail & Related papers (2023-08-22T04:24:45Z)
- Perceptual Grouping in Contrastive Vision-Language Models [59.1542019031645]
We show how vision-language models are able to understand where objects reside within an image and group together visually related parts of the imagery.
We propose a minimal set of modifications that results in models that uniquely learn both semantic and spatial information.
arXiv Detail & Related papers (2022-10-18T17:01:35Z)
- Knowledge Graph Augmented Network Towards Multiview Representation Learning for Aspect-based Sentiment Analysis [96.53859361560505]
We propose a knowledge graph augmented network (KGAN) to incorporate external knowledge with explicitly syntactic and contextual information.
KGAN captures the sentiment feature representations from multiple perspectives, i.e., context-, syntax- and knowledge-based.
Experiments on three popular ABSA benchmarks demonstrate the effectiveness and robustness of our KGAN.
arXiv Detail & Related papers (2022-01-13T08:25:53Z)
- Efficient Multi-Modal Embeddings from Structured Data [0.0]
Multi-modal word semantics aims to enhance embeddings with perceptual input.
Visual grounding can contribute to linguistic applications as well.
The new embeddings convey information complementary to text-based embeddings.
arXiv Detail & Related papers (2021-10-06T08:42:09Z)
- Understanding Synonymous Referring Expressions via Contrastive Features [105.36814858748285]
We develop an end-to-end trainable framework to learn contrastive features on the image and object instance levels.
We conduct extensive experiments to evaluate the proposed algorithm on several benchmark datasets.
arXiv Detail & Related papers (2021-04-20T17:56:24Z)
- Quantifying Learnability and Describability of Visual Concepts Emerging in Representation Learning [91.58529629419135]
We consider how to characterise visual groupings discovered automatically by deep neural networks.
We introduce two concepts, visual learnability and describability, that can be used to quantify the interpretability of arbitrary image groupings.
arXiv Detail & Related papers (2020-10-27T18:41:49Z)
- Natural Language Rationales with Full-Stack Visual Reasoning: From Pixels to Semantic Frames to Commonsense Graphs [106.15931418425906]
We present the first study focused on generating natural language rationales across several complex visual reasoning tasks.
We present RationaleVT Transformer, an integrated model that learns to generate free-text rationales by combining pretrained language models with object recognition, grounded visual semantic frames, and visual commonsense graphs.
Our experiments show that the base pretrained language model benefits from visual adaptation and that free-text rationalization is a promising research direction to complement model interpretability for complex visual-textual reasoning tasks.
arXiv Detail & Related papers (2020-10-15T05:08:56Z)
- Natural language technology and query expansion: issues, state-of-the-art and perspectives [0.0]
Linguistic characteristics that cause ambiguity and misinterpretation of queries, along with additional factors, affect users' ability to accurately represent their information needs.
We lay out the anatomy of a generic linguistics-based query expansion framework and propose its module-based decomposition.
For each module, we review state-of-the-art solutions from the literature and categorize them in light of the techniques used.
arXiv Detail & Related papers (2020-04-23T11:39:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information above and is not responsible for any consequences of its use.