Visual resemblance and communicative context constrain the emergence of
graphical conventions
- URL: http://arxiv.org/abs/2109.13861v1
- Date: Fri, 17 Sep 2021 23:05:36 GMT
- Authors: Robert D. Hawkins, Megumi Sano, Noah D. Goodman, Judith E. Fan
- Abstract summary: Drawing provides a versatile medium for communicating about the visual world.
Do viewers understand drawings based solely on their ability to resemble the entities they refer to (i.e., as images)?
Do they understand drawings based on shared but arbitrary associations with these entities (i.e., as symbols)?
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: From photorealistic sketches to schematic diagrams, drawing provides a
versatile medium for communicating about the visual world. How do images
spanning such a broad range of appearances reliably convey meaning? Do viewers
understand drawings based solely on their ability to resemble the entities they
refer to (i.e., as images), or do they understand drawings based on shared but
arbitrary associations with these entities (i.e., as symbols)? In this paper,
we provide evidence for a cognitive account of pictorial meaning in which both
visual and social information is integrated to support effective visual
communication. To evaluate this account, we used a communication task where
pairs of participants used drawings to repeatedly communicate the identity of a
target object among multiple distractor objects. We manipulated social cues
across three experiments and a full internal replication, finding that pairs of
participants develop referent-specific and interaction-specific strategies for
communicating more efficiently over time, going beyond what could be explained
by either task practice or a pure resemblance-based account alone. Using a
combination of model-based image analyses and crowdsourced sketch annotations,
we further determined that drawings did not drift toward arbitrariness, as
predicted by a pure convention-based account, but systematically preserved
those visual features that were most distinctive of the target object. Taken
together, these findings advance theories of pictorial meaning and have
implications for how successful graphical conventions emerge via complex
interactions between visual perception, communicative experience, and social
context.
Related papers
- When Does Perceptual Alignment Benefit Vision Representations? [76.32336818860965]
We investigate how aligning vision model representations to human perceptual judgments impacts their usability.
We find that aligning models to perceptual judgments yields representations that improve upon the original backbones across many downstream tasks.
Our results suggest that injecting an inductive bias about human perceptual knowledge into vision models can contribute to better representations.
arXiv Detail & Related papers (2024-10-14T17:59:58Z)
- Mind the GAP: Glimpse-based Active Perception improves generalization and sample efficiency of visual reasoning [0.7999703756441756]
Human capabilities in understanding visual relations are far superior to those of AI systems.
We develop a system equipped with a novel Glimpse-based Active Perception (GAP) mechanism.
The results suggest that the GAP is essential for extracting visual relations that go beyond the immediate visual content.
arXiv Detail & Related papers (2024-09-30T11:48:11Z)
- For a semiotic AI: Bridging computer vision and visual semiotics for computational observation of large scale facial image archives [3.418398936676879]
This work presents FRESCO, a framework designed to explore the socio-cultural implications of images on social media platforms at scale.
FRESCO deconstructs images into numerical and categorical variables using state-of-the-art computer vision techniques.
The framework analyzes images across three levels: the plastic level, encompassing fundamental visual features like lines and colors; the figurative level, representing specific entities or concepts; and the enunciation level, which focuses particularly on constructing the point of view of the spectator and observer.
arXiv Detail & Related papers (2024-07-03T16:57:38Z)
- CLiC: Concept Learning in Context [54.81654147248919]
This paper builds upon recent advancements in visual concept learning.
It involves acquiring a visual concept from a source image and subsequently applying it to an object in a target image.
To localize the concept learning, we employ soft masks that cover both the concept itself and its surrounding image area.
arXiv Detail & Related papers (2023-11-28T01:33:18Z)
- Coarse-to-Fine Contrastive Learning in Image-Text-Graph Space for Improved Vision-Language Compositionality [50.48859793121308]
Contrastively trained vision-language models have achieved remarkable progress in vision and language representation learning.
Recent research has highlighted severe limitations in their ability to perform compositional reasoning over objects, attributes, and relations.
arXiv Detail & Related papers (2023-05-23T08:28:38Z)
- MetaCLUE: Towards Comprehensive Visual Metaphors Research [43.604408485890275]
We introduce MetaCLUE, a set of vision tasks on visual metaphor.
We perform a comprehensive analysis of state-of-the-art models in vision and language based on our annotations.
We hope this work provides a concrete step towards developing AI systems with human-like creative capabilities.
arXiv Detail & Related papers (2022-12-19T22:41:46Z)
- Emergent Graphical Conventions in a Visual Communication Game [80.79297387339614]
Beyond symbolic languages, humans also communicate with graphical sketches.
We take a first step toward modeling and simulating this evolutionary process via two neural agents playing a visual communication game.
We devise a novel reinforcement learning method such that agents are evolved jointly towards successful communication and abstract graphical conventions.
arXiv Detail & Related papers (2021-11-28T18:59:57Z)
- Constellation: Learning relational abstractions over objects for compositional imagination [64.99658940906917]
We introduce Constellation, a network that learns relational abstractions of static visual scenes.
This work is a first step toward explicitly representing visual relationships and using them in complex cognitive procedures.
arXiv Detail & Related papers (2021-07-23T11:59:40Z)
- Exploring Visual Engagement Signals for Representation Learning [56.962033268934015]
We present VisE, a weakly supervised learning approach that maps social images to pseudo labels derived from clustered engagement signals.
We then study how models trained in this way benefit subjective downstream computer vision tasks such as emotion recognition or political bias detection.
arXiv Detail & Related papers (2021-04-15T20:50:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.