Related papers: FutureVision: A methodology for the investigation of future cognition

FutureVision: A methodology for the investigation of future cognition

URL: http://arxiv.org/abs/2502.01597v1
Date: Mon, 03 Feb 2025 18:29:06 GMT
Title: FutureVision: A methodology for the investigation of future cognition
Authors: Tiago Timponi Torrent, Mark Turner, Nicolás Hinrichs, Frederico Belcavello, Igor Lourenço, Arthur Lorenzi Almeida, Marcelo Viridiano, Ely Edison Matos,
Abstract summary: We conduct a pilot study examining how visual fixation patterns vary during the evaluation of futuristic scenarios.<n>Preliminary results show that far-future and pessimistic scenarios are associated with longer fixations and more erratic saccades.
Score: 0.5644620681963636
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper presents a methodology combining multimodal semantic analysis with an eye-tracking experimental protocol to investigate the cognitive effort involved in understanding the communication of future scenarios. To demonstrate the methodology, we conduct a pilot study examining how visual fixation patterns vary during the evaluation of valence and counterfactuality in fictional ad pieces describing futuristic scenarios, using a portable eye tracker. Participants eye movements are recorded while evaluating the stimuli and describing them to a conversation partner. Gaze patterns are analyzed alongside semantic representations of the stimuli and participants descriptions, constructed from a frame semantic annotation of both linguistic and visual modalities. Preliminary results show that far-future and pessimistic scenarios are associated with longer fixations and more erratic saccades, supporting the hypothesis that fractures in the base spaces underlying the interpretation of future scenarios increase cognitive load for comprehenders.

Related papers

Rethinking Causal Mask Attention for Vision-Language Inference [17.450072268270773]
We investigate how different causal masking strategies affect vision-language inference.<n>We propose a family of future-aware attentions tailored for this setting.<n>We show that selectively compressing future semantic context into past representations benefits the inference.
arXiv Detail & Related papers (2025-05-24T08:59:28Z)
VisTopics: A Visual Semantic Unsupervised Approach to Topic Modeling of Video and Image Data [0.0]
This study introduces VisTopics, a computational framework designed to analyze large-scale visual datasets.<n>Applying VisTopics to a dataset of 452 NBC News videos resulted in reducing 11,070 frames to 6,928 deduplicated frames, which were then semantically analyzed to uncover 35 topics.
arXiv Detail & Related papers (2025-05-20T19:59:41Z)
SVLTA: Benchmarking Vision-Language Temporal Alignment via Synthetic Video Situation [33.02002580363215]
Vision-language temporal alignment is a crucial capability for human dynamic recognition and cognition in real-world scenarios. We introduce SVLTA, the Synthetic Vision-Language Temporal Alignment derived via a well-designed and feasible control generation method within a simulation environment. Our experiments reveal diagnostic insights through the evaluations in temporal question answering, distributional shift sensitiveness, and temporal alignment adaptation.
arXiv Detail & Related papers (2025-04-08T11:31:37Z)
Visual In-Context Learning for Large Vision-Language Models [62.5507897575317]
In Large Visual Language Models (LVLMs) the efficacy of In-Context Learning (ICL) remains limited by challenges in cross-modal interactions and representation disparities. We introduce a novel Visual In-Context Learning (VICL) method comprising Visual Demonstration Retrieval, Intent-Oriented Image Summarization, and Intent-Oriented Demonstration Composition. Our approach retrieves images via ''Retrieval & Rerank'' paradigm, summarises images with task intent and task-specific visual parsing, and composes language-based demonstrations.
arXiv Detail & Related papers (2024-02-18T12:43:38Z)
Understanding Before Recommendation: Semantic Aspect-Aware Review Exploitation via Large Language Models [53.337728969143086]
Recommendation systems harness user-item interactions like clicks and reviews to learn their representations. Previous studies improve recommendation accuracy and interpretability by modeling user preferences across various aspects and intents. We introduce a chain-based prompting approach to uncover semantic aspect-aware interactions.
arXiv Detail & Related papers (2023-12-26T15:44:09Z)
Integrating large language models and active inference to understand eye movements in reading and dyslexia [0.0]
We present a novel computational model employing hierarchical active inference to simulate reading and eye movements. Our model permits the exploration of maladaptive inference effects on eye movements during reading, such as in dyslexia.
arXiv Detail & Related papers (2023-08-09T13:16:30Z)
Inverse Dynamics Pretraining Learns Good Representations for Multitask Imitation [66.86987509942607]
We evaluate how such a paradigm should be done in imitation learning. We consider a setting where the pretraining corpus consists of multitask demonstrations. We argue that inverse dynamics modeling is well-suited to this setting.
arXiv Detail & Related papers (2023-05-26T14:40:46Z)
Localization vs. Semantics: Visual Representations in Unimodal and Multimodal Models [57.08925810659545]
We conduct a comparative analysis of the visual representations in existing vision-and-language models and vision-only models. Our empirical observations suggest that vision-and-language models are better at label prediction tasks. We hope our study sheds light on the role of language in visual learning, and serves as an empirical guide for various pretrained models.
arXiv Detail & Related papers (2022-12-01T05:00:18Z)
Learnable Visual Words for Interpretable Image Recognition [70.85686267987744]
We propose the Learnable Visual Words (LVW) to interpret the model prediction behaviors with two novel modules. The semantic visual words learning relaxes the category-specific constraint, enabling the general visual words shared across different categories. Our experiments on six visual benchmarks demonstrate the superior effectiveness of our proposed LVW in both accuracy and model interpretation.
arXiv Detail & Related papers (2022-05-22T03:24:45Z)
Deep Co-Attention Network for Multi-View Subspace Learning [73.3450258002607]
We propose a deep co-attention network for multi-view subspace learning. It aims to extract both the common information and the complementary information in an adversarial setting. In particular, it uses a novel cross reconstruction loss and leverages the label information to guide the construction of the latent representation.
arXiv Detail & Related papers (2021-02-15T18:46:44Z)
A Meta-Bayesian Model of Intentional Visual Search [0.0]
We propose a computational model of visual search that incorporates Bayesian interpretations of the neural mechanisms that underlie categorical perception and saccade planning. To enable meaningful comparisons between simulated and human behaviours, we employ a gaze-contingent paradigm that required participants to classify occluded MNIST digits through a window that followed their gaze. Our model is able to recapitulate human behavioural metrics such as classification accuracy while retaining a high degree of interpretability, which we demonstrate by recovering subject-specific parameters from observed human behaviour.
arXiv Detail & Related papers (2020-06-05T16:10:35Z)
Analysing Lexical Semantic Change with Contextualised Word Representations [7.071298726856781]
We propose a novel method that exploits the BERT neural language model to obtain representations of word usages. We create a new evaluation dataset and show that the model representations and the detected semantic shifts are positively correlated with human judgements.
arXiv Detail & Related papers (2020-04-29T12:18:14Z)

This list is automatically generated from the titles and abstracts of the papers in this site.