Machine Psychophysics: Cognitive Control in Vision-Language Models
- URL: http://arxiv.org/abs/2505.18969v1
- Date: Sun, 25 May 2025 04:23:28 GMT
- Title: Machine Psychophysics: Cognitive Control in Vision-Language Models
- Authors: Dezhi Luo, Maijunxian Wang, Bingyang Wang, Tianwei Zhao, Yijiang Li, Hokin Deng
- Abstract summary: We evaluate 108 vision-language models on three classic conflict tasks and their more demanding "squared" variants across 2,220 trials. Results indicate that some form of human-like executive function has emerged in current multi-modal foundational models.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Cognitive control refers to the ability to flexibly coordinate thought and action in pursuit of internal goals. A standard method for assessing cognitive control involves conflict tasks that contrast congruent and incongruent trials, measuring the ability to prioritize relevant information while suppressing interference. We evaluate 108 vision-language models on three classic conflict tasks and their more demanding "squared" variants across 2,220 trials. Model performance corresponds closely to human behavior under resource constraints and reveals individual differences. These results indicate that some form of human-like executive function has emerged in current multi-modal foundational models.
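As a concrete illustration of the congruent/incongruent protocol described above, the sketch below scores a Stroop-style conflict task by comparing accuracy across the two trial types. The `query_vlm` stub, the `Trial` fields, and the congruency-effect metric are placeholders for whatever model interface and stimuli are actually used; this is not the paper's evaluation code.

```python
# Illustrative scoring harness for a congruent/incongruent conflict task.
# `query_vlm` is a stand-in for whatever VLM API is under test; the Trial
# fields and the congruency-effect metric are assumptions, not the paper's code.
from dataclasses import dataclass
from typing import Callable, Iterable

@dataclass
class Trial:
    image_path: str   # e.g. the word "RED" rendered in blue ink (Stroop)
    prompt: str       # e.g. "What ink color is the word printed in?"
    answer: str       # ground-truth response, e.g. "blue"
    congruent: bool   # does the word's meaning match the ink color?

def evaluate(trials: Iterable[Trial], query_vlm: Callable[[str, str], str]) -> dict:
    """Score a model separately on congruent and incongruent trials."""
    correct = {True: 0, False: 0}
    total = {True: 0, False: 0}
    for t in trials:
        pred = query_vlm(t.image_path, t.prompt).strip().lower()
        total[t.congruent] += 1
        correct[t.congruent] += int(pred == t.answer.lower())
    acc = {c: correct[c] / max(total[c], 1) for c in (True, False)}
    return {
        "acc_congruent": acc[True],
        "acc_incongruent": acc[False],
        # Interference cost: how much accuracy drops when the cue conflicts.
        "congruency_effect": acc[True] - acc[False],
    }
```

A positive congruency effect (higher accuracy on congruent than incongruent trials) is the standard signature of interference cost in human conflict-task studies; the more demanding "squared" variants would simply feed harder stimuli through the same interface.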
Related papers
- Dynamic Programming Techniques for Enhancing Cognitive Representation in Knowledge Tracing [125.75923987618977]
We propose the Cognitive Representation Dynamic Programming based Knowledge Tracing (CRDP-KT) model. It uses a dynamic programming algorithm to optimize cognitive representations based on question difficulty and the performance intervals between questions. This provides more accurate and systematic input features for subsequent model training, minimizing distortion in the simulation of cognitive states.
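The summary above describes a dynamic program over question difficulty and performance intervals. A minimal sketch of that general idea (not CRDP-KT itself) is to pick a discretized cognitive state per question that explains observed correctness while penalizing abrupt jumps between consecutive questions; all cost terms and parameter names below are illustrative assumptions.

```python
# Minimal dynamic-programming sketch: choose a discretized cognitive state per
# question that best explains observed correctness (given question difficulty)
# while penalizing abrupt state changes between consecutive questions.
# The cost terms are illustrative assumptions, not the CRDP-KT model.
import numpy as np

def dp_cognitive_states(correct, difficulty, intervals, levels=11, smooth=2.0):
    """correct[i] in {0,1}; difficulty[i] in [0,1]; intervals[i] > 0 is the gap between attempts."""
    grid = np.linspace(0.0, 1.0, levels)
    n = len(correct)

    def fit_cost(i, s):
        # A correct answer favors states above the question's difficulty, and vice versa.
        margin = s - difficulty[i]
        return max(0.0, -margin) if correct[i] else max(0.0, margin)

    cost = np.full((n, levels), np.inf)
    back = np.zeros((n, levels), dtype=int)
    cost[0] = [fit_cost(0, s) for s in grid]
    for i in range(1, n):
        # Longer gaps between attempts allow larger state changes at lower cost.
        w = smooth / (intervals[i] + 1e-3)
        for j, s in enumerate(grid):
            trans = cost[i - 1] + w * (grid - s) ** 2
            back[i, j] = int(np.argmin(trans))
            cost[i, j] = trans[back[i, j]] + fit_cost(i, s)
    # Backtrack the minimum-cost state sequence.
    path = [int(np.argmin(cost[-1]))]
    for i in range(n - 1, 0, -1):
        path.append(back[i, path[-1]])
    return grid[path[::-1]]
```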
arXiv Detail & Related papers (2025-06-03T14:44:48Z) - Synthesizing Images on Perceptual Boundaries of ANNs for Uncovering and Manipulating Human Perceptual Variability [8.068477554057475]
Human decision-making in cognitive tasks and daily life exhibits considerable variability, shaped by factors such as task difficulty, individual preferences, and personal experiences. We present a computational framework that combines perceptual boundary sampling in ANNs and human behavioral experiments to investigate this phenomenon. Our perceptual boundary sampling algorithm generates stimuli along ANN decision boundaries that intrinsically induce significant perceptual variability.
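A hedged sketch of the boundary-sampling idea: starting from a seed image, nudge the input until two class logits are nearly tied while staying close to the original, yielding a maximally ambiguous stimulus. The loss weights, optimizer settings, and placeholder classifier below are assumptions; the paper's actual sampling procedure is more involved.

```python
# Sketch of sampling a stimulus near an ANN's decision boundary: drive two
# class logits together while staying close to the seed image.
import torch
import torch.nn.functional as F

def boundary_sample(model, x0, class_a, class_b, steps=200, lr=0.05, keep=1.0):
    """Return an input whose logits for class_a and class_b are (nearly) equal."""
    x = x0.clone().detach().requires_grad_(True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        logits = model(x)
        # Tie the two logits; the second term keeps the stimulus near the seed.
        loss = (logits[0, class_a] - logits[0, class_b]) ** 2 + keep * F.mse_loss(x, x0)
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            x.clamp_(0.0, 1.0)  # keep a valid image range
    return x.detach()

# Usage with any image classifier; here an untrained placeholder network.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
seed = torch.rand(1, 3, 32, 32)
ambiguous = boundary_sample(model, seed, class_a=3, class_b=5)
```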
arXiv Detail & Related papers (2025-05-06T15:44:42Z) - Testing the limits of fine-tuning to improve reasoning in vision language models [51.58859621164201]
We introduce visual stimuli and human judgments on visual cognition tasks to evaluate performance across cognitive domains. We fine-tune models on ground truth data for intuitive physics and causal reasoning. We find that fine-tuning does not contribute to robust human-like generalization to data with other visual characteristics or to tasks in other cognitive domains.
arXiv Detail & Related papers (2025-02-21T18:58:30Z) - Mimicking Human Intuition: Cognitive Belief-Driven Q-Learning [5.960184723807347]
We propose Cognitive Belief-Driven Q-Learning (CBDQ), which integrates subjective belief modeling into the Q-learning framework.
CBDQ enhances decision-making accuracy by endowing agents with human-like learning and reasoning capabilities.
We evaluate the proposed method on discrete control benchmark tasks in various complex environments.
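As a loose, illustrative reading of "belief-driven" Q-learning (not the CBDQ algorithm itself), the sketch below keeps a subjective belief distribution over actions per state, blends it into the bootstrap target, and nudges it toward actions that outperform expectations. The blending rule and update constants are assumptions.

```python
# Hedged interpretation of belief-driven Q-learning: a per-state subjective
# belief over actions biases both action selection and the bootstrap target.
import numpy as np

class BeliefQAgent:
    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.99, mix=0.3):
        self.Q = np.zeros((n_states, n_actions))
        self.belief = np.full((n_states, n_actions), 1.0 / n_actions)  # subjective prior
        self.alpha, self.gamma, self.mix = alpha, gamma, mix

    def act(self, s, eps=0.1):
        if np.random.rand() < eps:
            return int(np.random.randint(self.Q.shape[1]))
        # The subjective belief acts as a soft prior on top of the learned values.
        score = self.Q[s] + self.mix * np.log(self.belief[s])
        return int(np.argmax(score))

    def update(self, s, a, r, s_next, done):
        greedy = self.Q[s_next].max()
        expected = float(self.belief[s_next] @ self.Q[s_next])  # belief-weighted value
        bootstrap = (1 - self.mix) * greedy + self.mix * expected
        target = r + (0.0 if done else self.gamma * bootstrap)
        td_error = target - self.Q[s, a]
        self.Q[s, a] += self.alpha * td_error
        # Strengthen belief in actions that outperform expectations, weaken otherwise.
        self.belief[s, a] *= np.exp(0.1 * np.sign(td_error))
        self.belief[s] = np.clip(self.belief[s], 1e-3, None)
        self.belief[s] /= self.belief[s].sum()
```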
arXiv Detail & Related papers (2024-10-02T16:50:29Z) - Benchmarking Continual Learning from Cognitive Perspectives [14.867136605254975]
Continual learning addresses the problem of continuously acquiring and transferring knowledge without catastrophic forgetting of old concepts.
There is a mismatch between cognitive properties and evaluation methods of continual learning models.
We propose to integrate model cognitive capacities and evaluation metrics into a unified evaluation paradigm.
arXiv Detail & Related papers (2023-12-06T06:27:27Z) - Enhancing HOI Detection with Contextual Cues from Large Vision-Language Models [56.257840490146]
ConCue is a novel approach for improving visual feature extraction in HOI detection.
We develop a transformer-based feature extraction module with a multi-tower architecture that integrates contextual cues into both instance and interaction detectors.
arXiv Detail & Related papers (2023-11-26T09:11:32Z) - Decoding Susceptibility: Modeling Misbelief to Misinformation Through a Computational Approach [61.04606493712002]
Susceptibility to misinformation describes the degree of belief in unverifiable claims, which is not directly observable.
Existing susceptibility studies heavily rely on self-reported beliefs.
We propose a computational approach to model users' latent susceptibility levels.
arXiv Detail & Related papers (2023-11-16T07:22:56Z) - Explaining by Imitating: Understanding Decisions by Interpretable Policy
Learning [72.80902932543474]
Understanding human behavior from observed data is critical for transparency and accountability in decision-making.
Consider real-world settings such as healthcare, in which modeling a decision-maker's policy is challenging.
We propose a data-driven representation of decision-making behavior that inheres transparency by design, accommodates partial observability, and operates completely offline.
arXiv Detail & Related papers (2023-10-28T13:06:14Z) - Exploring the Trade-off between Plausibility, Change Intensity and Adversarial Power in Counterfactual Explanations using Multi-objective Optimization [73.89239820192894]
We argue that automated counterfactual generation should regard several aspects of the produced adversarial instances.
We present a novel framework for the generation of counterfactual examples.
arXiv Detail & Related papers (2022-05-20T15:02:53Z) - Modeling Human Behavior Part II -- Cognitive approaches and Uncertainty [0.0]
In Part I, we discussed methods which generate a model of behavior from exploration of the system and feedback based on the exhibited behavior.
In this work, we will continue the discussion from the perspective of methods which focus on the assumed cognitive abilities, limitations, and biases demonstrated in human reasoning.
arXiv Detail & Related papers (2022-05-13T07:29:15Z) - Empirical Estimates on Hand Manipulation are Recoverable: A Step Towards Individualized and Explainable Robotic Support in Everyday Activities [80.37857025201036]
A key challenge for robotic systems is to figure out the behavior of another agent.
Drawing correct inferences is especially challenging when (confounding) factors are not controlled experimentally.
We propose equipping robots with the necessary tools to conduct observational studies on people.
arXiv Detail & Related papers (2022-01-27T22:15:56Z) - A Bayesian Account of Measures of Interpretability in Human-AI Interaction [34.99424576619341]
Existing approaches for the design of interpretable agent behavior consider different measures of interpretability in isolation.
We propose a revised model where all these behaviors can be meaningfully modeled together.
We highlight interesting consequences of this unified model and motivate it through the results of a user study.
arXiv Detail & Related papers (2020-11-22T03:28:28Z)