Computing a human-like reaction time metric from stable recurrent vision
models
- URL: http://arxiv.org/abs/2306.11582v2
- Date: Mon, 6 Nov 2023 16:39:38 GMT
- Title: Computing a human-like reaction time metric from stable recurrent vision
models
- Authors: Lore Goetschalckx, Lakshmi Narasimhan Govindarajan, Alekh Karkada
Ashok, Aarit Ahuja, David L. Sheinberg, Thomas Serre
- Abstract summary: We sketch a general-purpose methodology to construct computational accounts of reaction times from a stimulus-computable, task-optimized model.
We demonstrate that our metric aligns with patterns of human reaction times for stimulus manipulations across four disparate visual decision-making tasks.
This work paves the way for exploring the temporal alignment of model and human visual strategies in the context of various other cognitive tasks.
- Score: 11.87006916768365
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The meteoric rise in the adoption of deep neural networks as computational
models of vision has inspired efforts to "align" these models with humans. One
dimension of interest for alignment includes behavioral choices, but moving
beyond characterizing choice patterns to capturing temporal aspects of visual
decision-making has been challenging. Here, we sketch a general-purpose
methodology to construct computational accounts of reaction times from a
stimulus-computable, task-optimized model. Specifically, we introduce a novel
metric leveraging insights from subjective logic theory summarizing evidence
accumulation in recurrent vision models. We demonstrate that our metric aligns
with patterns of human reaction times for stimulus manipulations across four
disparate visual decision-making tasks spanning perceptual grouping, mental
simulation, and scene categorization. This work paves the way for exploring the
temporal alignment of model and human visual strategies in the context of
various other cognitive tasks toward generating testable hypotheses for
neuroscience. Links to the code and data can be found on the project page:
https://serre-lab.github.io/rnn_rts_site.
Related papers
- Aligning Neuronal Coding of Dynamic Visual Scenes with Foundation Vision Models [2.790870674964473]
We propose Vi-ST, atemporal convolutional neural network fed with a self-supervised Vision Transformer (ViT)
Our proposed Vi-ST demonstrates a novel modeling framework for neuronal coding of dynamic visual scenes in the brain.
arXiv Detail & Related papers (2024-07-15T14:06:13Z) - Automatic Discovery of Visual Circuits [66.99553804855931]
We explore scalable methods for extracting the subgraph of a vision model's computational graph that underlies recognition of a specific visual concept.
We find that our approach extracts circuits that causally affect model output, and that editing these circuits can defend large pretrained models from adversarial attacks.
arXiv Detail & Related papers (2024-04-22T17:00:57Z) - Closely Interactive Human Reconstruction with Proxemics and Physics-Guided Adaption [64.07607726562841]
Existing multi-person human reconstruction approaches mainly focus on recovering accurate poses or avoiding penetration.
In this work, we tackle the task of reconstructing closely interactive humans from a monocular video.
We propose to leverage knowledge from proxemic behavior and physics to compensate the lack of visual information.
arXiv Detail & Related papers (2024-04-17T11:55:45Z) - Temporal Conditioning Spiking Latent Variable Models of the Neural
Response to Natural Visual Scenes [29.592870472342337]
This work presents the temporal conditioning spiking latent variable models (TeCoS-LVM) to simulate the neural response to natural visual stimuli.
We use spiking neurons to produce spike outputs that directly match the recorded trains.
We show that TeCoS-LVM models can produce more realistic spike activities and accurately fit spike statistics than powerful alternatives.
arXiv Detail & Related papers (2023-06-21T06:30:18Z) - Modelling Human Visual Motion Processing with Trainable Motion Energy
Sensing and a Self-attention Network [1.9458156037869137]
We propose an image-computable model of human motion perception by bridging the gap between biological and computer vision models.
This model architecture aims to capture the computations in V1-MT, the core structure for motion perception in the biological visual system.
In silico neurophysiology reveals that our model's unit responses are similar to mammalian neural recordings regarding motion pooling and speed tuning.
arXiv Detail & Related papers (2023-05-16T04:16:07Z) - Adapting Brain-Like Neural Networks for Modeling Cortical Visual
Prostheses [68.96380145211093]
Cortical prostheses are devices implanted in the visual cortex that attempt to restore lost vision by electrically stimulating neurons.
Currently, the vision provided by these devices is limited, and accurately predicting the visual percepts resulting from stimulation is an open challenge.
We propose to address this challenge by utilizing 'brain-like' convolutional neural networks (CNNs), which have emerged as promising models of the visual system.
arXiv Detail & Related papers (2022-09-27T17:33:19Z) - LatentHuman: Shape-and-Pose Disentangled Latent Representation for Human
Bodies [78.17425779503047]
We propose a novel neural implicit representation for the human body.
It is fully differentiable and optimizable with disentangled shape and pose latent spaces.
Our model can be trained and fine-tuned directly on non-watertight raw data with well-designed losses.
arXiv Detail & Related papers (2021-11-30T04:10:57Z) - Drop, Swap, and Generate: A Self-Supervised Approach for Generating
Neural Activity [33.06823702945747]
We introduce a novel unsupervised approach for learning disentangled representations of neural activity called Swap-VAE.
Our approach combines a generative modeling framework with an instance-specific alignment loss.
We show that it is possible to build representations that disentangle neural datasets along relevant latent dimensions linked to behavior.
arXiv Detail & Related papers (2021-11-03T16:39:43Z) - The Neural Coding Framework for Learning Generative Models [91.0357317238509]
We propose a novel neural generative model inspired by the theory of predictive processing in the brain.
In a similar way, artificial neurons in our generative model predict what neighboring neurons will do, and adjust their parameters based on how well the predictions matched reality.
arXiv Detail & Related papers (2020-12-07T01:20:38Z) - Fooling the primate brain with minimal, targeted image manipulation [67.78919304747498]
We propose an array of methods for creating minimal, targeted image perturbations that lead to changes in both neuronal activity and perception as reflected in behavior.
Our work shares the same goal with adversarial attack, namely the manipulation of images with minimal, targeted noise that leads ANN models to misclassify the images.
arXiv Detail & Related papers (2020-11-11T08:30:54Z) - A Meta-Bayesian Model of Intentional Visual Search [0.0]
We propose a computational model of visual search that incorporates Bayesian interpretations of the neural mechanisms that underlie categorical perception and saccade planning.
To enable meaningful comparisons between simulated and human behaviours, we employ a gaze-contingent paradigm that required participants to classify occluded MNIST digits through a window that followed their gaze.
Our model is able to recapitulate human behavioural metrics such as classification accuracy while retaining a high degree of interpretability, which we demonstrate by recovering subject-specific parameters from observed human behaviour.
arXiv Detail & Related papers (2020-06-05T16:10:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.