Visual Knowledge Tracing
- URL: http://arxiv.org/abs/2207.10157v2
- Date: Fri, 22 Jul 2022 00:28:40 GMT
- Title: Visual Knowledge Tracing
- Authors: Neehar Kondapaneni, Pietro Perona, Oisin Mac Aodha
- Abstract summary: We propose a novel task of tracing the evolving classification behavior of human learners.
We propose models that jointly extract the visual features used by learners and predict the classification functions they utilize.
Our results show that our recurrent models are able to predict the classification behavior of human learners on three challenging medical image and species identification tasks.
- Score: 26.446317829793454
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Each year, thousands of people learn new visual categorization tasks --
radiologists learn to recognize tumors, birdwatchers learn to distinguish
similar species, and crowd workers learn how to annotate valuable data for
applications like autonomous driving. As humans learn, their brains update the
visual features they extract and attend to, which ultimately informs their final
classification decisions. In this work, we propose a novel task of tracing the
evolving classification behavior of human learners as they engage in
challenging visual classification tasks. We propose models that jointly extract
the visual features used by learners and predict the classification
functions they utilize. We collect three challenging new datasets from real
human learners in order to evaluate the performance of different visual
knowledge tracing methods. Our results show that our recurrent models are able
to predict the classification behavior of human learners on three challenging
medical image and species identification tasks.
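The abstract describes recurrent models whose hidden state tracks a learner's evolving classifier, updated after each trial and used to predict the learner's next response. The following is a minimal sketch of that idea; the feature dimensions, the gradient-step state update, and the toy data are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 8-D visual features, a binary (2-class) task.
FEAT_DIM, N_CLASSES, N_TRIALS = 8, 2, 50

def trace_learner(features, feedback_labels, lr=0.5):
    """Sketch of visual knowledge tracing: a per-learner weight matrix W
    acts as the recurrent hidden state. Before each trial it predicts the
    learner's response; after each trial it is updated using the feedback
    the learner received (a gradient step stands in for the learned RNN
    transition in the paper)."""
    W = np.zeros((N_CLASSES, FEAT_DIM))  # learner's current classifier
    predicted = []
    for x, y in zip(features, feedback_labels):
        logits = W @ x
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        predicted.append(int(probs.argmax()))  # predicted learner response
        grad = probs.copy()
        grad[y] -= 1.0                         # softmax cross-entropy gradient
        W -= lr * np.outer(grad, x)            # recurrent state update
    return np.array(predicted), W

# Toy ground-truth rule: the class is the sign of the first feature.
features = rng.normal(size=(N_TRIALS, FEAT_DIM))
labels = (features[:, 0] > 0).astype(int)
preds, W = trace_learner(features, labels)
late_accuracy = (preds[25:] == labels[25:]).mean()
```

On this toy task, the traced state converges toward the ground-truth rule, so predictions on later trials agree with the feedback labels more often than early ones, mimicking a learner improving with practice.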
Related papers
- Evaluating Multiview Object Consistency in Humans and Image Models [68.36073530804296]
We leverage an experimental design from the cognitive sciences which requires zero-shot visual inferences about object shape.
We collect 35K trials of behavioral data from over 500 participants.
We then evaluate the performance of common vision models.
arXiv Detail & Related papers (2024-09-09T17:59:13Z)
- Wills Aligner: A Robust Multi-Subject Brain Representation Learner [19.538200208523467]
We introduce Wills Aligner, a robust multi-subject brain representation learner.
Wills Aligner initially aligns different subjects' brains at the anatomical level.
It incorporates a mixture of brain experts to learn individual cognition patterns.
arXiv Detail & Related papers (2024-04-20T06:01:09Z)
- Knowledge-Aware Prompt Tuning for Generalizable Vision-Language Models [64.24227572048075]
We propose a Knowledge-Aware Prompt Tuning (KAPT) framework for vision-language models.
Our approach takes inspiration from human intelligence in which external knowledge is usually incorporated into recognizing novel categories of objects.
arXiv Detail & Related papers (2023-08-22T04:24:45Z)
- Evaluating alignment between humans and neural network representations in image-based learning tasks [5.657101730705275]
We tested how well the representations of 86 pretrained neural network models mapped to human learning trajectories.
We found that while training dataset size was a core determinant of alignment with human choices, contrastive training with multi-modal data (text and imagery) was a common feature of currently publicly available models that predicted human generalisation.
In conclusion, pretrained neural networks can serve to extract representations for cognitive models, as they appear to capture some fundamental aspects of cognition that are transferable across tasks.
arXiv Detail & Related papers (2023-06-15T08:18:29Z)
- Learning Transferable Pedestrian Representation from Multimodal Information Supervision [174.5150760804929]
VAL-PAT is a novel framework that learns transferable representations to enhance various pedestrian analysis tasks with multimodal information.
We first perform pre-training on LUPerson-TA dataset, where each image contains text and attribute annotations.
We then transfer the learned representations to various downstream tasks, including person reID, person attribute recognition and text-based person search.
arXiv Detail & Related papers (2023-04-12T01:20:58Z)
- Continual Learning with Bayesian Model based on a Fixed Pre-trained Feature Extractor [55.9023096444383]
Current deep learning models are characterised by catastrophic forgetting of old knowledge when learning new classes.
Inspired by the process of learning new knowledge in human brains, we propose a Bayesian generative model for continual learning.
arXiv Detail & Related papers (2022-04-28T08:41:51Z)
- Challenges and Opportunities for Machine Learning Classification of Behavior and Mental State from Images [3.7445390865272588]
Computer Vision (CV) classifiers distinguish and detect nonverbal social human behavior and mental state.
There are several pain points which arise when attempting this process for behavioral phenotyping.
We discuss current state-of-the-art research endeavors in CV such as data curation, data augmentation, crowdsourced labeling, active learning, reinforcement learning, generative models, representation learning, federated learning, and meta-learning.
arXiv Detail & Related papers (2022-01-26T21:35:17Z)
- SEGA: Semantic Guided Attention on Visual Prototype for Few-Shot Learning [85.2093650907943]
We propose SEmantic Guided Attention (SEGA) to teach machines to recognize a new category.
SEGA uses semantic knowledge to guide the visual perception in a top-down manner about what visual features should be paid attention to.
We show that our semantic guided attention realizes anticipated function and outperforms state-of-the-art results.
arXiv Detail & Related papers (2021-11-08T08:03:44Z)
- Passive attention in artificial neural networks predicts human visual selectivity [8.50463394182796]
We show that passive attention techniques reveal a significant overlap with human visual selectivity estimates.
We validate these correlational results with causal manipulations using recognition experiments.
This work contributes a new approach to evaluating the biological and psychological validity of leading ANNs as models of human vision.
arXiv Detail & Related papers (2021-07-14T21:21:48Z)
- Classifying Eye-Tracking Data Using Saliency Maps [8.524684315458245]
This paper proposes a visual saliency based novel feature extraction method for automatic and quantitative classification of eye-tracking data.
Comparing the saliency amplitudes, similarity and dissimilarity of saliency maps with the corresponding eye fixations maps gives an extra dimension of information which is effectively utilized to generate discriminative features to classify the eye-tracking data.
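The summary above describes deriving features by comparing saliency amplitudes and the similarity/dissimilarity of saliency maps against fixation maps. A minimal sketch of that kind of feature extraction (the specific feature choices here — histogram intersection, Pearson correlation, fixation-weighted amplitude — are illustrative assumptions, not the paper's exact feature set):

```python
import numpy as np

def saliency_fixation_features(saliency, fixations):
    """Compare a saliency map with a fixation-density map of the same
    shape and return a small discriminative feature vector."""
    s = saliency / (saliency.sum() + 1e-12)    # normalize to distributions
    f = fixations / (fixations.sum() + 1e-12)
    similarity = np.minimum(s, f).sum()        # histogram intersection, in [0, 1]
    correlation = np.corrcoef(s.ravel(), f.ravel())[0, 1]
    amp_at_fix = (saliency * f).sum()          # saliency amplitude at fixated pixels
    return np.array([similarity, correlation, amp_at_fix])

# Toy example: fixations that closely track the saliency map.
rng = np.random.default_rng(1)
sal = rng.random((32, 32))
fix = sal + 0.1 * rng.random((32, 32))
feats = saliency_fixation_features(sal, fix)
```

Feature vectors like `feats`, computed per recording, could then be fed to any off-the-shelf classifier.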
arXiv Detail & Related papers (2020-10-24T15:18:07Z)
- What Can You Learn from Your Muscles? Learning Visual Representation from Human Interactions [50.435861435121915]
We use human interaction and attention cues to investigate whether we can learn better representations compared to visual-only representations.
Our experiments show that our "muscly-supervised" representation outperforms a visual-only state-of-the-art method MoCo.
arXiv Detail & Related papers (2020-10-16T17:46:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.