Generating Memorable Images Based on Human Visual Memory Schemas
- URL: http://arxiv.org/abs/2005.02969v1
- Date: Wed, 6 May 2020 17:23:44 GMT
- Title: Generating Memorable Images Based on Human Visual Memory Schemas
- Authors: Cameron Kyle-Davidson, Adrian G. Bors, Karla K. Evans
- Abstract summary: This research study proposes using Generative Adversarial Networks (GAN) to generate memorable or non-memorable images of scenes.
The memorability of the generated images is evaluated by modelling Visual Memory Schemas (VMS), which correspond to mental representations that human observers use to encode an image into memory.
- Score: 9.986390874391095
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This research study proposes using Generative Adversarial Networks (GAN) that
incorporate a two-dimensional measure of human memorability to generate
memorable or non-memorable images of scenes. The memorability of the generated
images is evaluated by modelling Visual Memory Schemas (VMS), which correspond
to mental representations that human observers use to encode an image into
memory. The VMS model is based upon the results of memory experiments conducted
on human observers, and provides a 2D map of memorability. We impose a
memorability constraint upon the latent space of a GAN by employing a VMS map
prediction model as an auxiliary loss. We assess the difference in memorability
between images generated to be memorable or non-memorable through an
independent computational measure of memorability, and additionally assess the
effect of memorability on the realness of the generated images.
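To make the mechanism concrete, the following is a minimal PyTorch-style sketch of the idea described in the abstract, not the authors' implementation: a frozen VMS-map predictor supplies an auxiliary loss that pushes the generator towards images whose predicted 2D memorability map matches a chosen target map (high values for a memorable image, low for a non-memorable one). The module names, the mean-squared-error form of the auxiliary term, and the weighting `lambda_mem` are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def generator_step(generator, discriminator, vms_predictor, opt_g,
                   z, target_vms, lambda_mem=10.0):
    """One generator update with an auxiliary VMS-map memorability loss (sketch)."""
    opt_g.zero_grad()
    fake = generator(z)                       # images decoded from latent codes z

    # Standard non-saturating adversarial loss on the generated images.
    logits_fake = discriminator(fake)
    adv_loss = F.binary_cross_entropy_with_logits(
        logits_fake, torch.ones_like(logits_fake))

    # Auxiliary memorability loss: a frozen VMS-map predictor maps each image to
    # a 2D memorability map, which is pushed toward the requested target map
    # (high values for "memorable", low for "non-memorable").
    pred_vms = vms_predictor(fake)
    mem_loss = F.mse_loss(pred_vms, target_vms)

    loss = adv_loss + lambda_mem * mem_loss
    loss.backward()
    opt_g.step()
    return adv_loss.item(), mem_loss.item()
```

In this sketch the balance between adversarial realism and the memorability constraint is set by `lambda_mem`, which loosely mirrors the trade-off the abstract evaluates between memorability and the realness of the generated images.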
Related papers
- Modeling Visual Memorability Assessment with Autoencoders Reveals Characteristics of Memorable Images [2.4861619769660637]
Image memorability refers to the phenomenon where certain images are more likely to be remembered than others.
We modeled the subjective experience of visual memorability using an autoencoder based on the VGG16 Convolutional Neural Network (CNN).
We investigated the relationship between memorability and reconstruction error, assessed latent space representations distinctiveness, and developed a Gated Recurrent Unit (GRU) model to predict memorability likelihood.
arXiv Detail & Related papers (2024-10-19T22:58:33Z)
- When Does Perceptual Alignment Benefit Vision Representations? [76.32336818860965]
We investigate how aligning vision model representations to human perceptual judgments impacts their usability.
We find that aligning models to perceptual judgments yields representations that improve upon the original backbones across many downstream tasks.
Our results suggest that injecting an inductive bias about human perceptual knowledge into vision models can contribute to better representations.
arXiv Detail & Related papers (2024-10-14T17:59:58Z)
- Mind-to-Image: Projecting Visual Mental Imagination of the Brain from fMRI [36.181302575642306]
Reconstructing visual imagination presents a greater challenge, with potentially revolutionary applications.
For the first time, we have compiled a substantial dataset (around 6h of scans) on visual imagery.
We train a modified version of an fMRI-to-image model and demonstrate the feasibility of reconstructing images from two modes of imagination.
arXiv Detail & Related papers (2024-04-08T12:46:39Z)
- Psychometry: An Omnifit Model for Image Reconstruction from Human Brain Activity [60.983327742457995]
Reconstructing the viewed images from human brain activity bridges human and computer vision through the Brain-Computer Interface.
We devise Psychometry, an omnifit model for reconstructing images from functional Magnetic Resonance Imaging (fMRI) obtained from different subjects.
arXiv Detail & Related papers (2024-03-29T07:16:34Z)
- Decoding Realistic Images from Brain Activity with Contrastive Self-supervision and Latent Diffusion [29.335943994256052]
Reconstructing visual stimuli from human brain activities provides a promising opportunity to advance our understanding of the brain's visual system.
We propose a two-phase framework named Contrast and Diffuse (CnD) to decode realistic images from functional magnetic resonance imaging (fMRI) recordings.
arXiv Detail & Related papers (2023-09-30T09:15:22Z)
- Controllable Mind Visual Diffusion Model [58.83896307930354]
Brain signal visualization has emerged as an active research area, serving as a critical interface between the human visual system and computer vision models.
We propose a novel approach, referred to as the Controllable Mind Visual Diffusion Model (CMVDM).
CMVDM extracts semantic and silhouette information from fMRI data using attribute alignment and assistant networks.
We then leverage a control model to fully exploit the extracted information for image synthesis, resulting in generated images that closely resemble the visual stimuli in terms of semantics and silhouette.
arXiv Detail & Related papers (2023-05-17T11:36:40Z)
- Improving Image Recognition by Retrieving from Web-Scale Image-Text Data [68.63453336523318]
We introduce an attention-based memory module, which learns the importance of each retrieved example from the memory.
Compared to existing approaches, our method removes the influence of the irrelevant retrieved examples, and retains those that are beneficial to the input query.
We show that it achieves state-of-the-art accuracy on the ImageNet-LT, Places-LT and WebVision datasets.
arXiv Detail & Related papers (2023-04-11T12:12:05Z)
- Seeing Beyond the Brain: Conditional Diffusion Model with Sparse Masked Modeling for Vision Decoding [0.0]
We present MinD-Vis: Sparse Masked Brain Modeling with Double-Conditioned Latent Diffusion Model for Human Vision Decoding.
We show that MinD-Vis can reconstruct highly plausible images with semantically matching details from brain recordings using very few paired annotations.
arXiv Detail & Related papers (2022-11-13T17:04:05Z)
- A domain adaptive deep learning solution for scanpath prediction of paintings [66.46953851227454]
This paper focuses on the eye-movement analysis of viewers during the visual experience of a certain number of paintings.
We introduce a new approach to predicting human visual attention, which influences several human cognitive functions.
The proposed new architecture ingests images and returns scanpaths, a sequence of points featuring a high likelihood of catching viewers' attention.
arXiv Detail & Related papers (2022-09-22T22:27:08Z)
- HM4: Hidden Markov Model with Memory Management for Visual Place Recognition [54.051025148533554]
We develop a Hidden Markov Model approach for visual place recognition in autonomous driving.
Our algorithm, dubbed HM4, exploits temporal look-ahead to transfer promising candidate images between passive storage and active memory.
We show that this allows constant time and space inference for a fixed coverage area; an illustrative sketch of the active/passive memory idea follows this list.
arXiv Detail & Related papers (2020-11-01T08:49:24Z)
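As a rough illustration of the active/passive memory idea summarised in the HM4 entry above (and not the authors' algorithm), the toy loop below keeps a bounded active set of reference descriptors, performs an HMM-style belief update against the current query, and uses a short temporal look-ahead to promote likely upcoming places from passive storage while evicting the least probable ones. All names, the cosine-similarity observation model, and the eviction rule are assumptions.

```python
import numpy as np

class ToyActivePassivePlaceRecognizer:
    """Toy HMM-style place recognition with a bounded active memory.

    `passive_storage` is a list of (place_id, descriptor) pairs kept on cheap
    storage; only the indices in `self.active` are scored for each query.
    """

    def __init__(self, passive_storage, active_size=32, lookahead=5):
        self.passive = passive_storage
        self.active = list(range(min(active_size, len(passive_storage))))
        self.active_size = active_size
        self.lookahead = lookahead
        n = len(passive_storage)
        self.belief = np.full(n, 1.0 / n)        # uniform prior over places

    def _likelihood(self, query, idx):
        # Assumed observation model: cosine similarity between descriptors.
        _, ref = self.passive[idx]
        sim = float(query @ ref) / (np.linalg.norm(query) * np.linalg.norm(ref) + 1e-9)
        return max(sim, 1e-6)

    def step(self, query):
        # Correct the belief using only the places currently in active memory.
        for i in self.active:
            self.belief[i] *= self._likelihood(query, i)
        self.belief /= self.belief.sum()

        # Temporal look-ahead: places a few steps ahead of the best hypothesis
        # are likely to be visited next, so promote them from passive storage
        # into active memory and evict the least probable entries.
        best = int(np.argmax(self.belief))
        upcoming = [min(best + k, len(self.passive) - 1)
                    for k in range(1, self.lookahead + 1)]
        self.active = sorted(set(self.active) | set(upcoming),
                             key=lambda i: self.belief[i],
                             reverse=True)[: self.active_size]
        return self.passive[best][0]
```

In this toy version the per-query cost comes only from scoring the fixed-size active set, a loose analogue of the constant time and space inference the entry claims for a fixed coverage area.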