Neural Photofit: Gaze-based Mental Image Reconstruction
- URL: http://arxiv.org/abs/2108.07524v1
- Date: Tue, 17 Aug 2021 09:11:32 GMT
- Title: Neural Photofit: Gaze-based Mental Image Reconstruction
- Authors: Florian Strohm, Ekta Sood, Sven Mayer, Philipp Müller, Mihai Bâce, Andreas Bulling
- Abstract summary: We propose a novel method that leverages human fixations to visually decode the image a person has in mind into a photofit (facial composite).
Our method combines three neural networks: An encoder, a scoring network, and a decoder.
We show that our method significantly outperforms a mean baseline predictor and report on a human study that shows that we can decode photofits that are visually plausible and close to the observer's mental image.
- Score: 25.67771238116104
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We propose a novel method that leverages human fixations to visually decode
the image a person has in mind into a photofit (facial composite). Our method
combines three neural networks: An encoder, a scoring network, and a decoder.
The encoder extracts image features and predicts a neural activation map for
each face looked at by a human observer. A neural scoring network compares the
human and neural attention and predicts a relevance score for each extracted
image feature. Finally, image features are aggregated into a single feature
vector as a linear combination of all features weighted by relevance which a
decoder decodes into the final photofit. We train the neural scoring network on
a novel dataset containing gaze data of 19 participants looking at collages of
synthetic faces. We show that our method significantly outperforms a mean
baseline predictor and report on a human study that shows that we can decode
photofits that are visually plausible and close to the observer's mental image.
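The aggregation step described in the abstract (a single feature vector formed as a relevance-weighted linear combination of per-face features) can be sketched as follows. All names, shapes, and the score normalisation are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def aggregate_features(features, relevance_scores):
    """Combine per-face feature vectors into one vector for the decoder,
    weighted by the relevance score of each observed face.

    features: (n_faces, feature_dim) array of encoder outputs
    relevance_scores: (n_faces,) array from the scoring network
    """
    # Normalise scores so the weights sum to 1 (an assumption; the
    # abstract only states a linear combination weighted by relevance).
    weights = relevance_scores / relevance_scores.sum()
    # Weighted sum over faces -> a single aggregated feature vector.
    return weights @ features

# Toy example: 3 observed faces with 4-dimensional features.
feats = np.array([[1.0, 0.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 0.0]])
scores = np.array([2.0, 1.0, 1.0])
vec = aggregate_features(feats, scores)  # -> [0.5, 0.25, 0.25, 0.0]
```

In the paper's pipeline this aggregated vector would then be passed to the decoder network that renders the final photofit.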
Related papers
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
- Learning Multimodal Volumetric Features for Large-Scale Neuron Tracing [72.45257414889478]
We aim to reduce human workload by predicting connectivity between over-segmented neuron pieces.
We first construct a dataset, named FlyTracing, that contains millions of pairwise connections of segments expanding the whole fly brain.
We propose a novel connectivity-aware contrastive learning method to generate dense volumetric EM image embedding.
arXiv Detail & Related papers (2024-01-05T19:45:12Z)
- Null Space Properties of Neural Networks with Applications to Image Steganography [6.063583864878311]
The null space of a given neural network can tell us the part of the input data that makes no contribution to the final prediction.
One application described here leads to a method of image steganography.
arXiv Detail & Related papers (2024-01-01T03:32:28Z)
- Evaluating alignment between humans and neural network representations in image-based learning tasks [5.657101730705275]
We tested how well the representations of 86 pretrained neural network models mapped to human learning trajectories.
We found that while training dataset size was a core determinant of alignment with human choices, contrastive training with multi-modal data (text and imagery) was a common feature of currently publicly available models that predicted human generalisation.
In conclusion, pretrained neural networks can serve to extract representations for cognitive models, as they appear to capture some fundamental aspects of cognition that are transferable across tasks.
arXiv Detail & Related papers (2023-06-15T08:18:29Z)
- Ponder: Point Cloud Pre-training via Neural Rendering [93.34522605321514]
We propose a novel approach to self-supervised learning of point cloud representations by differentiable neural encoders.
The learned point cloud representation can be easily integrated into various downstream tasks, including not only high-level tasks like 3D detection and segmentation, but also low-level tasks like 3D reconstruction and image rendering.
arXiv Detail & Related papers (2022-12-31T08:58:39Z)
- Neural Novel Actor: Learning a Generalized Animatable Neural Representation for Human Actors [98.24047528960406]
We propose a new method for learning a generalized animatable neural representation from a sparse set of multi-view imagery of multiple persons.
The learned representation can be used to synthesize novel view images of an arbitrary person from a sparse set of cameras, and further animate them with the user's pose control.
arXiv Detail & Related papers (2022-08-25T07:36:46Z)
- The Brain-Inspired Decoder for Natural Visual Image Reconstruction [4.433315630787158]
We propose a deep learning neural network architecture with biological properties to reconstruct visual images from spike trains.
Our model is an end-to-end decoder from neural spike trains to images.
Our results show that our method can effectively combine receptive field features to reconstruct images.
arXiv Detail & Related papers (2022-07-18T13:31:26Z)
- Learning Compositional Representations for Effective Low-Shot Generalization [45.952867474500145]
We propose Recognition as Part Composition (RPC), an image encoding approach inspired by human cognition.
RPC encodes images by first decomposing them into salient parts, and then encoding each part as a mixture of a small number of prototypes.
We find that this type of learning can overcome hurdles faced by deep convolutional networks in low-shot generalization tasks.
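As a rough illustration of the part-and-prototype idea above, the sketch below encodes one part feature as a soft mixture over a small prototype set. The distance-based softmax weighting is an assumption chosen for clarity, not RPC's exact formulation.

```python
import numpy as np

def encode_part(part_feature, prototypes):
    """Encode a single part feature as mixture weights over prototypes.

    part_feature: (d,) feature vector for one salient part
    prototypes: (k, d) matrix of learned prototype vectors
    Returns a (k,) weight vector summing to 1.
    """
    # Softmax over negative Euclidean distances: closer prototypes
    # receive larger mixture weights (an illustrative choice).
    dists = np.linalg.norm(prototypes - part_feature, axis=1)
    weights = np.exp(-dists)
    return weights / weights.sum()

# Toy example: one 2-D part feature, three prototypes.
prototypes = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
part = np.array([0.9, 1.1])
w = encode_part(part, prototypes)
# The nearest prototype ([1.0, 1.0]) receives the largest weight.
```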
arXiv Detail & Related papers (2022-04-17T21:31:11Z)
- Neural Texture Extraction and Distribution for Controllable Person Image Synthesis [46.570170624026595]
We deal with controllable person image synthesis task which aims to re-render a human from a reference image with explicit control over body pose and appearance.
Observing that person images are highly structured, we propose to generate desired images by extracting and distributing semantic entities of reference images.
arXiv Detail & Related papers (2022-04-13T03:51:07Z)
- Neural Body: Implicit Neural Representations with Structured Latent Codes for Novel View Synthesis of Dynamic Humans [56.63912568777483]
This paper addresses the challenge of novel view synthesis for a human performer from a very sparse set of camera views.
We propose Neural Body, a new human body representation which assumes that the learned neural representations at different frames share the same set of latent codes anchored to a deformable mesh.
Experiments on ZJU-MoCap show that our approach outperforms prior works by a large margin in terms of novel view synthesis quality.
arXiv Detail & Related papers (2020-12-31T18:55:38Z)
- Neural Sparse Representation for Image Restoration [116.72107034624344]
Inspired by the robustness and efficiency of sparse coding based image restoration models, we investigate the sparsity of neurons in deep networks.
Our method structurally enforces sparsity constraints upon hidden neurons.
Experiments show that sparse representation is crucial in deep neural networks for multiple image restoration tasks.
arXiv Detail & Related papers (2020-06-08T05:15:17Z)
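The sparsity idea in the last entry can be illustrated with a generic top-k constraint on hidden activations; the paper's specific structural constraint differs, so treat this only as a sketch of the general technique.

```python
import numpy as np

def topk_sparsify(hidden, k):
    """Keep only the k largest-magnitude activations per sample,
    zeroing the rest (a generic sparsity constraint, not the
    paper's specific structural formulation).

    hidden: (batch, n_units) activation matrix
    """
    out = np.zeros_like(hidden)
    # Column indices of the k largest |activations| in each row.
    idx = np.argsort(-np.abs(hidden), axis=1)[:, :k]
    rows = np.arange(hidden.shape[0])[:, None]
    out[rows, idx] = hidden[rows, idx]
    return out

h = np.array([[0.1, -2.0, 0.5, 3.0]])
sparse_h = topk_sparsify(h, 2)  # keeps 3.0 and -2.0, zeros the rest
```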
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.