Sensorimotor Visual Perception on Embodied System Using Free Energy
Principle
- URL: http://arxiv.org/abs/2006.06192v2
- Date: Tue, 22 Feb 2022 01:46:35 GMT
- Title: Sensorimotor Visual Perception on Embodied System Using Free Energy
Principle
- Authors: Kanako Esaki, Tadayuki Matsumura, Kiyoto Ito and Hiroyuki Mizuno
- Abstract summary: We propose an embodied system based on the free energy principle (FEP) for sensorimotor visual perception.
We evaluate it in a character-recognition task using the MNIST dataset.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose an embodied system based on the free energy principle
(FEP) for sensorimotor visual perception and evaluate it in a
character-recognition task using the MNIST dataset. Although the FEP
mathematically describes a rule that living things obey, claiming that a
biological system continually updates its internal models and behaviors to
minimize the error in predicting sensory input, the principle alone is not
enough to model sensorimotor visual perception; embodiment of the system is
the key to achieving it. The proposed embodied system consists of a body and
a memory. The body has an ocular motor system that controls the direction of
eye gaze, so the eye can observe only a small focused area of the
environment. The memory is not photographic; it is a generative model,
implemented with a variational autoencoder, that contains class-labeled prior
knowledge about the environment. By limiting the abilities of the body and
memory and operating according to the FEP, the embodied system repeatedly
takes actions to obtain the next sensory input, guided by the various
possible future sensory inputs. In the evaluation, the inference of the
environment was represented as an approximate posterior distribution over the
characters 0 - 9. As the number of repetitions increased, the attention area
moved continuously and the uncertainty about the character gradually
decreased, until the correct character finally had the highest probability.
Changing the initial attention position yields a different final
distribution, suggesting that the proposed system exhibits a confirmation
bias.
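The repeated inference described in the abstract can be sketched as a Bayesian belief update over the ten character classes. The following is a minimal toy sketch in Python (NumPy), not the paper's method: the `glimpse_likelihood` function and the digit `templates` are hypothetical stand-ins for the VAE-based generative model, and random glimpses replace FEP-driven gaze selection.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 4-pixel "templates" for digits 0-9; in the paper this prior
# knowledge lives inside a class-conditional VAE, not a lookup table.
templates = rng.normal(size=(10, 4))

def glimpse_likelihood(glimpse, digit):
    """Toy stand-in for p(glimpse | digit): a Gaussian likelihood that
    peaks when the observed glimpse matches the digit's template."""
    return np.exp(-0.5 * np.sum((glimpse - templates[digit]) ** 2))

def update_posterior(posterior, glimpse):
    """Bayesian update of the approximate posterior over characters 0-9,
    analogous to reducing uncertainty at each perception-action step."""
    likelihoods = np.array([glimpse_likelihood(glimpse, d) for d in range(10)])
    posterior = posterior * likelihoods
    return posterior / posterior.sum()

# Start from a uniform prior and take repeated noisy glimpses of digit 3.
posterior = np.full(10, 0.1)
true_digit = 3
for step in range(20):
    glimpse = templates[true_digit] + rng.normal(scale=0.5, size=4)
    posterior = update_posterior(posterior, glimpse)

# With enough glimpses, the posterior concentrates on the true digit.
print(posterior.argmax())
```

As in the paper's evaluation, the uncertainty shrinks over repetitions until one class dominates; because the update multiplies the running posterior by each new likelihood, an early misleading glimpse biases later beliefs, which is one way to read the confirmation-bias observation.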
Related papers
- Emotion Recognition from the perspective of Activity Recognition [0.0]
Appraising human emotional states, behaviors, and reactions displayed in real-world settings can be accomplished using latent continuous dimensions.
For emotion recognition systems to be deployed and integrated into real-world mobile and computing devices, we need to consider data collected in the wild.
We propose a novel three-stream end-to-end deep learning regression pipeline with an attention mechanism.
arXiv Detail & Related papers (2024-03-24T18:53:57Z)
- Exploring Contextual Representation and Multi-Modality for End-to-End Autonomous Driving [58.879758550901364]
Recent perception systems enhance spatial understanding with sensor fusion but often lack full environmental context.
We introduce a framework that integrates three cameras to emulate the human field of view, coupled with top-down bird's-eye-view semantic data to enhance contextual representation.
Our method achieves a displacement error of 0.67 m in open-loop settings, surpassing current methods by 6.9% on the nuScenes dataset.
arXiv Detail & Related papers (2022-10-13T05:56:20Z)
- SSTN: Self-Supervised Domain Adaptation Thermal Object Detection for Autonomous Driving [6.810856082577402]
We propose a deep neural network, the Self-Supervised Thermal Network (SSTN), that learns a feature embedding maximizing the mutual information between the visible and infrared spectrum domains via contrastive learning.
The proposed method is extensively evaluated on the two publicly available datasets: the FLIR-ADAS dataset and the KAIST Multi-Spectral dataset.
arXiv Detail & Related papers (2021-03-04T16:42:49Z)
- Energy Aware Deep Reinforcement Learning Scheduling for Sensors Correlated in Time and Space [62.39318039798564]
We propose a scheduling mechanism capable of taking advantage of correlated information.
The proposed mechanism is capable of determining the frequency with which sensors should transmit their updates.
We show that our solution can significantly extend the sensors' lifetime.
arXiv Detail & Related papers (2020-11-19T09:53:27Z)
- AEGIS: A real-time multimodal augmented reality computer vision based system to assist facial expression recognition for individuals with autism spectrum disorder [93.0013343535411]
This paper presents the development of a multimodal augmented reality (AR) system that combines computer vision and deep convolutional neural networks (CNNs).
The proposed system, which we call AEGIS, is an assistive technology deployable on a variety of user devices including tablets, smartphones, video conference systems, or smartglasses.
We leverage both spatial and temporal information in order to provide an accurate expression prediction, which is then converted into its corresponding visualization and drawn on top of the original video frame.
arXiv Detail & Related papers (2020-10-22T17:20:38Z)
- Semantics-aware Adaptive Knowledge Distillation for Sensor-to-Vision Action Recognition [131.6328804788164]
We propose a framework, named Semantics-aware Adaptive Knowledge Distillation Networks (SAKDN), to enhance action recognition in the vision-sensor modality (videos).
The SAKDN uses multiple wearable-sensors as teacher modalities and uses RGB videos as student modality.
arXiv Detail & Related papers (2020-09-01T03:38:31Z)
- Evaluating the Apperception Engine [31.071555696874054]
The Apperception Engine is an unsupervised learning system.
It constructs a symbolic causal theory that both explains the sensory sequence and satisfies a set of unity conditions.
It can be applied to predict future sensor readings, retrodict earlier readings, or impute missing readings.
arXiv Detail & Related papers (2020-07-09T11:54:05Z)
- Causal Discovery in Physical Systems from Videos [123.79211190669821]
Causal discovery is at the core of human cognition.
We consider the task of causal discovery from videos in an end-to-end fashion without supervision on the ground-truth graph structure.
arXiv Detail & Related papers (2020-07-01T17:29:57Z)
- Towards a self-organizing pre-symbolic neural model representing sensorimotor primitives [15.364871660385155]
The acquisition of symbolic and linguistic representations of sensorimotor behavior is a cognitive process performed by an agent.
We propose a model that relates the conceptualization of the higher-level information from visual stimuli to the development of ventral/dorsal visual streams.
We exemplify this model through a robot passively observing an object to learn its features and movements.
arXiv Detail & Related papers (2020-06-20T01:58:28Z)
- Visualizing and Understanding Vision System [0.6510507449705342]
We use a vision recognition-reconstruction network (RRN) to investigate development, recognition, learning, and forgetting mechanisms.
In a digit recognition study, we observe that the RRN can maintain an object-invariant representation under various viewing conditions.
In the learning and forgetting study, novel structure recognition is achieved by adjusting all synapses by a small magnitude while preserving the pattern specificities of the original synaptic connectivity.
arXiv Detail & Related papers (2020-06-11T07:08:49Z)
- Continuous Emotion Recognition via Deep Convolutional Autoencoder and Support Vector Regressor [70.2226417364135]
It is crucial that the machine be able to recognize the emotional state of the user with high accuracy.
Deep neural networks have been used with great success in recognizing emotions.
We present a new model for continuous emotion recognition based on facial expression recognition.
arXiv Detail & Related papers (2020-01-31T17:47:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.