The Psychophysics of Human Three-Dimensional Active Visuospatial
Problem-Solving
- URL: http://arxiv.org/abs/2306.11756v1
- Date: Mon, 19 Jun 2023 19:36:42 GMT
- Title: The Psychophysics of Human Three-Dimensional Active Visuospatial
Problem-Solving
- Authors: Markus D. Solbach, John K. Tsotsos
- Abstract summary: Are two physical 3D objects visually the same?
Humans are remarkably good at this task without any training, with a mean accuracy of 93.82%.
No learning effect was observed on accuracy after many trials, but some effect was seen for response time, number of fixations and extent of head movement.
- Score: 12.805267089186533
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Our understanding of how visual systems detect, analyze and interpret visual
stimuli has advanced greatly. However, the visual systems of all animals do
much more; they enable visual behaviours. How well the visual system performs
while interacting with the visual environment and how vision is used in the
real world have not been well studied, especially in humans. It has been
suggested that comparison is the most primitive of psychophysical tasks. Thus,
as a probe into these active visual behaviours, we use a same-different task:
are two physical 3D objects visually the same? This task seems to be a
fundamental cognitive ability. We pose this question to human subjects who are
free to move about and examine two real objects in an actual 3D space. Past
work has dealt solely with a 2D static version of this problem. We have
collected detailed, first-of-its-kind data of humans performing a visuospatial
task in hundreds of trials. Strikingly, humans are remarkably good at this task
without any training, with a mean accuracy of 93.82%. No learning effect was
observed on accuracy after many trials, but some effect was seen for response
time, number of fixations and extent of head movement. Subjects demonstrated a
variety of complex strategies involving a range of movement and eye fixation
changes, suggesting that solutions were developed dynamically and tailored to
the specific task.
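The headline numbers above (93.82% mean accuracy; learning effects on response time and fixations but not on accuracy) can be made concrete with a small analysis sketch. The snippet below is not the authors' analysis pipeline; the per-trial field names and the use of a Spearman rank correlation to probe for trial-order effects are assumptions made purely for illustration.

    # Hypothetical per-trial analysis for a same-different experiment.
    # Assumed fields (not from the paper): trial_index, correct (0/1),
    # response_time_s, num_fixations, head_movement_m.
    import numpy as np
    from scipy import stats

    def analyze(trials):
        """trials: dict of equal-length 1-D NumPy arrays, one value per trial."""
        accuracy = trials["correct"].mean()  # the paper reports ~93.82% on average

        # Probe for learning effects: correlate trial order with each measure.
        # A reliable negative correlation suggests improvement with practice.
        effects = {}
        for measure in ("correct", "response_time_s", "num_fixations", "head_movement_m"):
            rho, p = stats.spearmanr(trials["trial_index"], trials[measure])
            effects[measure] = (rho, p)
        return accuracy, effects

    # Synthetic example: accuracy stays flat while response time slowly drops,
    # mimicking the qualitative pattern described in the abstract.
    rng = np.random.default_rng(0)
    n = 300
    trials = {
        "trial_index": np.arange(n),
        "correct": rng.binomial(1, 0.94, n).astype(float),
        "response_time_s": 20.0 - 0.02 * np.arange(n) + rng.normal(0.0, 2.0, n),
        "num_fixations": rng.poisson(40, n).astype(float),
        "head_movement_m": rng.gamma(2.0, 1.5, n),
    }
    accuracy, effects = analyze(trials)
    print(f"mean accuracy: {accuracy:.2%}")
    for name, (rho, p) in effects.items():
        print(f"{name}: Spearman rho = {rho:+.2f}, p = {p:.3g}")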
Related papers
- Brain3D: Generating 3D Objects from fMRI [76.41771117405973]
We design a novel 3D object representation learning method, Brain3D, that takes as input the fMRI data of a subject.
We show that our model captures the distinct functionalities of each region of the human vision system.
Preliminary evaluations indicate that Brain3D can successfully identify the disordered brain regions in simulated scenarios.
arXiv Detail & Related papers (2024-05-24T06:06:11Z)
- Neural feels with neural fields: Visuo-tactile perception for in-hand manipulation [57.60490773016364]
We combine vision and touch sensing on a multi-fingered hand to estimate an object's pose and shape during in-hand manipulation.
Our method, NeuralFeels, encodes object geometry by learning a neural field online and jointly tracks it by optimizing a pose graph problem.
Our results demonstrate that touch, at the very least, refines and, at the very best, disambiguates visual estimates during in-hand manipulation.
arXiv Detail & Related papers (2023-12-20T22:36:37Z)
- ScanERU: Interactive 3D Visual Grounding based on Embodied Reference Understanding [67.21613160846299]
A new task, Embodied Reference Understanding (ERU), is designed to address this problem.
A new dataset called ScanERU is constructed to evaluate the effectiveness of this idea.
arXiv Detail & Related papers (2023-03-23T11:36:14Z)
- See, Hear, and Feel: Smart Sensory Fusion for Robotic Manipulation [49.925499720323806]
We study how visual, auditory, and tactile perception can jointly help robots to solve complex manipulation tasks.
We build a robot system that can see with a camera, hear with a contact microphone, and feel with a vision-based tactile sensor.
arXiv Detail & Related papers (2022-12-07T18:55:53Z)
- Understanding top-down attention using task-oriented ablation design [0.22940141855172028]
Top-down attention allows neural networks, both artificial and biological, to focus on the information most relevant for a given task.
We aim to quantify how top-down attention contributes to task performance with a computational experiment based on a general framework called task-oriented ablation design.
We compare the performance of two neural networks, one with top-down attention and one without; a minimal sketch of such a gated-versus-ungated comparison appears after this list.
arXiv Detail & Related papers (2021-06-08T21:01:47Z)
- What Can You Learn from Your Muscles? Learning Visual Representation from Human Interactions [50.435861435121915]
We use human interaction and attention cues to investigate whether we can learn better representations compared to visual-only representations.
Our experiments show that our "muscly-supervised" representation outperforms a visual-only state-of-the-art method, MoCo.
arXiv Detail & Related papers (2020-10-16T17:46:53Z)
- View-invariant action recognition [3.553493344868414]
The varying pattern of spatio-temporal appearance generated by human action is key to identifying the action performed.
Research in view-invariant action recognition addresses the problem of recognizing human actions from unseen viewpoints.
arXiv Detail & Related papers (2020-09-01T18:08:46Z)
- VisualEchoes: Spatial Image Representation Learning through Echolocation [97.23789910400387]
Several animal species (e.g., bats, dolphins, and whales) and even visually impaired humans have the remarkable ability to perform echolocation.
We propose a novel interaction-based representation learning framework that learns useful visual features via echolocation.
Our work opens a new path for representation learning for embodied agents, where supervision comes from interacting with the physical world.
arXiv Detail & Related papers (2020-05-04T16:16:58Z)
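As referenced in the top-down attention entry above, the with/without comparison of a task-driven gating signal is easy to illustrate. The sketch below shows a generic channel-gating module attached to a tiny backbone; it is not the task-oriented ablation design framework from that paper, and all names, dimensions, and the PyTorch dependency are assumptions made for this illustration.

    # Hypothetical illustration of comparing a network with and without a
    # top-down (task-conditioned) gating signal. Not the paper's framework;
    # module names and sizes are invented for this sketch.
    import torch
    import torch.nn as nn

    class TopDownGate(nn.Module):
        """Maps a task embedding to per-channel gains in (0, 1)."""
        def __init__(self, task_dim: int, channels: int):
            super().__init__()
            self.proj = nn.Sequential(nn.Linear(task_dim, channels), nn.Sigmoid())

        def forward(self, features, task_embedding):
            # features: (B, C, H, W); task_embedding: (B, task_dim)
            gains = self.proj(task_embedding)            # (B, C)
            return features * gains[:, :, None, None]    # channel-wise reweighting

    class TinyClassifier(nn.Module):
        def __init__(self, task_dim=8, channels=16, num_classes=10, use_top_down=True):
            super().__init__()
            self.backbone = nn.Conv2d(3, channels, kernel_size=3, padding=1)
            self.gate = TopDownGate(task_dim, channels) if use_top_down else None
            self.head = nn.Linear(channels, num_classes)

        def forward(self, image, task_embedding):
            feats = torch.relu(self.backbone(image))
            if self.gate is not None:                    # ablated variant skips gating
                feats = self.gate(feats, task_embedding)
            return self.head(feats.mean(dim=(2, 3)))     # global average pooling

    # The two variants can then be trained and evaluated on the same task to
    # isolate the contribution of the top-down signal.
    x, task = torch.randn(4, 3, 32, 32), torch.randn(4, 8)
    logits_with = TinyClassifier(use_top_down=True)(x, task)
    logits_without = TinyClassifier(use_top_down=False)(x, task)
    print(logits_with.shape, logits_without.shape)       # torch.Size([4, 10]) twice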