A Benchmark for Modeling Violation-of-Expectation in Physical Reasoning
Across Event Categories
- URL: http://arxiv.org/abs/2111.08826v1
- Date: Tue, 16 Nov 2021 22:59:25 GMT
- Title: A Benchmark for Modeling Violation-of-Expectation in Physical Reasoning
Across Event Categories
- Authors: Arijit Dasgupta, Jiafei Duan, Marcelo H. Ang Jr, Yi Lin, Su-hua Wang,
Renée Baillargeon, Cheston Tan
- Abstract summary: Violation-of-Expectation (VoE) is used to label scenes as either expected or surprising with knowledge of only expected scenes.
Existing VoE-based 3D datasets in physical reasoning provide mainly vision data with little to no heuristics or inductive biases.
We set up a benchmark to study physical reasoning by curating a novel large-scale synthetic 3D VoE dataset armed with ground-truth labels of causally relevant features and rules.
- Score: 4.4920673251997885
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent work in computer vision and cognitive reasoning has given rise to an
increasing adoption of the Violation-of-Expectation (VoE) paradigm in synthetic
datasets. Inspired by infant psychology, researchers are now evaluating a
model's ability to label scenes as either expected or surprising with knowledge
of only expected scenes. However, existing VoE-based 3D datasets in physical
reasoning provide mainly vision data with little to no heuristics or inductive
biases. Cognitive models of physical reasoning reveal that infants create high-level
abstract representations of objects and interactions. Capitalizing on this
knowledge, we established a benchmark to study physical reasoning by curating a
novel large-scale synthetic 3D VoE dataset armed with ground-truth heuristic
labels of causally relevant features and rules. To validate our dataset in five
event categories of physical reasoning, we benchmarked and analyzed human
performance. We also proposed the Object File Physical Reasoning Network
(OFPR-Net) which exploits the dataset's novel heuristics to outperform our
baseline and ablation models. The OFPR-Net is also flexible in learning an
alternate physical reality, showcasing its ability to learn universal causal
relationships in physical reasoning to create systems with better
interpretability.
Related papers
- Compositional Physical Reasoning of Objects and Events from Videos [122.6862357340911]
This paper addresses the challenge of inferring hidden physical properties from objects' motion and interactions.
We evaluate state-of-the-art video reasoning models on ComPhy and reveal their limited ability to capture these hidden properties.
We also propose a novel neuro-symbolic framework, Physical Concept Reasoner (PCR), that learns and reasons about both visible and hidden physical properties.
arXiv Detail & Related papers (2024-08-02T15:19:55Z)
- Learning Discrete Concepts in Latent Hierarchical Models [73.01229236386148]
Learning concepts from natural high-dimensional data holds potential in building human-aligned and interpretable machine learning models.
We formalize concepts as discrete latent causal variables that are related via a hierarchical causal model.
We substantiate our theoretical claims with synthetic data experiments.
arXiv Detail & Related papers (2024-06-01T18:01:03Z)
- ContPhy: Continuum Physical Concept Learning and Reasoning from Videos [86.63174804149216]
ContPhy is a novel benchmark for assessing machine physical commonsense.
We evaluated a range of AI models and found that they still struggle to achieve satisfactory performance on ContPhy.
We also introduce an oracle model (ContPRO) that marries the particle-based physical dynamic models with the recent large language models.
arXiv Detail & Related papers (2024-02-09T01:09:21Z)
- Visual cognition in multimodal large language models [12.603212933816206]
Recent advancements in multimodal large language models have rekindled interest in their potential to emulate human-like cognitive abilities.
This paper evaluates the current state of vision-based large language models in the domains of intuitive physics, causal reasoning, and intuitive psychology.
arXiv Detail & Related papers (2023-11-27T18:58:34Z)
- X-VoE: Measuring eXplanatory Violation of Expectation in Physical Events [75.94926117990435]
This study introduces X-VoE, a benchmark dataset to assess AI agents' grasp of intuitive physics.
X-VoE establishes a higher bar for the explanatory capacities of intuitive physics models.
We present an explanation-based learning system that captures physics dynamics and infers occluded object states.
arXiv Detail & Related papers (2023-08-21T03:28:23Z)
- InDL: A New Dataset and Benchmark for In-Diagram Logic Interpretation based on Visual Illusion [1.7980584146314789]
This paper introduces a novel approach to evaluating deep learning models' capacity for in-diagram logic interpretation.
We establish a unique dataset, InDL, designed to rigorously test and benchmark these models.
We utilize six classic geometric optical illusions to create a comparative framework between human and machine visual perception.
arXiv Detail & Related papers (2023-05-28T13:01:32Z)
- PTR: A Benchmark for Part-based Conceptual, Relational, and Physical Reasoning [135.2892665079159]
We introduce a new large-scale diagnostic visual reasoning dataset named PTR.
PTR contains around 70k RGBD synthetic images with ground truth object and part level annotations.
We examine several state-of-the-art visual reasoning models on this dataset and observe that they still make many surprising mistakes.
arXiv Detail & Related papers (2021-12-09T18:59:34Z)
- AVoE: A Synthetic 3D Dataset on Understanding Violation of Expectation for Artificial Cognition [2.561649173827544]
Violation-of-Expectation (VoE) is used to evaluate models' ability to discriminate between expected and surprising scenes.
Existing VoE-based 3D datasets in physical reasoning only provide vision data.
We propose AVoE: a synthetic 3D VoE-based dataset that presents stimuli from multiple novel sub-categories for five event categories of physical reasoning.
arXiv Detail & Related papers (2021-10-12T08:59:19Z)
- Physion: Evaluating Physical Prediction from Vision in Humans and Machines [46.19008633309041]
We present a visual and physical prediction benchmark that precisely measures this capability.
We compare an array of algorithms on their ability to make diverse physical predictions.
We find that graph neural networks with access to the physical state best capture human behavior.
arXiv Detail & Related papers (2021-06-15T16:13:39Z)
- Visual Grounding of Learned Physical Models [66.04898704928517]
Humans intuitively recognize objects' physical properties and predict their motion, even when the objects are engaged in complicated interactions.
We present a neural model that simultaneously reasons about physics and makes future predictions based on visual and dynamics priors.
Experiments show that our model can infer the physical properties within a few observations, which allows the model to quickly adapt to unseen scenarios and make accurate predictions into the future.
arXiv Detail & Related papers (2020-04-28T17:06:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.