ESPRIT: Explaining Solutions to Physical Reasoning Tasks
- URL: http://arxiv.org/abs/2005.00730v2
- Date: Thu, 14 May 2020 00:24:13 GMT
- Title: ESPRIT: Explaining Solutions to Physical Reasoning Tasks
- Authors: Nazneen Fatema Rajani, Rui Zhang, Yi Chern Tan, Stephan Zheng, Jeremy
Weiss, Aadit Vyas, Abhijit Gupta, Caiming Xiong, Richard Socher, Dragomir
Radev
- Abstract summary: ESPRIT is a framework for commonsense reasoning about qualitative physics in natural language.
Our framework learns to generate explanations of how the physical simulation will causally evolve so that an agent or a human can easily reason about a solution.
Human evaluations indicate that ESPRIT produces crucial fine-grained details and has high coverage of physical concepts compared to even human annotations.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural networks lack the ability to reason about qualitative physics and so
cannot generalize to scenarios and tasks unseen during training. We propose
ESPRIT, a framework for commonsense reasoning about qualitative physics in
natural language that generates interpretable descriptions of physical events.
We use a two-step approach of first identifying the pivotal physical events in
an environment and then generating natural language descriptions of those
events using a data-to-text approach. Our framework learns to generate
explanations of how the physical simulation will causally evolve so that an
agent or a human can easily reason about a solution using those interpretable
descriptions. Human evaluations indicate that ESPRIT produces crucial
fine-grained details and has high coverage of physical concepts compared to
even human annotations. Dataset, code and documentation are available at
https://github.com/salesforce/esprit.
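The abstract describes a two-step pipeline: first identify the pivotal physical events in a simulation, then generate natural-language descriptions of them with a data-to-text model. A minimal sketch of that flow, with purely illustrative names and a toy template in place of ESPRIT's learned generation models (none of this is from the ESPRIT codebase):

```python
# Hypothetical sketch of ESPRIT's two-step approach. The trace format,
# function names, and template below are illustrative assumptions.

def identify_pivotal_events(trace):
    """Step 1: keep only frames where an object pair first collides.

    `trace` is a list of dicts like {"t": 3, "collision": ("ball", "ramp")}.
    """
    seen = set()
    events = []
    for frame in trace:
        pair = frame.get("collision")
        if pair and pair not in seen:
            seen.add(pair)
            events.append(frame)
    return events

def describe_events(events):
    """Step 2: data-to-text generation; ESPRIT itself learns this mapping,
    here a fixed template stands in for the model."""
    return [
        f"At t={e['t']}, the {e['collision'][0]} hits the {e['collision'][1]}."
        for e in events
    ]

trace = [
    {"t": 1, "collision": None},
    {"t": 3, "collision": ("ball", "ramp")},
    {"t": 4, "collision": ("ball", "ramp")},  # repeat collision, filtered out
    {"t": 7, "collision": ("ball", "goal")},
]
print(describe_events(identify_pivotal_events(trace)))
# ['At t=3, the ball hits the ramp.', 'At t=7, the ball hits the goal.']
```

The point of the decomposition is that step 1 compresses a long simulation into a few salient events, so the generator in step 2 only has to verbalize a short, structured record rather than the raw physics trace.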
Related papers
- Physical Property Understanding from Language-Embedded Feature Fields
We present a novel approach for dense prediction of the physical properties of objects using a collection of images.
Inspired by how humans reason about physics through vision, we leverage large language models to propose candidate materials for each object.
Our method is accurate, annotation-free, and applicable to any object in the open world.
arXiv Detail & Related papers (2024-04-05T17:45:07Z)
- EventGround: Narrative Reasoning by Grounding to Eventuality-centric Knowledge Graphs
We propose an initial comprehensive framework called EventGround to tackle the problem of grounding free-texts to eventuality-centric knowledge graphs.
We provide simple yet effective parsing and partial information extraction methods to tackle these problems.
Our framework, incorporating grounded knowledge, achieves state-of-the-art performance while providing interpretable evidence.
arXiv Detail & Related papers (2024-03-30T01:16:37Z)
- PhyGrasp: Generalizing Robotic Grasping with Physics-informed Large Multimodal Models
Humans can easily apply their intuitive physics to grasp skillfully and change grasps efficiently, even for objects they have never seen before.
This work delves into infusing such physical commonsense reasoning into robotic manipulation.
We introduce PhyGrasp, a multimodal large model that leverages inputs from two modalities: natural language and 3D point clouds.
arXiv Detail & Related papers (2024-02-26T18:57:52Z)
- Interpreting Pretrained Language Models via Concept Bottlenecks
Pretrained language models (PLMs) have made significant strides in various natural language processing tasks.
The lack of interpretability due to their "black-box" nature poses challenges for responsible implementation.
We propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans.
arXiv Detail & Related papers (2023-11-08T20:41:18Z)
- CLEVRER-Humans: Describing Physical and Causal Events the Human Way
We present the CLEVRER-Humans benchmark, a video dataset for causal judgment of physical events with human labels.
We employ two techniques to improve data collection efficiency: first, a novel iterative event cloze task to elicit a new representation of events in videos, which we term Causal Event Graphs (CEGs); second, a data augmentation technique based on neural language generative models.
arXiv Detail & Related papers (2023-10-05T16:09:48Z)
- Imagination-Augmented Natural Language Understanding
We introduce an Imagination-Augmented Cross-modal Encoder (iACE) to solve natural language understanding tasks.
iACE enables visual imagination with external knowledge transferred from the powerful generative and pre-trained vision-and-language models.
Experiments on GLUE and SWAG show that iACE achieves consistent improvement over visually-supervised pre-trained models.
arXiv Detail & Related papers (2022-04-18T19:39:36Z)
- Physion: Evaluating Physical Prediction from Vision in Humans and Machines
We present a visual and physical prediction benchmark that precisely measures this capability.
We compare an array of algorithms on their ability to make diverse physical predictions.
We find that graph neural networks with access to the physical state best capture human behavior.
arXiv Detail & Related papers (2021-06-15T16:13:39Z)
- Natural Language Rationales with Full-Stack Visual Reasoning: From Pixels to Semantic Frames to Commonsense Graphs
We present the first study focused on generating natural language rationales across several complex visual reasoning tasks.
We present RationaleVT Transformer, an integrated model that learns to generate free-text rationales by combining pretrained language models with object recognition, grounded visual semantic frames, and visual commonsense graphs.
Our experiments show that the base pretrained language model benefits from visual adaptation and that free-text rationalization is a promising research direction to complement model interpretability for complex visual-textual reasoning tasks.
arXiv Detail & Related papers (2020-10-15T05:08:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.