Can Language Models Understand Physical Concepts?
- URL: http://arxiv.org/abs/2305.14057v1
- Date: Tue, 23 May 2023 13:36:55 GMT
- Title: Can Language Models Understand Physical Concepts?
- Authors: Lei Li, Jingjing Xu, Qingxiu Dong, Ce Zheng, Qi Liu, Lingpeng Kong, Xu Sun
- Abstract summary: Language models are gradually becoming general-purpose interfaces in the interactive and embodied world.
It is not yet clear whether LMs can understand physical concepts in the human world.
- Score: 45.30953251294797
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Language models (LMs) are gradually becoming general-purpose interfaces in the interactive and embodied world, where the understanding of physical concepts is an essential prerequisite. However, it is not yet clear whether LMs can understand physical concepts in the human world. To investigate this, we design a benchmark, VEC, that covers the tasks of (i) Visual concepts, such as the shape and material of objects, and (ii) Embodied Concepts, learned from interaction with the world, such as the temperature of objects. Our zero (few)-shot prompting results show that understanding of certain visual concepts emerges as LMs scale up, but there are still basic concepts to which the scaling law does not apply. For example, OPT-175B performs close to humans with a zero-shot accuracy of 85% on the material concept, yet behaves like random guessing on the mass concept. In contrast, vision-augmented LMs such as CLIP and BLIP achieve a human-level understanding of embodied concepts. Analysis indicates that the rich semantics in visual representations can serve as a valuable source of embodied knowledge. Inspired by this, we propose a distillation method to transfer embodied knowledge from VLMs to LMs, achieving a performance gain comparable to that obtained by scaling up LM parameters 134x. Our dataset is available at https://github.com/TobiasLee/VEC
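The abstract describes probing LMs with zero-/few-shot prompts and reporting per-concept accuracy (e.g. material vs. mass). As illustrative context, the sketch below scores a multiple-choice concept probe with a causal LM by comparing the log-likelihood of each candidate completion; the prompt wording, label set, and model name are assumptions for this example, not the benchmark's actual templates.

```python
# A minimal sketch of zero-shot concept probing with a causal LM, assuming a
# multiple-choice format scored by option log-likelihood. Prompt, labels, and
# model are illustrative stand-ins, not the exact VEC setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # stand-in for larger LMs such as OPT
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def option_logprob(prompt: str, option: str) -> float:
    """Sum of log-probabilities of the option tokens given the prompt."""
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    full_ids = tokenizer(prompt + option, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    # logits[:, t] predicts token t+1, so shift targets by one position.
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
    targets = full_ids[:, 1:]
    token_lp = log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    # Score only the tokens belonging to the option continuation.
    n_prompt = prompt_ids.shape[1]
    return token_lp[:, n_prompt - 1:].sum().item()

# Hypothetical material-concept probe; leading spaces keep BPE tokenization clean.
prompt = "Question: what is a wine glass usually made of? Answer:"
candidates = [" glass", " wood", " steel", " rubber"]
scores = {c: option_logprob(prompt, c) for c in candidates}
print(max(scores, key=scores.get))
```

Under this kind of protocol, the option with the highest summed log-probability is taken as the model's zero-shot prediction, and accuracy is computed over many such probes per concept.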
Related papers
- Concept Induction using LLMs: a user experiment for assessment [1.1982127665424676]
This study explores the potential of a Large Language Model (LLM) to generate high-level concepts that are meaningful as explanations for humans.
We compare the concepts generated by the LLM with two other methods: concepts generated by humans and the ECII concept induction system.
Our findings indicate that while human-generated explanations remain superior, concepts derived from GPT-4 are more comprehensible to humans compared to those generated by ECII.
arXiv Detail & Related papers (2024-04-18T03:22:02Z)
- Interpreting Pretrained Language Models via Concept Bottlenecks [55.47515772358389]
Pretrained language models (PLMs) have made significant strides in various natural language processing tasks.
The lack of interpretability due to their "black-box" nature poses challenges for responsible implementation.
We propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans.
arXiv Detail & Related papers (2023-11-08T20:41:18Z)
- Concept-Oriented Deep Learning with Large Language Models [0.4548998901594072]
Large Language Models (LLMs) have been successfully used in many natural-language tasks and applications including text generation and AI chatbots.
They are also a promising new technology for concept-oriented deep learning (CODL).
We discuss conceptual understanding in visual-language LLMs, the most important multimodal LLMs, and their major uses for CODL, including concept extraction from images, concept graph extraction from images, and concept learning.
arXiv Detail & Related papers (2023-06-29T16:47:11Z)
- Embodied Concept Learner: Self-supervised Learning of Concepts and Mapping through Instruction Following [101.55727845195969]
We propose the Embodied Concept Learner (ECL) in an interactive 3D environment.
A robot agent can ground visual concepts, build semantic maps and plan actions to complete tasks.
ECL is fully transparent and step-by-step interpretable in long-term planning.
arXiv Detail & Related papers (2023-04-07T17:59:34Z)
- Intrinsic Physical Concepts Discovery with Object-Centric Predictive Models [86.25460882547581]
We introduce the PHYsical Concepts Inference NEtwork (PHYCINE), a system that infers physical concepts at different levels of abstraction without supervision.
We show that object representations containing the discovered physical concept variables can help achieve better performance in causal reasoning tasks.
arXiv Detail & Related papers (2023-03-03T11:52:21Z)
- On Binding Objects to Symbols: Learning Physical Concepts to Understand Real from Fake [155.6741526791004]
We revisit the classic signal-to-symbol barrier in light of the remarkable ability of deep neural networks to generate synthetic data.
We characterize physical objects as abstract concepts and use the previous analysis to show that physical objects can be encoded by finite architectures.
We conclude that binding physical entities to digital identities is possible in finite time with finite resources.
arXiv Detail & Related papers (2022-07-25T17:21:59Z)
- Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language [92.7638697243969]
We propose a unified framework that can jointly learn visual concepts and infer physics models of objects from videos and language.
This is achieved by seamlessly integrating three components: a visual perception module, a concept learner, and a differentiable physics engine.
arXiv Detail & Related papers (2021-10-28T17:59:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.