Dark, Beyond Deep: A Paradigm Shift to Cognitive AI with Humanlike
Common Sense
- URL: http://arxiv.org/abs/2004.09044v1
- Date: Mon, 20 Apr 2020 04:07:28 GMT
- Title: Dark, Beyond Deep: A Paradigm Shift to Cognitive AI with Humanlike
Common Sense
- Authors: Yixin Zhu, Tao Gao, Lifeng Fan, Siyuan Huang, Mark Edmonds, Hangxin
Liu, Feng Gao, Chi Zhang, Siyuan Qi, Ying Nian Wu, Joshua B. Tenenbaum,
Song-Chun Zhu
- Abstract summary: We argue that the next generation of AI must embrace "dark" humanlike common sense for solving novel tasks.
We identify functionality, physics, intent, causality, and utility (FPICU) as the five core domains of cognitive AI with humanlike common sense.
- Score: 142.53911271465344
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent progress in deep learning is essentially based on a "big data for
small tasks" paradigm, under which massive amounts of data are used to train a
classifier for a single narrow task. In this paper, we call for a shift that
flips this paradigm upside down. Specifically, we propose a "small data for big
tasks" paradigm, wherein a single artificial intelligence (AI) system is
challenged to develop "common sense", enabling it to solve a wide range of
tasks with little training data. We illustrate the potential power of this new
paradigm by reviewing models of common sense that synthesize recent
breakthroughs in both machine and human vision. We identify functionality,
physics, intent, causality, and utility (FPICU) as the five core domains of
cognitive AI with humanlike common sense. When taken as a unified concept,
FPICU is concerned with the questions of "why" and "how", beyond the dominant
"what" and "where" framework for understanding vision. They are invisible in
terms of pixels but nevertheless drive the creation, maintenance, and
development of visual scenes. We therefore coin them the "dark matter" of
vision. Just as our universe cannot be understood by merely studying observable
matter, we argue that vision cannot be understood without studying FPICU. We
demonstrate the power of this perspective to develop cognitive AI systems with
humanlike common sense by showing how to observe and apply FPICU with little
training data to solve a wide range of challenging tasks, including tool use,
planning, utility inference, and social learning. In summary, we argue that the
next generation of AI must embrace "dark" humanlike common sense for solving
novel tasks.
Related papers
- Vision Language Models See What You Want but not What You See [9.268588981925234]
Knowing others' intentions and taking others' perspectives are two core components of human intelligence.
In this paper, we investigate intentionality understanding and perspective-taking in Vision Language Models.
Surprisingly, we find that VLMs achieve high performance on intentionality understanding but lower performance on perspective-taking.
arXiv Detail & Related papers (2024-10-01T01:52:01Z)
- Artificial General Intelligence (AGI)-Native Wireless Systems: A Journey Beyond 6G [58.440115433585824]
Building future wireless systems that support services like digital twins (DTs) is difficult to achieve through advances to conventional technologies such as meta-surfaces alone.
While artificial intelligence (AI)-native networks promise to overcome some limitations of wireless technologies, developments still rely on AI tools like neural networks.
This paper revisits the concept of AI-native wireless systems, equipping them with the common sense necessary to transform them into artificial general intelligence (AGI)-native systems.
arXiv Detail & Related papers (2024-04-29T04:51:05Z)
- Visual Knowledge in the Big Model Era: Retrospect and Prospect [63.282425615863]
Visual knowledge is a new form of knowledge representation that can encapsulate visual concepts and their relations in a succinct, comprehensive, and interpretable manner.
As knowledge about the visual world has been identified as an indispensable component of human cognition and intelligence, visual knowledge is poised to play a pivotal role in establishing machine intelligence.
arXiv Detail & Related papers (2024-04-05T07:31:24Z)
- Human-oriented Representation Learning for Robotic Manipulation [64.59499047836637]
Humans inherently possess generalizable visual representations that empower them to efficiently explore and interact with the environments in manipulation tasks.
We formalize this idea through the lens of human-oriented multi-task fine-tuning on top of pre-trained visual encoders.
Our Task Fusion Decoder consistently improves the representations of three state-of-the-art visual encoders for downstream manipulation policy learning.
arXiv Detail & Related papers (2023-10-04T17:59:38Z) - A Review on Objective-Driven Artificial Intelligence [0.0]
Humans have an innate ability to understand context, nuances, and subtle cues in communication.
Humans possess a vast repository of common-sense knowledge that helps us make logical inferences and predictions about the world.
Machines lack this innate understanding and often struggle with making sense of situations that humans find trivial.
arXiv Detail & Related papers (2023-08-20T02:07:42Z) - Reflective Artificial Intelligence [2.7412662946127755]
Many important qualities that a human mind would have previously brought to the activity are utterly absent in AI.
One core feature that humans bring to tasks is reflection.
Yet this capability is utterly missing from current mainstream AI.
In this paper we ask what reflective AI might look like.
arXiv Detail & Related papers (2023-01-25T20:50:26Z) - Learning Perceptual Concepts by Bootstrapping from Human Queries [41.07749131023931]
We propose a new approach whereby the robot learns a low-dimensional variant of the concept and uses it to generate a larger data set for learning the concept in the high-dimensional space.
This lets it take advantage of semantically meaningful privileged information only accessible at training time, like object poses and bounding boxes, that allows for richer human interaction to speed up learning.
arXiv Detail & Related papers (2021-11-09T16:43:46Z) - WenLan 2.0: Make AI Imagine via a Multimodal Foundation Model [74.4875156387271]
We develop a novel foundation model pre-trained with huge multimodal (visual and textual) data.
We show that state-of-the-art results can be obtained on a wide range of downstream tasks.
arXiv Detail & Related papers (2021-10-27T12:25:21Z) - Bongard-LOGO: A New Benchmark for Human-Level Concept Learning and
Reasoning [78.13740873213223]
Bongard problems (BPs) were introduced as an inspirational challenge for visual cognition in intelligent systems.
We propose a new benchmark Bongard-LOGO for human-level concept learning and reasoning.
arXiv Detail & Related papers (2020-10-02T03:19:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.