HAKE: A Knowledge Engine Foundation for Human Activity Understanding
- URL: http://arxiv.org/abs/2202.06851v2
- Date: Fri, 15 Sep 2023 08:00:19 GMT
- Title: HAKE: A Knowledge Engine Foundation for Human Activity Understanding
- Authors: Yong-Lu Li, Xinpeng Liu, Xiaoqian Wu, Yizhuo Li, Zuoyu Qiu, Liang Xu, Yue Xu, Hao-Shu Fang, Cewu Lu
- Abstract summary: Human activity understanding is of widespread interest in artificial intelligence and spans diverse applications like health care and behavior analysis.
We propose a novel paradigm to reformulate this task in two stages: first mapping pixels to an intermediate space spanned by atomic activity primitives, then programming detected primitives with interpretable logic rules to infer semantics.
Our framework, the Human Activity Knowledge Engine (HAKE), exhibits superior generalization ability and performance on challenging benchmarks.
- Score: 65.24064718649046
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human activity understanding is of widespread interest in artificial
intelligence and spans diverse applications like health care and behavior
analysis. Despite advances in deep learning, the task remains
challenging. Object-recognition-like solutions usually try to map pixels to
semantics directly, but activity patterns differ substantially from object
patterns, which hinders success. In this work, we propose a novel paradigm to
reformulate this task in two stages: first mapping pixels to an intermediate
space spanned by atomic activity primitives, then programming detected
primitives with interpretable logic rules to infer semantics. To afford a
representative primitive space, we build a knowledge base including more than
26M primitive labels and logic rules from human priors or automatic discovery.
Our framework, the Human Activity Knowledge Engine (HAKE), exhibits superior
generalization ability and performance over canonical methods on challenging
benchmarks. Code and data are available at http://hake-mvig.cn/.
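The two-stage paradigm described in the abstract can be illustrated with a toy sketch. This is not HAKE's actual code or API: the primitive names, the hard-coded detections, the rule table, and the min-based scoring are all assumptions chosen for illustration. Stage one is replaced by a stub that returns fixed primitive confidences; stage two combines primitives with simple conjunctive rules.

```python
from typing import Dict, List

# Hypothetical primitive detections: primitive name -> confidence in [0, 1].
Primitives = Dict[str, float]


def detect_primitives(image_id: str) -> Primitives:
    """Stage-one stand-in: return hard-coded primitive scores for toy images.

    A real system would run a vision model here; these values are made up.
    """
    toy_outputs: Dict[str, Primitives] = {
        "img_ride": {
            "hip_sit_on": 0.9,
            "hand_hold": 0.8,
            "object_bicycle": 0.95,
        },
        "img_drink": {
            "hand_hold": 0.85,
            "head_drink_with": 0.9,
            "object_cup": 0.9,
        },
    }
    return toy_outputs.get(image_id, {})


# Stage two: each (hypothetical) rule is a conjunction of primitives
# that together imply an activity.
RULES: Dict[str, List[str]] = {
    "ride_bicycle": ["hip_sit_on", "hand_hold", "object_bicycle"],
    "drink_with_cup": ["head_drink_with", "hand_hold", "object_cup"],
}


def infer_activities(
    primitives: Primitives,
    rules: Dict[str, List[str]] = RULES,
    threshold: float = 0.5,
) -> Dict[str, float]:
    """Score each activity as the minimum confidence of its required primitives.

    Taking the minimum acts as a fuzzy AND; a missing primitive counts as 0.0.
    Only activities scoring at or above the threshold are returned.
    """
    scores: Dict[str, float] = {}
    for activity, required in rules.items():
        score = min(primitives.get(p, 0.0) for p in required)
        if score >= threshold:
            scores[activity] = score
    return scores


if __name__ == "__main__":
    for image_id in ("img_ride", "img_drink"):
        print(image_id, infer_activities(detect_primitives(image_id)))
```

Because the rules operate on named primitives rather than on pixels, each inferred activity can be traced back to the primitives that triggered it, which is the interpretability property the abstract emphasizes.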
Related papers
- Adaptive Language-Guided Abstraction from Contrastive Explanations [53.48583372522492]
It is necessary to determine which features of the environment are relevant before determining how these features should be used to compute reward.
End-to-end methods for joint feature and reward learning often yield brittle reward functions that are sensitive to spurious state features.
This paper describes a method named ALGAE, which uses language models to iteratively identify human-meaningful features.
arXiv Detail & Related papers (2024-09-12T16:51:58Z)
- How To Not Train Your Dragon: Training-free Embodied Object Goal Navigation with Semantic Frontiers [94.46825166907831]
We present a training-free solution to tackle the object goal navigation problem in Embodied AI.
Our method builds a structured scene representation based on the classic visual simultaneous localization and mapping (V-SLAM) framework.
Our method propagates semantics on the scene graphs based on language priors and scene statistics to introduce semantic knowledge to the geometric frontiers.
arXiv Detail & Related papers (2023-05-26T13:38:33Z)
- CHARM: A Hierarchical Deep Learning Model for Classification of Complex Human Activities Using Motion Sensors [0.9594432031144714]
CHARM is a hierarchical deep learning model for classification of complex human activities using motion sensors.
It outperforms state-of-the-art supervised learning approaches for high-level activity recognition in terms of average accuracy and F1 scores.
The ability to learn low-level user activities when trained using only high-level activity labels may pave the way to semi-supervised learning of HAR tasks.
arXiv Detail & Related papers (2022-07-16T01:36:54Z)
- Combining Learning from Human Feedback and Knowledge Engineering to Solve Hierarchical Tasks in Minecraft [1.858151490268935]
We present the solution that won first place and was awarded the most human-like agent in the 2021 NeurIPS Competition MineRL BASALT Challenge: Learning from Human Feedback in Minecraft.
Our approach uses the available human demonstration data to train an imitation learning policy for navigation.
We compare this hybrid intelligence approach to both end-to-end machine learning and pure engineered solutions, which are then judged by human evaluators.
arXiv Detail & Related papers (2021-12-07T04:12:23Z)
- WenLan 2.0: Make AI Imagine via a Multimodal Foundation Model [74.4875156387271]
We develop a novel foundation model pre-trained with huge multimodal (visual and textual) data.
We show that state-of-the-art results can be obtained on a wide range of downstream tasks.
arXiv Detail & Related papers (2021-10-27T12:25:21Z)
- Augmenting Reinforcement Learning with Behavior Primitives for Diverse Manipulation Tasks [17.13584584844048]
This work introduces MAnipulation Primitive-augmented reinforcement LEarning (MAPLE), a learning framework that augments standard reinforcement learning algorithms with a pre-defined library of behavior primitives.
We develop a hierarchical policy that involves the primitives and instantiates their executions with input parameters.
We demonstrate that MAPLE outperforms baseline approaches by a significant margin on a suite of simulated manipulation tasks.
arXiv Detail & Related papers (2021-10-07T17:44:33Z)
- Simultaneous Multi-View Object Recognition and Grasping in Open-Ended Domains [0.0]
We propose a deep learning architecture with augmented memory capacities to handle open-ended object recognition and grasping simultaneously.
We demonstrate the ability of our approach to grasp never-seen-before objects and to rapidly learn new object categories using very few examples on-site in both simulation and real-world settings.
arXiv Detail & Related papers (2021-06-03T14:12:11Z)
- Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges [50.22269760171131]
The last decade has witnessed an experimental revolution in data science and machine learning, epitomised by deep learning methods.
This text is concerned with exposing pre-defined regularities through unified geometric principles.
It provides a common mathematical framework to study the most successful neural network architectures, such as CNNs, RNNs, GNNs, and Transformers.
arXiv Detail & Related papers (2021-04-27T21:09:51Z)
- Fast Concept Mapping: The Emergence of Human Abilities in Artificial Neural Networks when Learning Embodied and Self-Supervised [0.0]
We introduce a setup in which an artificial agent first learns in a simulated world through self-supervised exploration.
We use a method we call fast concept mapping which uses correlated firing patterns of neurons to define and detect semantic concepts.
arXiv Detail & Related papers (2021-02-03T17:19:49Z)
- Bongard-LOGO: A New Benchmark for Human-Level Concept Learning and Reasoning [78.13740873213223]
Bongard problems (BPs) were introduced as an inspirational challenge for visual cognition in intelligent systems.
We propose a new benchmark Bongard-LOGO for human-level concept learning and reasoning.
arXiv Detail & Related papers (2020-10-02T03:19:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.