Learning to Abstract and Predict Human Actions
- URL: http://arxiv.org/abs/2008.09234v1
- Date: Thu, 20 Aug 2020 23:57:58 GMT
- Title: Learning to Abstract and Predict Human Actions
- Authors: Romero Morais, Vuong Le, Truyen Tran, Svetha Venkatesh
- Abstract summary: We model the hierarchical structure of human activities in videos and demonstrate the power of such structure in action prediction.
We propose Hierarchical Encoder-Refresher-Anticipator, a multi-level neural machine that can learn the structure of human activities by observing a partial hierarchy of events and roll out such structure into a future prediction at multiple levels of abstraction.
- Score: 60.85905430007731
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human activities are naturally structured as hierarchies unrolled over time.
For action prediction, temporal relations in event sequences are widely
exploited by current methods while their semantic coherence across different
levels of abstraction has not been well explored. In this work we model the
hierarchical structure of human activities in videos and demonstrate the power
of such structure in action prediction. We propose Hierarchical
Encoder-Refresher-Anticipator, a multi-level neural machine that can learn the
structure of human activities by observing a partial hierarchy of events and
roll out such structure into a future prediction in multiple levels of
abstraction. We also introduce a new coarse-to-fine action annotation on the
Breakfast Actions videos to create a comprehensive, consistent, and cleanly
structured video hierarchical activity dataset. Through our experiments, we
examine and rethink the settings and metrics of activity prediction tasks
toward unbiased evaluation of prediction systems, and demonstrate the role of
hierarchical modeling toward reliable and detailed long-term action
forecasting.
Related papers
- A Multi-Branched Radial Basis Network Approach to Predicting Complex Chaotic Behaviours [0.0]
We propose a multi-branched network approach to predict the dynamics of a physics attractor characterized by intricate and chaotic behavior.
Our results demonstrate successful prediction of the attractor's trajectory across 100 predictions made using a real-world dataset of 36,700 time-series observations.
arXiv Detail & Related papers (2024-03-31T09:10:32Z)
- Skeleton2vec: A Self-supervised Learning Framework with Contextualized Target Representations for Skeleton Sequence [56.092059713922744]
We show that using high-level contextualized features as prediction targets can achieve superior performance.
Specifically, we propose Skeleton2vec, a simple and efficient self-supervised 3D action representation learning framework.
Our proposed Skeleton2vec outperforms previous methods and achieves state-of-the-art results.
arXiv Detail & Related papers (2024-01-01T12:08:35Z)
- Motion-Scenario Decoupling for Rat-Aware Video Position Prediction: Strategy and Benchmark [49.58762201363483]
We introduce RatPose, a bio-robot motion prediction dataset constructed by considering the influence factors of individuals and environments.
We propose a Dual-stream Motion-Scenario Decoupling framework that effectively separates scenario-oriented and motion-oriented features.
We demonstrate significant performance improvements of the proposed DMSD framework on tasks of different difficulty levels.
arXiv Detail & Related papers (2023-05-17T14:14:31Z)
- Towards Out-of-Distribution Sequential Event Prediction: A Causal Treatment [72.50906475214457]
The goal of sequential event prediction is to estimate the next event based on a sequence of historical events.
In practice, the next-event prediction models are trained with sequential data collected at one time.
We propose a framework with hierarchical branching structures for learning context-specific representations.
arXiv Detail & Related papers (2022-10-24T07:54:13Z)
- Hierarchically Self-Supervised Transformer for Human Skeleton Representation Learning [45.13060970066485]
We propose a self-supervised hierarchical pre-training scheme incorporated into a hierarchical Transformer-based skeleton sequence encoder (Hi-TRS).
Under both supervised and semi-supervised evaluation protocols, our method achieves state-of-the-art performance.
arXiv Detail & Related papers (2022-07-20T04:21:05Z)
- Developing hierarchical anticipations via neural network-based event segmentation [14.059479351946386]
We model the development of hierarchical predictions via autonomously learned latent event codes.
We present a hierarchical recurrent neural network architecture, whose inductive learning biases foster the development of sparsely changing latent state.
A higher level network learns to predict the situations in which the latent states tend to change.
arXiv Detail & Related papers (2022-06-04T18:54:31Z)
- Self-Supervision by Prediction for Object Discovery in Videos [62.87145010885044]
In this paper, we use the prediction task as self-supervision and build a novel object-centric model for image sequence representation.
Our framework can be trained without the help of any manual annotation or pretrained network.
Initial experiments confirm that the proposed pipeline is a promising step towards object-centric video prediction.
arXiv Detail & Related papers (2021-03-09T19:14:33Z)
- Machine-Generated Hierarchical Structure of Human Activities to Reveal How Machines Think [0.0]
We argue the importance and feasibility of constructing a hierarchical labeling system for human activity recognition.
We utilize the predictions of a black box HAR model to identify similarities between different activities.
In this system, the activity labels on the same level will have a designed magnitude of accuracy and reflect a specific amount of activity details.
arXiv Detail & Related papers (2021-01-19T20:40:22Z)
- Learning intuitive physics and one-shot imitation using state-action-prediction self-organizing maps [0.0]
Humans learn by exploration and imitation, build causal models of the world, and use both to flexibly solve new tasks.
We suggest a simple but effective unsupervised model which develops such characteristics.
We demonstrate its performance on a set of several related, but different one-shot imitation tasks, which the agent flexibly solves in an active inference style.
arXiv Detail & Related papers (2020-07-03T12:29:11Z)
- Inferring Temporal Compositions of Actions Using Probabilistic Automata [61.09176771931052]
We propose to express temporal compositions of actions as semantic regular expressions and derive an inference framework using probabilistic automata.
Our approach is different from existing works that either predict long-range complex activities as unordered sets of atomic actions, or retrieve videos using natural language sentences.
arXiv Detail & Related papers (2020-04-28T00:15:26Z)