Related papers: Remembering What Is Important: A Factorised Multi-Head Retrieval and Auxiliary Memory Stabilisation Scheme for Human Motion Prediction

URL: http://arxiv.org/abs/2305.11394v1
Date: Fri, 19 May 2023 02:44:58 GMT
Title: Remembering What Is Important: A Factorised Multi-Head Retrieval and Auxiliary Memory Stabilisation Scheme for Human Motion Prediction
Authors: Tharindu Fernando and Harshala Gammulle and Sridha Sridharan and Simon Denman and Clinton Fookes
Abstract summary: This paper presents an innovative auxiliary-memory-powered deep neural network framework for the improved modelling of historical knowledge. We disentangle subject-specific, task-specific, and other auxiliary information from the observed pose sequences and utilise these factorised features to query the memory. Two novel loss functions are introduced to encourage diversity within the auxiliary memory while ensuring the stability of the memory contents.
Score: 41.34294145237618
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Humans exhibit complex motions that vary depending on the task that they are performing, the interactions they engage in, as well as subject-specific preferences. Therefore, forecasting future poses based on the history of the previous motions is a challenging task. This paper presents an innovative auxiliary-memory-powered deep neural network framework for the improved modelling of historical knowledge. Specifically, we disentangle subject-specific, task-specific, and other auxiliary information from the observed pose sequences and utilise these factorised features to query the memory. A novel Multi-Head knowledge retrieval scheme leverages these factorised feature embeddings to perform multiple querying operations over the historical observations captured within the auxiliary memory. Moreover, our proposed dynamic masking strategy makes this feature disentanglement process dynamic. Two novel loss functions are introduced to encourage diversity within the auxiliary memory while ensuring the stability of the memory contents, such that it can locate and store salient information that can aid the long-term prediction of future motion, irrespective of data imbalances or the diversity of the input data distribution. With extensive experiments conducted on two public benchmarks, Human3.6M and CMU-Mocap, we demonstrate that these design choices collectively allow the proposed approach to outperform the current state-of-the-art methods by significant margins: >17% on the Human3.6M dataset and >9% on the CMU-Mocap dataset.
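Illustrative sketch (not the authors' implementation): the abstract describes querying an auxiliary memory several times, once per factorised feature. The snippet below shows one generic way such a multi-head read could look, using plain scaled dot-product attention. The function name `multi_head_retrieve`, the memory size, the embedding dimension, and the random query vectors are all assumptions made for illustration; the paper's dynamic masking strategy and its two loss functions are not modelled here.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def multi_head_retrieve(memory, queries):
    """Read from a shared memory once per factorised query.

    memory  : (slots, dim) array standing in for the auxiliary memory.
    queries : dict mapping a head name to a (dim,) query vector.
    Returns the concatenated read vectors and the per-head attention weights.
    """
    scale = np.sqrt(memory.shape[1])
    reads, weights = [], {}
    for name, q in queries.items():
        w = softmax(memory @ q / scale)   # similarity of the query to each slot
        weights[name] = w
        reads.append(w @ memory)          # attention-weighted sum of slots
    return np.concatenate(reads), weights


# Hypothetical sizes and random stand-ins for the learned embeddings.
rng = np.random.default_rng(0)
memory = rng.normal(size=(8, 4))
queries = {name: rng.normal(size=4) for name in ("subject", "task", "auxiliary")}

retrieved, attn = multi_head_retrieve(memory, queries)
print(retrieved.shape)  # (12,)
```

Each head attends to the same memory with a different query, so the subject, task, and auxiliary factors can each pull out different stored slots before the reads are combined.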

Related papers

FindingDory: A Benchmark to Evaluate Memory in Embodied Agents [49.89792845476579]
We introduce a new benchmark for long-range embodied tasks in the Habitat simulator. This benchmark evaluates memory-based capabilities across 60 tasks requiring sustained engagement and contextual awareness.
arXiv Detail & Related papers (2025-06-18T17:06:28Z)
ReReLRP - Remembering and Recognizing Tasks with LRP [9.317606100792846]
We present ReReLRP, a novel solution to catastrophic forgetting in deep neural networks. Our contribution provides increased privacy of existing replay-free methods while additionally offering built-in explainability. We validate our approach on a wide variety of datasets, demonstrating results comparable with a well-known replay-based method in selected scenarios.
arXiv Detail & Related papers (2025-02-15T13:03:59Z)
Remember and Recall: Associative-Memory-based Trajectory Prediction [25.349986959111757]
We propose the Fragmented-Memory-based Trajectory Prediction (FMTP) model, inspired by the remarkable learning capabilities of humans. The FMTP model employs discrete representations to enhance computational efficiency by reducing information redundancy. We develop an advanced reasoning engine based on language models to deeply learn the associative rules among these discrete representations.
arXiv Detail & Related papers (2024-10-03T04:32:21Z)
Unsupervised Representation Learning of Complex Time Series for Maneuverability State Identification in Smart Mobility [0.0]
In smart mobility, MTS plays a crucial role in providing temporal dynamics of behaviors such as maneuver patterns. In this work, we aim to address challenges associated with modeling MTS data collected from a vehicle using sensors. Our goal is to investigate the effectiveness of two distinct unsupervised representation learning approaches in identifying maneuvering states in smart mobility.
arXiv Detail & Related papers (2024-08-26T15:16:18Z)
DUEL: Duplicate Elimination on Active Memory for Self-Supervised Class-Imbalanced Learning [19.717868805172323]
We propose an active data filtering process during self-supervised pre-training in our novel framework, Duplicate Elimination (DUEL). This framework integrates an active memory inspired by human working memory and introduces distinctiveness information, which measures the diversity of the data in the memory. The DUEL policy, which replaces the most duplicated data with new samples, aims to enhance the distinctiveness information in the memory and thereby mitigate class imbalances.
arXiv Detail & Related papers (2024-02-14T06:09:36Z)
Dynamic Spatio-Temporal Summarization using Information Based Fusion [3.038642416291856]
We propose a dynamic spatio-temporal data summarization technique that identifies informative features in key timesteps and fuses less informative ones. Unlike existing methods, our method retains both raw and summarized timesteps, ensuring a comprehensive view of information changes over time. We demonstrate the versatility of our technique across diverse datasets, encompassing particle-based flow simulations, security and surveillance applications, and biological cell interactions within the immune system.
arXiv Detail & Related papers (2023-10-02T20:21:43Z)
Estimating Conditional Mutual Information for Dynamic Feature Selection [14.706269510726356]
Dynamic feature selection is a promising paradigm to reduce feature acquisition costs and provide transparency into a model's predictions. Here, we take an information-theoretic perspective and prioritize features based on their mutual information with the response variable. Our method provides consistent gains over recent methods across a variety of datasets.
arXiv Detail & Related papers (2023-06-05T23:03:03Z)
Motion-Scenario Decoupling for Rat-Aware Video Position Prediction: Strategy and Benchmark [49.58762201363483]
We introduce RatPose, a bio-robot motion prediction dataset constructed by considering the influence factors of individuals and environments. We propose a Dual-stream Motion-Scenario Decoupling framework that effectively separates scenario-oriented and motion-oriented features. We demonstrate significant performance improvements of the proposed DMSD framework on different difficulty-level tasks.
arXiv Detail & Related papers (2023-05-17T14:14:31Z)
VFDS: Variational Foresight Dynamic Selection in Bayesian Neural Networks for Efficient Human Activity Recognition [81.29900407096977]
Variational Foresight Dynamic Selection (VFDS) learns a policy that selects the next feature subset to observe. We apply VFDS on the Human Activity Recognition (HAR) task where the performance-cost trade-off is critical in its practice.
arXiv Detail & Related papers (2022-03-31T22:52:43Z)
Self-Attention Neural Bag-of-Features [103.70855797025689]
We build on the recently introduced 2D-Attention and reformulate the attention learning methodology. We propose a joint feature-temporal attention mechanism that learns a joint 2D attention mask highlighting relevant information.
arXiv Detail & Related papers (2022-01-26T17:54:14Z)
SyMetric: Measuring the Quality of Learnt Hamiltonian Dynamics Inferred from Vision [73.26414295633846]
A recently proposed class of models attempts to learn latent dynamics from high-dimensional observations. Existing methods rely on image reconstruction quality, which does not always reflect the quality of the learnt latent dynamics. We develop a set of new measures, including a binary indicator of whether the underlying Hamiltonian dynamics have been faithfully captured.
arXiv Detail & Related papers (2021-11-10T23:26:58Z)
Temporal Memory Relation Network for Workflow Recognition from Surgical Video [53.20825496640025]
We propose a novel end-to-end temporal memory relation network (TMNet) for relating long-range and multi-scale temporal patterns. We have extensively validated our approach on two benchmark surgical video datasets.
arXiv Detail & Related papers (2021-03-30T13:20:26Z)

This list is automatically generated from the titles and abstracts of the papers on this site.