Online Action Recognition
- URL: http://arxiv.org/abs/2012.07464v1
- Date: Mon, 14 Dec 2020 12:37:20 GMT
- Title: Online Action Recognition
- Authors: Alejandro Suárez-Hernández, Javier Segovia-Aguas, Carme Torras and Guillem Alenyà
- Abstract summary: Action Unification (AU) and Online Action Recognition through Unification (OARU) are proposed.
AU builds on logic unification and generalizes two input actions using weighted partial MaxSAT.
OARU recognizes actions accurately with respect to expert knowledge, and shows real-time performance.
- Score: 69.32402131983699
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recognition in planning seeks to find agent intentions, goals or activities
given a set of observations and a knowledge library (e.g. goal states, plans or
domain theories). In this work we introduce the problem of Online Action
Recognition. It consists of recognizing, in an open world, the planning action
that best explains a partially observable state transition from a knowledge
library of first-order STRIPS actions, which is initially empty. We frame this
as an optimization problem, and propose two algorithms to address it: Action
Unification (AU) and Online Action Recognition through Unification (OARU). The
former builds on logic unification and generalizes two input actions using
weighted partial MaxSAT. The latter looks for an action within the library that
explains an observed transition. If such an action exists, OARU generalizes it
using AU, thereby building an AU hierarchy. Otherwise, OARU inserts a Trivial
Grounded Action (TGA) into the library that explains just that
transition. We report results on benchmarks from the International Planning
Competition and PDDLGym, where OARU recognizes actions accurately with respect
to expert knowledge, and shows real-time performance.
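To make the recognition loop concrete, here is a minimal, self-contained Python sketch of OARU as the abstract describes it. It is not the authors' implementation: actions are propositional sets of literals rather than first-order STRIPS schemas, partial observability is reduced to subset checks, and `action_unification` stands in for AU by intersecting components instead of solving the paper's weighted partial MaxSAT problem. All identifiers and the toy literals are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    """Propositional stand-in for a first-order STRIPS action."""
    name: str
    pre: frozenset     # literals that must hold before the transition
    add: frozenset     # literals made true by the transition
    delete: frozenset  # literals made false by the transition

def explains(a: Action, before: frozenset, after: frozenset) -> bool:
    """Does `a` account for the observed (partially observable) transition?"""
    return a.pre <= before and a.add <= after and not (a.delete & after)

def trivial_grounded_action(i: int, before: frozenset, after: frozenset) -> Action:
    """TGA: an action that explains exactly this one transition."""
    return Action(f"tga-{i}", pre=before, add=after - before, delete=before - after)

def action_unification(a: Action, b: Action) -> Action:
    """Stand-in for AU: the paper finds a best generalization of two actions
    via weighted partial MaxSAT; this toy version just intersects components."""
    return Action(f"au({a.name},{b.name})",
                  pre=a.pre & b.pre, add=a.add & b.add, delete=a.delete & b.delete)

def oaru_step(library: list, i: int, before: frozenset, after: frozenset) -> None:
    """One OARU iteration: generalize a matching library action, or insert a TGA."""
    tga = trivial_grounded_action(i, before, after)
    for k, a in enumerate(library):
        if explains(a, before, after):
            library[k] = action_unification(a, tga)  # grows the AU hierarchy
            return
    library.append(tga)  # open world: nothing matches, so learn a new action

# Toy usage: two similar observed transitions collapse into one generalized action.
lib = []
oaru_step(lib, 0, frozenset({"at-a", "handempty"}), frozenset({"at-b", "handempty"}))
oaru_step(lib, 1, frozenset({"at-a", "handempty", "day"}),
          frozenset({"at-b", "handempty", "day"}))
print([a.name for a in lib])  # ['au(tga-0,tga-1)']
```

In the first-order setting, deciding how predicates and parameters of the two input actions should be matched is combinatorial, which is presumably what the weighted partial MaxSAT encoding of AU resolves; the naive intersection above sidesteps that entirely.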
Related papers
- FinePseudo: Improving Pseudo-Labelling through Temporal-Alignablity for Semi-Supervised Fine-Grained Action Recognition [57.17966905865054]
Real-life applications of action recognition often require a fine-grained understanding of subtle movements.
Existing semi-supervised approaches have mainly focused on coarse-grained action recognition.
We propose an Alignability-Verification-based Metric learning technique to effectively discriminate between fine-grained action pairs.
arXiv Detail & Related papers (2024-09-02T20:08:06Z)
- ActionSwitch: Class-agnostic Detection of Simultaneous Actions in Streaming Videos [35.371453530275666]
ActionSwitch is the first class-agnostic On-TAL framework capable of detecting overlapping actions.
By obviating the reliance on class information, ActionSwitch provides wider applicability to various situations.
arXiv Detail & Related papers (2024-07-17T20:07:05Z)
- Continual Generalized Intent Discovery: Marching Towards Dynamic and Open-world Intent Recognition [25.811639218862958]
Generalized Intent Discovery (GID) only considers one stage of OOD learning, and needs to utilize the data in all previous stages for joint training.
Continual Generalized Intent Discovery (CGID) aims to continuously and automatically discover OOD intents from dynamic OOD data streams.
PLRD bootstraps new intent discovery through class prototypes and balances new and old intents through data replay and feature distillation.
arXiv Detail & Related papers (2023-10-16T08:48:07Z)
- Action Sensitivity Learning for Temporal Action Localization [35.65086250175736]
We propose an Action Sensitivity Learning framework (ASL) to tackle the task of temporal action localization.
We first introduce a lightweight Action Sensitivity Evaluator to learn the action sensitivity at the class level and instance level, respectively.
Based on the action sensitivity of each frame, we design an Action Sensitive Contrastive Loss to enhance features, where action-aware frames are sampled as positive pairs to push away the action-irrelevant frames (a sketch of this idea follows this list).
arXiv Detail & Related papers (2023-05-25T04:19:14Z)
- DOAD: Decoupled One Stage Action Detection Network [77.14883592642782]
Localizing people and recognizing their actions from videos is a challenging task towards high-level video understanding.
Existing methods are mostly two-stage based, with one stage for person bounding box generation and the other stage for action recognition.
We present a decoupled one-stage network, dubbed DOAD, to improve the efficiency of spatio-temporal action detection.
arXiv Detail & Related papers (2023-04-01T08:06:43Z)
- Actor-identified Spatiotemporal Action Detection -- Detecting Who Is Doing What in Videos [29.5205455437899]
Temporal Action Detection (TAD) has been investigated for estimating the start and end time for each action in videos.
Spatiotemporal Action Detection (SAD) has been studied for localizing the action both spatially and temporally in videos.
We propose a novel task, Actor-identified Spatiotemporal Action Detection (ASAD), to bridge the gap between SAD and actor identification.
arXiv Detail & Related papers (2022-08-27T06:51:12Z)
- Graph Convolutional Module for Temporal Action Localization in Videos [142.5947904572949]
We claim that the relations between action units play an important role in action localization.
A more powerful action detector should not only capture the local content of each action unit but also allow a wider field of view on the context related to it.
We propose a general graph convolutional module (GCM) that can be easily plugged into existing action localization methods.
arXiv Detail & Related papers (2021-12-01T06:36:59Z)
- Weakly Supervised Temporal Action Localization Through Learning Explicit Subspaces for Action and Context [151.23835595907596]
Weakly supervised temporal action localization (WS-TAL) methods learn to localize temporal starts and ends of action instances in a video under only video-level supervision.
We introduce a framework that learns two feature subspaces respectively for actions and their context.
The proposed approach outperforms state-of-the-art WS-TAL methods on three benchmarks.
arXiv Detail & Related papers (2021-03-30T08:26:53Z)
- FineGym: A Hierarchical Video Dataset for Fine-grained Action Understanding [118.32912239230272]
FineGym is a new action recognition dataset built on top of gymnastic videos.
It provides temporal annotations at both action and sub-action levels with a three-level semantic hierarchy.
This new level of granularity presents significant challenges for action recognition.
arXiv Detail & Related papers (2020-04-14T17:55:21Z)
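The Action Sensitivity Learning entry above describes its contrastive loss only at a high level. Purely as an illustration, here is a minimal PyTorch sketch of one plausible reading: frames with the highest sensitivity scores act as positives, the lowest as negatives, in an InfoNCE-style objective. The function name, the prototype anchor, and all hyperparameters are assumptions, not the paper's actual formulation.

```python
import torch
import torch.nn.functional as F

def action_sensitive_contrastive_loss(feats, sensitivity, k=8, tau=0.1):
    """Hypothetical reading of an 'action sensitive' contrastive loss.

    feats:        (T, D) per-frame features of one video (assumes T >= 2*k)
    sensitivity:  (T,)   per-frame action-sensitivity scores
    Frames with the highest scores are treated as action-aware positives,
    those with the lowest as action-irrelevant negatives.
    """
    feats = F.normalize(feats, dim=-1)
    pos_idx = sensitivity.topk(k).indices           # action-aware frames
    neg_idx = (-sensitivity).topk(k).indices        # action-irrelevant frames
    anchor = feats[pos_idx].mean(dim=0, keepdim=True)   # (1, D) positive prototype
    pos_sim = feats[pos_idx] @ anchor.T / tau       # (k, 1) positive similarities
    neg_sim = feats[neg_idx] @ anchor.T / tau       # (k, 1) negative similarities
    # InfoNCE-style: each positive scored against the pool of negatives
    logits = torch.cat([pos_sim, neg_sim.T.expand(k, k)], dim=1)  # (k, 1+k)
    targets = torch.zeros(k, dtype=torch.long)      # the positive sits at index 0
    return F.cross_entropy(logits, targets)

# Toy usage with random features and scores.
feats = torch.randn(64, 128)   # 64 frames, 128-dim features
scores = torch.rand(64)        # hypothetical per-frame sensitivity scores
print(action_sensitive_contrastive_loss(feats, scores))
```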
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.