Action Recognition with Deep Multiple Aggregation Networks
- URL: http://arxiv.org/abs/2006.04489v1
- Date: Mon, 8 Jun 2020 11:37:38 GMT
- Title: Action Recognition with Deep Multiple Aggregation Networks
- Authors: Ahmed Mazari and Hichem Sahbi
- Abstract summary: We introduce a novel hierarchical pooling design that captures different levels of temporal granularity in action recognition.
Our design principle is coarse-to-fine and achieved using a tree-structured network.
Besides being principled and well grounded, the proposed hierarchical pooling is also video-length and resolution agnostic.
- Score: 14.696233190562936
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most of the current action recognition algorithms are based on deep networks
which stack multiple convolutional, pooling and fully connected layers. While
convolutional and fully connected operations have been widely studied in the
literature, the design of pooling operations that handle action recognition,
with different sources of temporal granularity in action categories, has
comparatively received less attention, and existing solutions rely mainly on
max or averaging operations. The latter cannot fully capture the actual
temporal granularity of action categories and thereby constitute a bottleneck
in classification performance. In this paper, we introduce a novel
hierarchical pooling design that captures different levels of temporal
granularity in action recognition. Our design principle is coarse-to-fine and
achieved using a tree-structured network; as we traverse this network
top-down, pooling operations become less invariant but temporally more
resolute and better localized. Learning the combination of operations in this
network -- which best
fits a given ground-truth -- is obtained by solving a constrained minimization
problem whose solution corresponds to the distribution of weights that capture
the contribution of each level (and thereby temporal granularity) in the global
hierarchical pooling process. Besides being principled and well grounded, the
proposed hierarchical pooling is also video-length and resolution agnostic.
Extensive experiments conducted on the challenging UCF-101, HMDB-51 and
JHMDB-21 databases corroborate all these statements.
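The coarse-to-fine idea above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: here level l simply max-pools over 2^l equal temporal segments and averages the results, so deeper levels are less invariant but better localized, and a softmax parameterization stands in for the simplex constraint on the level weights. The names `hierarchical_pool` and `combine_levels` are hypothetical.

```python
import numpy as np

def hierarchical_pool(frames, depth=3):
    """Coarse-to-fine temporal pooling over a binary tree of segments.

    frames: (T, D) array of per-frame features.
    Level l splits the video into 2**l equal temporal segments,
    max-pools each segment, then averages across segments, giving
    one D-dimensional descriptor per level (coarse first).
    """
    levels = []
    for l in range(depth):
        segs = np.array_split(frames, 2 ** l, axis=0)      # 2**l segments
        pooled = np.stack([s.max(axis=0) for s in segs])   # (2**l, D)
        levels.append(pooled.mean(axis=0))                 # -> (D,)
    return np.stack(levels)                                # (depth, D)

def combine_levels(level_feats, logits):
    """Convex combination of level descriptors. The softmax keeps the
    weights nonnegative and summing to one, mimicking the constrained
    minimization over the weight distribution described above."""
    w = np.exp(logits - logits.max())
    w /= w.sum()
    return w @ level_feats                                 # (D,)
```

Note that splitting into segments rather than resampling frames is what makes the scheme video-length agnostic: any T works at every level.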
Related papers
- Deconstruct Complexity (DeComplex): A Novel Perspective on Tackling Dense Action Detection [23.100602876056165]
We introduce a novel perspective inspired by how humans tackle complex tasks by breaking them into manageable sub-tasks.
Instead of relying on a single network to address the entire problem, we propose decomposing the problem into detecting key concepts present in action classes.
Our experiments demonstrate the superiority of our approach over state-of-the-art methods.
arXiv Detail & Related papers (2025-01-30T17:20:42Z) - Joint Learning for Scattered Point Cloud Understanding with Hierarchical Self-Distillation [34.26170741722835]
We propose an end-to-end architecture that compensates for and identifies partial point clouds on the fly.
Hierarchical self-distillation (HSD) can be applied to arbitrary hierarchy-based point cloud methods.
arXiv Detail & Related papers (2023-12-28T08:51:04Z) - A Multi-objective Complex Network Pruning Framework Based on Divide-and-conquer and Global Performance Impairment Ranking [40.59001171151929]
A multi-objective complex network pruning framework based on divide-and-conquer and global performance impairment ranking is proposed in this paper.
The proposed algorithm achieves a comparable performance with the state-of-the-art pruning methods.
arXiv Detail & Related papers (2023-03-28T12:05:15Z) - SIRe-Networks: Skip Connections over Interlaced Multi-Task Learning and Residual Connections for Structure Preserving Object Classification [28.02302915971059]
In this paper, we introduce an interlaced multi-task learning strategy, defined SIRe, to reduce the vanishing gradient in relation to the object classification task.
The presented methodology directly improves a convolutional neural network (CNN) by enforcing the input image structure preservation through auto-encoders.
To validate the presented methodology, a simple CNN and various implementations of famous networks are extended via the SIRe strategy and extensively tested on the CIFAR100 dataset.
arXiv Detail & Related papers (2021-10-06T13:54:49Z) - Locality Matters: A Scalable Value Decomposition Approach for Cooperative Multi-Agent Reinforcement Learning [52.7873574425376]
Cooperative multi-agent reinforcement learning (MARL) faces significant scalability issues due to state and action spaces that are exponentially large in the number of agents.
We propose a novel, value-based multi-agent algorithm called LOMAQ, which incorporates local rewards in the Centralized Training, Decentralized Execution paradigm.
arXiv Detail & Related papers (2021-09-22T10:08:15Z) - Unsupervised Domain-adaptive Hash for Networks [81.49184987430333]
Domain-adaptive hash learning has enjoyed considerable success in the computer vision community.
We develop an unsupervised domain-adaptive hash learning method for networks, dubbed UDAH.
arXiv Detail & Related papers (2021-08-20T12:09:38Z) - Joint Learning of Neural Transfer and Architecture Adaptation for Image Recognition [77.95361323613147]
Current state-of-the-art visual recognition systems rely on pretraining a neural network on a large-scale dataset and finetuning the network weights on a smaller dataset.
In this work, we prove that dynamically adapting network architectures tailored for each domain task along with weight finetuning benefits in both efficiency and effectiveness.
Our method can be easily generalized to an unsupervised paradigm by replacing supernet training with self-supervised learning in the source domain tasks and performing linear evaluation in the downstream tasks.
arXiv Detail & Related papers (2021-03-31T08:15:17Z) - All at Once Network Quantization via Collaborative Knowledge Transfer [56.95849086170461]
We develop a novel collaborative knowledge transfer approach for efficiently training the all-at-once quantization network.
Specifically, we propose an adaptive selection strategy to choose a high-precision "teacher" for transferring knowledge to the low-precision student.
To effectively transfer knowledge, we develop a dynamic block swapping method by randomly replacing the blocks in the lower-precision student network with the corresponding blocks in the higher-precision teacher network.
arXiv Detail & Related papers (2021-03-02T03:09:03Z) - Recursive Multi-model Complementary Deep Fusion for Robust Salient Object Detection via Parallel Sub Networks [62.26677215668959]
Fully convolutional networks have shown outstanding performance in the salient object detection (SOD) field.
This paper proposes a "wider" network architecture which consists of parallel sub-networks with totally different network architectures.
Experiments on several well-known benchmarks clearly demonstrate the superior performance, good generalization, and powerful learning ability of the proposed wider framework.
arXiv Detail & Related papers (2020-08-07T10:39:11Z) - Deep hierarchical pooling design for cross-granularity action recognition [14.696233190562936]
We introduce a novel hierarchical aggregation design that captures different levels of temporal granularity in action recognition.
Learning the combination of operations in this network -- which best fits a given ground-truth -- is obtained by solving a constrained minimization problem.
Besides being principled and well grounded, the proposed hierarchical pooling is also video-length agnostic and resilient to misalignments in actions.
arXiv Detail & Related papers (2020-06-08T11:03:54Z) - Dense Residual Network: Enhancing Global Dense Feature Flow for Character Recognition [75.4027660840568]
This paper explores how to enhance the local and global dense feature flow by exploiting hierarchical features fully from all the convolution layers.
Technically, we propose an efficient and effective CNN framework, i.e., Fast Dense Residual Network (FDRN) for text recognition.
arXiv Detail & Related papers (2020-01-23T06:55:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.