Related papers: Indoor Group Activity Recognition using Multi-Layered HMMs

URL: http://arxiv.org/abs/2101.10857v1
Date: Sat, 23 Jan 2021 22:02:12 GMT
Title: Indoor Group Activity Recognition using Multi-Layered HMMs
Authors: Vinayak Elangovan
Abstract summary: Group Activities (GA) based on imagery data processing have significant applications in surveillance systems. We propose an ontology-based GAR with a proper inference model that is capable of identifying and classifying a sequence of events in group activities. A multi-layered Hidden Markov Model (HMM) is proposed to recognize different levels of abstract observations.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Discovery and recognition of Group Activities (GA) based on imagery data processing have significant applications in persistent surveillance systems, which play an important role in some Internet services. The process involves the analysis of sequential imagery data with spatiotemporal associations. Discretion of video imagery requires a proper inference system capable of discriminating and differentiating cohesive observations and interlinking them to known ontologies. We propose an ontology-based GAR with a proper inference model that is capable of identifying and classifying a sequence of events in group activities. A multi-layered Hidden Markov Model (HMM) is proposed to recognize different levels of abstract GA. The multi-layered HMM consists of N layers of HMMs, where each layer comprises M HMMs running in parallel. The number of layers depends on the order of information to be extracted. At each layer, by matching and correlating attributes of detected group events, the model attempts to associate sensory observations to known ontology perceptions. This paper demonstrates and compares the performance of three different implementations of HMM, namely, concatenated N-HMM, cascaded C-HMM and hybrid H-HMM, for building an effective multi-layered HMM.
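The layered structure described in the abstract (N layers, each with M HMMs running in parallel) can be illustrated with a small sketch. This is not the paper's N-HMM, C-HMM, or H-HMM; it assumes one simple stacking rule, in which every HMM in a layer scores a window of discrete symbols with the forward algorithm and the index of the best-scoring HMM becomes a symbol for the next layer. All function names, model parameters, and toy observations below are hypothetical.

```python
import math

def forward_log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: returns log P(obs | HMM) for discrete symbols."""
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transition matrix, then emit obs[t].
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][obs[t]]
                for s in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_lik

def layer_decode(windows, hmms):
    """Score each window with all M HMMs of a layer; keep the best model's index."""
    labels = []
    for window in windows:
        scores = [forward_log_likelihood(window, *hmm) for hmm in hmms]
        labels.append(max(range(len(hmms)), key=lambda i: scores[i]))
    return labels

# Layer 1 (M = 2, hypothetical): each HMM is (start, transition, emission).
sticky = [[0.9, 0.1], [0.1, 0.9]]
layer1_hmms = [
    ([0.5, 0.5], sticky, [[0.9, 0.1], [0.8, 0.2]]),  # mostly emits symbol 0
    ([0.5, 0.5], sticky, [[0.1, 0.9], [0.2, 0.8]]),  # mostly emits symbol 1
]

# Layer 2 (M = 2, hypothetical): consumes layer-1 labels as its symbols.
state_emit = [[0.9, 0.1], [0.1, 0.9]]
layer2_hmms = [
    ([0.5, 0.5], [[0.1, 0.9], [0.9, 0.1]], state_emit),  # alternating pattern
    ([0.5, 0.5], sticky, state_emit),                    # steady pattern
]

# Toy low-level observations, split into three windows.
windows = [[0, 0, 0, 1], [1, 1, 1, 1], [0, 0, 0, 0]]

layer1_labels = layer_decode(windows, layer1_hmms)
layer2_labels = layer_decode([layer1_labels], layer2_hmms)
print(layer1_labels, layer2_labels)
```

In this toy setup each added layer works on a shorter, more abstract symbol sequence than the one below it, which is one way to read "the number of layers depends on the order of information to be extracted."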

Related papers

Just Noticeable Difference for Large Multimodal Models [70.41467229325345]
Just noticeable difference (JND) is the minimum change that the human visual system (HVS) can perceive. We take an initial attempt and demonstrate that there exist significant visual blind spots in current LMMs. Our research underscores the significance of LMM-JND as a unique perspective for studying LMMs.
arXiv Detail & Related papers (2025-07-01T07:06:32Z)
DPGIIL: Dirichlet Process-Deep Generative Model-Integrated Incremental Learning for Clustering in Transmissibility-based Online Structural Anomaly Detection [0.0]
This work proposes the Dirichlet process-deep generative model-integrated incremental learning (DPGIIL) for clustering. By introducing a DPMM prior to the latent space of DGMs, DPGIIL automatically captures dissimilarities in extracted latent representations, enabling both generative modeling and clustering. Two case studies show that the proposed method outperforms some state-of-the-art approaches in structural anomaly detection and clustering.
arXiv Detail & Related papers (2024-12-06T05:18:58Z)
HMIL: Hierarchical Multi-Instance Learning for Fine-Grained Whole Slide Image Classification [10.203984731917851]
Fine-grained classification of whole slide images (WSIs) is essential in precision oncology, enabling precise cancer diagnosis and personalized treatment strategies. While the multi-instance learning (MIL) paradigm alleviates the computational burden of WSIs, existing MIL methods often overlook hierarchical label correlations. We introduce a novel hierarchical multi-instance learning (HMIL) framework to overcome these limitations.
arXiv Detail & Related papers (2024-11-12T09:22:00Z)
Synthetic Multimodal Question Generation [60.33494376081317]
Multimodal Retrieval Augmented Generation (MMRAG) is a powerful approach to question-answering over multimodal documents. We propose SMMQG, a synthetic data generation framework that generates question and answer pairs directly from multimodal documents. We use SMMQG to generate an MMRAG dataset of 1024 questions over Wikipedia documents and evaluate state-of-the-art models using it.
arXiv Detail & Related papers (2024-07-02T12:57:42Z)
MMCL: Boosting Deformable DETR-Based Detectors with Multi-Class Min-Margin Contrastive Learning for Superior Prohibited Item Detection [8.23801404004195]
Prohibited item detection in X-ray images is one of the most effective security inspection methods. Overlapping phenomena unique to X-ray images lead to the coupling of foreground and background features. We propose a Multi-Class Min-Margin Contrastive Learning (MMCL) method to clarify the category semantic information of content queries.
arXiv Detail & Related papers (2024-06-05T12:07:58Z)
An Information Compensation Framework for Zero-Shot Skeleton-based Action Recognition [49.45660055499103]
Zero-shot human skeleton-based action recognition aims to construct a model that can recognize actions outside the categories seen during training. Previous research has focused on aligning sequences' visual and semantic spatial distributions. We introduce a new loss function sampling method to obtain a tight and robust representation.
arXiv Detail & Related papers (2024-06-02T06:53:01Z)
PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model [49.80313655590392]
PSALM is a powerful extension of the Large Multi-modal Model (LMM) to address the segmentation task challenges. It incorporates a mask decoder and a well-designed input schema to handle a variety of segmentation tasks. The flexible design of PSALM supports joint training across multiple datasets and tasks, leading to improved performance and task generalization.
arXiv Detail & Related papers (2024-03-21T17:50:47Z)
Self-Supervised Neuron Segmentation with Multi-Agent Reinforcement Learning [53.00683059396803]
Masked image modeling (MIM) has been widely used due to its simplicity and effectiveness in recovering original information from masked images. We propose a decision-based MIM that utilizes reinforcement learning (RL) to automatically search for the optimal image masking ratio and masking strategy. Our approach has a significant advantage over alternative self-supervised methods on the task of neuron segmentation.
arXiv Detail & Related papers (2023-10-06T10:40:46Z)
GaitMM: Multi-Granularity Motion Sequence Learning for Gait Recognition [6.877671230651998]
Gait recognition aims to identify individual-specific walking patterns by observing the different periodic movements of each body part. Most existing methods treat each part equally and fail to account for the data redundancy caused by the different step frequencies and sampling rates of gait. In this study, we propose a multi-granularity motion representation (GaitMM) for gait sequence learning.
arXiv Detail & Related papers (2022-09-18T04:07:33Z)
Fuzzy Cognitive Maps and Hidden Markov Models: Comparative Analysis of Efficiency within the Confines of the Time Series Classification Task [0.0]
We explore the application of Hidden Markov Models (HMMs) for time series classification. Four models, HMM NN (HMM, one per series), HMM 1C (HMM, one per class), FCM NN, and FCM 1C, are then studied in a series of experiments.
arXiv Detail & Related papers (2022-04-28T12:41:05Z)
Consistency and Diversity induced Human Motion Segmentation [231.36289425663702]
We propose a novel Consistency and Diversity induced human Motion Segmentation (CDMS) algorithm. Our model factorizes the source and target data into distinct multi-layer feature spaces. A multi-mutual learning strategy is carried out to reduce the domain gap between the source and target data.
arXiv Detail & Related papers (2022-02-10T06:23:56Z)
Malware Classification with GMM-HMM Models [8.02151721194722]
In this paper, we use GMM-HMMs for malware classification and we compare our results to those obtained using discrete HMMs. For our opcode features, GMM-HMMs produce results that are comparable to those obtained using discrete HMMs.
arXiv Detail & Related papers (2021-03-03T23:23:48Z)
Multi-Granularity Reference-Aided Attentive Feature Aggregation for Video-based Person Re-identification [98.7585431239291]
Video-based person re-identification aims at matching the same person across video clips. In this paper, we propose an attentive feature aggregation module, namely the Multi-Granularity Reference-aided Attentive Feature Aggregation (MG-RAFA) module. Our framework achieves state-of-the-art performance on three benchmark datasets.
arXiv Detail & Related papers (2020-03-27T03:49:21Z)

This list is automatically generated from the titles and abstracts of the papers on this site.