A Topological Approach for Motion Track Discrimination
- URL: http://arxiv.org/abs/2102.05705v1
- Date: Wed, 10 Feb 2021 19:25:38 GMT
- Title: A Topological Approach for Motion Track Discrimination
- Authors: Tegan Emerson, Sarah Tymochko, George Stantchev, Jason A. Edelberg,
Michael Wilson, and Colin C. Olson
- Abstract summary: We use characteristics of target tracks extracted from video sequences as data from which to derive distinguishing topological features.
In particular, we calculate persistent homology from time-delayed embeddings of dynamic statistics calculated from motion tracks extracted from a wide field-of-view video stream.
- Score: 10.72000349055617
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Detecting small targets at range is difficult because there is not enough
spatial information present in an image sub-region containing the target to use
correlation-based methods to differentiate it from dynamic confusers present in
the scene. Moreover, this lack of spatial information also disqualifies the use
of most state-of-the-art deep learning image-based classifiers. Here, we use
characteristics of target tracks extracted from video sequences as data from
which to derive distinguishing topological features that help robustly
differentiate targets of interest from confusers. In particular, we calculate
persistent homology from time-delayed embeddings of dynamic statistics
calculated from motion tracks extracted from a wide field-of-view video stream.
In short, we use topological methods to extract features related to target
motion dynamics that are useful for classification and disambiguation and show
that small targets can be detected at range with high probability.
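The pipeline named in the abstract (a dynamic statistic from a track, a time-delayed embedding, then persistent homology) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the synthetic track, the choice of speed as the statistic, and the parameters `dim` and `tau` are all hypothetical, and only 0-dimensional persistence is computed (via minimum-spanning-tree edge lengths) so that the sketch depends on NumPy alone. Higher-dimensional features such as loops would need a dedicated library.

```python
import numpy as np

def delay_embed(x, dim, tau):
    """Delay embedding: row i is (x[i], x[i+tau], ..., x[i+(dim-1)*tau])."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (dim - 1) * tau
    if n <= 0:
        raise ValueError("series too short for this dim and tau")
    return np.stack([x[i * tau : i * tau + n] for i in range(dim)], axis=1)

def h0_persistence(points):
    """Death times of 0-dimensional Vietoris-Rips classes (all born at 0).

    These equal the edge lengths of a minimum spanning tree of the point
    cloud, which is built here with Prim's algorithm.
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best = dist[0].copy()          # cheapest known link from the tree to each point
    deaths = []
    for _ in range(n - 1):
        best[in_tree] = np.inf     # never re-select points already in the tree
        j = int(np.argmin(best))
        deaths.append(best[j])
        in_tree[j] = True
        best = np.minimum(best, dist[j])
    return np.sort(np.array(deaths))

# Hypothetical (x, y) track; a real track would come from a video tracker.
t = np.linspace(0.0, 4.0 * np.pi, 200)
track = np.stack([np.cos(t), np.sin(2.0 * t)], axis=1)

# Assumed dynamic statistic: frame-to-frame speed along the track.
speed = np.linalg.norm(np.diff(track, axis=0), axis=1)

cloud = delay_embed(speed, dim=3, tau=5)   # assumed embedding parameters
deaths = h0_persistence(cloud)             # candidate feature vector for a classifier
```

Summary statistics of `deaths` (for example the largest few values) could then serve as classifier inputs; which features the paper actually uses is not stated in the abstract.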
Related papers
- Leveraging Activations for Superpixel Explanations [2.8792218859042453]
Saliency methods have become standard in the explanation toolkit of deep neural networks.
In this paper, we aim to avoid relying on segmenters by extracting a segmentation from the activations of a deep neural network image classifier.
Our so-called Neuro-Activated Superpixels (NAS) can isolate the regions of interest in the input relevant to the model's prediction.
arXiv Detail & Related papers (2024-06-07T13:37:45Z) - Holistic Representation Learning for Multitask Trajectory Anomaly Detection [65.72942351514956]
We propose a holistic representation of skeleton trajectories to learn expected motions across segments at different times.
We encode temporally occluded trajectories, jointly learn latent representations of the occluded segments, and reconstruct trajectories based on expected motions across different temporal segments.
arXiv Detail & Related papers (2023-11-03T11:32:53Z) - Multimodal Graph Learning for Deepfake Detection [10.077496841634135]
Existing deepfake detectors face several challenges in achieving robustness and generalization.
We propose a novel framework, namely Multimodal Graph Learning (MGL), that leverages information from multiple modalities.
Our proposed method aims to effectively identify and utilize distinguishing features for deepfake detection.
arXiv Detail & Related papers (2022-09-12T17:17:49Z) - Correlation-Aware Deep Tracking [83.51092789908677]
We propose a novel target-dependent feature network inspired by the self-/cross-attention scheme.
Our network deeply embeds cross-image feature correlation in multiple layers of the feature network.
Our model can be flexibly pre-trained on abundant unpaired images, leading to notably faster convergence than the existing methods.
arXiv Detail & Related papers (2022-03-03T11:53:54Z) - Video Salient Object Detection via Contrastive Features and Attention Modules [106.33219760012048]
We propose a network with attention modules to learn contrastive features for video salient object detection.
A co-attention formulation is utilized to combine the low-level and high-level features.
We show that the proposed method requires less computation, and performs favorably against the state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-03T17:40:32Z) - Spatial-Temporal Correlation and Topology Learning for Person Re-Identification in Videos [78.45050529204701]
We propose a novel framework to pursue discriminative and robust representation by modeling cross-scale spatial-temporal correlation.
CTL utilizes a CNN backbone and a key-points estimator to extract semantic local features from the human body.
It explores a context-reinforced topology to construct multi-scale graphs by considering both global contextual information and the physical connections of the human body.
arXiv Detail & Related papers (2021-04-15T14:32:12Z) - Video Anomaly Detection by Estimating Likelihood of Representations [21.879366166261228]
Video anomaly detection is a challenging task because it involves solving many sub-tasks such as motion representation, object localization and action recognition.
Traditionally, solutions to this task have focused on the mapping between video frames and their low-dimensional features, while ignoring the spatial connections of those features.
Recent solutions focus on analyzing these spatial connections by using hard clustering techniques, such as K-Means, or applying neural networks to map latent features to a general understanding.
In order to solve video anomaly detection in the latent feature space, we propose a deep probabilistic model that transforms this task into a density estimation problem.
arXiv Detail & Related papers (2020-12-02T19:16:22Z) - Self-supervised Segmentation via Background Inpainting [96.10971980098196]
We introduce a self-supervised detection and segmentation approach that can work with single images captured by a potentially moving camera.
We exploit a self-supervised loss function to train a proposal-based segmentation network.
We apply our method to human detection and segmentation in images that visually depart from those of standard benchmarks and outperform existing self-supervised methods.
arXiv Detail & Related papers (2020-11-11T08:34:40Z) - Benchmarking Unsupervised Object Representations for Video Sequences [111.81492107649889]
We compare the perceptual abilities of four object-centric approaches: ViMON, OP3, TBA and SCALOR.
Our results suggest that the architectures with unconstrained latent representations learn more powerful representations in terms of object detection, segmentation and tracking.
Our benchmark may provide fruitful guidance towards learning more robust object-centric video representations.
arXiv Detail & Related papers (2020-06-12T09:37:24Z) - Applying r-spatiogram in object tracking for occlusion handling [16.36552899280708]
The aim of video tracking is to accurately locate a moving target in a video sequence and discriminate the target from non-targets in the feature space of the sequence.
In this paper, we use the basic idea of many trackers, which consists of three main components of the reference model: object modeling, object detection and localization, and model updating.
arXiv Detail & Related papers (2020-03-18T02:42:51Z)