Towards Class-agnostic Tracking Using Feature Decorrelation in Point
Clouds
- URL: http://arxiv.org/abs/2202.13524v1
- Date: Mon, 28 Feb 2022 03:33:03 GMT
- Title: Towards Class-agnostic Tracking Using Feature Decorrelation in Point
Clouds
- Authors: Shengjing Tian, Jun Liu, and Xiuping Liu
- Abstract summary: Single object tracking in point clouds has been attracting more and more attention owing to the presence of LiDAR sensors in 3D vision.
Existing methods based on deep neural networks focus mainly on training different models for different categories.
In this work, we turn our thoughts to a more challenging task in the LiDAR point clouds, class-agnostic tracking.
- Score: 9.321928362927965
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Single object tracking in point clouds has been attracting more and more
attention owing to the presence of LiDAR sensors in 3D vision. However, the
existing methods based on deep neural networks focus mainly on training
different models for different categories, which makes them unable to perform
well in real-world applications when encountering classes unseen during the
training phase. In this work, we thus turn our thoughts to a more challenging
task in the LiDAR point clouds, class-agnostic tracking, where a general model
is supposed to be learned for any specified targets of both observed and unseen
categories. In particular, we first investigate the class-agnostic performances
of the state-of-the-art trackers via exposing the unseen categories to them
during testing, finding that a key factor for class-agnostic tracking is how to
constrain fused features between the template and search region to maintain
generalization when the distribution is shifted from observed to unseen
classes. Therefore, we propose a feature decorrelation method to address this
problem, which eliminates the spurious correlations of the fused features
through a set of learned weights and further makes the search region consistent
among foreground points and distinctive between foreground and background
points. Experiments on KITTI and NuScenes demonstrate that the proposed
method achieves considerable improvements over the advanced trackers P2B and
BAT, especially when tracking unseen objects.
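The abstract describes removing spurious feature correlations "through a set of learned weights" but gives no formulation. As a rough illustration only, the sketch below uses a generic sample-reweighting scheme: it learns one non-negative weight per sample so that the weighted covariance between feature dimensions has small off-diagonal entries. This is an assumption-laden toy in NumPy, not the authors' implementation; the function names, the softmax parameterization of the weights, the plain gradient-descent loop, and the synthetic data are all hypothetical choices made for this example.

```python
import numpy as np


def offdiag_cov_penalty(features, weights):
    """Return (penalty, off-diagonal covariance, weighted mean).

    penalty is the sum of squared off-diagonal entries of the
    weighted covariance matrix; weights are assumed to sum to 1.
    """
    mean = weights @ features
    centered = features - mean
    cov = centered.T @ (weights[:, None] * centered)
    off = cov - np.diag(np.diag(cov))
    return float(np.sum(off ** 2)), off, mean


def softmax(logits):
    shifted = np.exp(logits - logits.max())
    return shifted / shifted.sum()


def learn_decorrelation_weights(features, steps=500, lr=0.5):
    """Hypothetical helper: gradient descent on softmax logits.

    Minimizes the off-diagonal penalty so that, under the learned
    sample weights, feature dimensions are (nearly) uncorrelated.
    """
    logits = np.zeros(features.shape[0])
    for _ in range(steps):
        weights = softmax(logits)
        _, off, mean = offdiag_cov_penalty(features, weights)
        # Gradient of the penalty with respect to each sample weight.
        proj = features @ off
        grad_w = 2.0 * (np.sum(proj * features, axis=1) - 2.0 * proj @ mean)
        # Chain rule through the softmax parameterization.
        logits -= lr * weights * (grad_w - weights @ grad_w)
    return softmax(logits)


# Toy data: two feature dimensions driven by one shared latent factor,
# so they are strongly correlated under uniform weights.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
toy_features = latent @ np.ones((1, 2)) + 0.3 * rng.normal(size=(200, 2))

uniform = np.full(200, 1.0 / 200)
penalty_before = offdiag_cov_penalty(toy_features, uniform)[0]
learned = learn_decorrelation_weights(toy_features)
penalty_after = offdiag_cov_penalty(toy_features, learned)[0]
print(f"penalty before: {penalty_before:.4f}, after: {penalty_after:.4f}")
```

In a tracker, weights of this kind would presumably be applied to the fused template and search-region features during training; how the paper does so is not stated in the abstract, so that step is left out here.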
Related papers
- Point Cloud Understanding via Attention-Driven Contrastive Learning [64.65145700121442]
Transformer-based models have advanced point cloud understanding by leveraging self-attention mechanisms.
PointACL is an attention-driven contrastive learning framework designed to address these limitations.
Our method employs an attention-driven dynamic masking strategy that guides the model to focus on under-attended regions.
arXiv Detail & Related papers (2024-11-22T05:41:00Z) - Debiased Novel Category Discovering and Localization [40.02326438622898]
We focus on the challenging problem of Novel Class Discovery and Localization (NCDL).
We propose a Debiased Region Mining (DRM) approach that combines a class-agnostic Region Proposal Network (RPN) and a class-aware RPN.
We conduct extensive experiments on the NCDL benchmark, and the results demonstrate that the proposed DRM approach significantly outperforms previous methods.
arXiv Detail & Related papers (2024-02-29T03:09:16Z) - Weakly-Supervised Action Localization by Hierarchically-structured
Latent Attention Modeling [19.683714649646603]
Weakly-supervised action localization aims to recognize and localize action instances in untrimmed videos with only video-level labels.
Most existing models rely on multiple instance learning (MIL), where predictions of unlabeled instances are supervised by classifying labeled bags.
We propose a novel attention-based hierarchically-structured latent model to learn the temporal variations of feature semantics.
arXiv Detail & Related papers (2023-08-19T08:45:49Z) - Adaptive Base-class Suppression and Prior Guidance Network for One-Shot
Object Detection [9.44806128120871]
One-shot object detection (OSOD) aims to detect all object instances of the category specified by a query image.
We propose a novel framework, namely the Base-class Suppression and Prior Guidance (BSPG) network, to overcome the problem.
Specifically, the objects of base categories can be explicitly detected by a base-class predictor and adaptively eliminated by our base-class suppression module.
A prior guidance module is designed to calculate the correlation of high-level features in a non-parametric manner, producing a class-agnostic prior map to provide the target features with rich semantic cues and guide the subsequent detection process.
arXiv Detail & Related papers (2023-03-24T19:04:30Z) - Few-Shot Point Cloud Semantic Segmentation via Contrastive
Self-Supervision and Multi-Resolution Attention [6.350163959194903]
We propose a contrastive self-supervision framework for few-shot learning pretraining.
Specifically, we implement a novel contrastive learning approach with a learnable augmentor for a 3D point cloud.
We develop a multi-resolution attention module using both the nearest and farthest points to extract the local and global point information more effectively.
arXiv Detail & Related papers (2023-02-21T07:59:31Z) - Learning Classifiers of Prototypes and Reciprocal Points for Universal
Domain Adaptation [79.62038105814658]
Universal Domain Adaptation aims to transfer knowledge between datasets by handling two shifts: domain shift and category shift.
The main challenge is correctly distinguishing the unknown target samples while adapting the distribution of known-class knowledge from source to target.
Most existing methods approach this problem by first training a target-adapted known-class classifier and then relying on a single threshold to distinguish unknown target samples.
arXiv Detail & Related papers (2022-12-16T09:01:57Z) - Zero-Shot Temporal Action Detection via Vision-Language Prompting [134.26292288193298]
We propose a novel zero-Shot Temporal Action detection model via Vision-LanguagE prompting (STALE).
Our model significantly outperforms state-of-the-art alternatives.
Our model also yields superior results on supervised TAD over recent strong competitors.
arXiv Detail & Related papers (2022-07-17T13:59:46Z) - SASA: Semantics-Augmented Set Abstraction for Point-based 3D Object
Detection [78.90102636266276]
We propose a novel set abstraction method named Semantics-Augmented Set Abstraction (SASA).
Based on the estimated point-wise foreground scores, we then propose a semantics-guided point sampling algorithm to help retain more important foreground points during down-sampling.
In practice, SASA is shown to be effective at identifying valuable points related to foreground objects and at improving feature learning for point-based 3D detection.
arXiv Detail & Related papers (2022-01-06T08:54:47Z) - Activation to Saliency: Forming High-Quality Labels for Unsupervised
Salient Object Detection [54.92703325989853]
We propose a two-stage Activation-to-Saliency (A2S) framework that effectively generates high-quality saliency cues.
No human annotations are involved in our framework during the whole training process.
Our framework reports significant performance gains compared with existing USOD methods.
arXiv Detail & Related papers (2021-12-07T11:54:06Z) - One Point is All You Need: Directional Attention Point for Feature
Learning [51.44837108615402]
We present a novel attention-based mechanism for learning enhanced point features for tasks such as point cloud classification and segmentation.
We show that our attention mechanism can be easily incorporated into state-of-the-art point cloud classification and segmentation networks.
arXiv Detail & Related papers (2020-12-11T11:45:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.