Temporal Action Detection with Multi-level Supervision
- URL: http://arxiv.org/abs/2011.11893v2
- Date: Thu, 18 Feb 2021 10:23:05 GMT
- Title: Temporal Action Detection with Multi-level Supervision
- Authors: Baifeng Shi, Qi Dai, Judy Hoffman, Kate Saenko, Trevor Darrell,
Huijuan Xu
- Abstract summary: We introduce the Semi-supervised Action Detection (SSAD) task with a mixture of labeled and unlabeled data.
We analyze different types of errors in the proposed SSAD baselines which are directly adapted from the semi-supervised classification task.
We incorporate weakly-labeled data into SSAD and propose Omni-supervised Action Detection (OSAD) with three levels of supervision.
- Score: 116.55596693897388
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Training temporal action detection in videos requires large amounts of
labeled data, yet such annotation is expensive to collect. Incorporating
unlabeled or weakly-labeled data to train action detection models could help
reduce annotation cost. In this work, we first introduce the Semi-supervised
Action Detection (SSAD) task with a mixture of labeled and unlabeled data and
analyze different types of errors in the proposed SSAD baselines which are
directly adapted from the semi-supervised classification task. To alleviate the
main error of action incompleteness (i.e., missing parts of actions) in SSAD
baselines, we further design an unsupervised foreground attention (UFA) module
utilizing the "independence" between foreground and background motion. Then we
incorporate weakly-labeled data into SSAD and propose Omni-supervised Action
Detection (OSAD) with three levels of supervision. An information bottleneck
(IB) suppressing the scene information in non-action frames while preserving
the action information is designed to help overcome the accompanying
action-context confusion problem in OSAD baselines. We extensively benchmark
against the baselines for SSAD and OSAD on our created data splits in THUMOS14
and ActivityNet1.2, and demonstrate the effectiveness of the proposed UFA and
IB methods. Lastly, the benefit of our full OSAD-IB model under limited
annotation budgets is shown by exploring the optimal annotation strategy for
labeled, unlabeled and weakly-labeled data.
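The annotation-budget question raised at the end of the abstract can be made concrete with a small sketch. The code below only enumerates how a fixed budget could be divided among the three supervision levels; it does not reproduce the paper's experiments or say which mix is best, since that requires training a detector on each mix. The function name `feasible_splits` and the per-video costs are hypothetical, chosen purely for illustration.

```python
from typing import List, NamedTuple


class Split(NamedTuple):
    labeled: int    # videos annotated with action classes and temporal boundaries
    weak: int       # videos annotated with video-level action classes only
    unlabeled: int  # videos left without any annotation


def feasible_splits(n_videos: int, budget: float,
                    cost_full: float, cost_weak: float,
                    step: int = 1) -> List[Split]:
    """Enumerate annotation mixes whose total cost fits within the budget.

    cost_full and cost_weak are assumed per-video annotation costs
    (hypothetical units); unlabeled videos are assumed to cost nothing.
    """
    splits = []
    for labeled in range(0, n_videos + 1, step):
        for weak in range(0, n_videos - labeled + 1, step):
            cost = labeled * cost_full + weak * cost_weak
            if cost <= budget:
                splits.append(Split(labeled, weak, n_videos - labeled - weak))
    return splits


if __name__ == "__main__":
    # Hypothetical numbers: full labels cost 4x as much as weak labels.
    for split in feasible_splits(n_videos=100, budget=120.0,
                                 cost_full=4.0, cost_weak=1.0, step=20):
        print(split)
```

Each returned split is a candidate training configuration; comparing them would mean training and evaluating one model per split.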
Related papers
- Semi-supervised Open-World Object Detection [74.95267079505145]
We introduce a more realistic formulation, named semi-supervised open-world detection (SS-OWOD).
We demonstrate that the performance of the state-of-the-art OWOD detector dramatically deteriorates in the proposed SS-OWOD setting.
Our experiments on 4 datasets including MS COCO, PASCAL, Objects365 and DOTA demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-02-25T07:12:51Z)
- Open-Set Semi-Supervised Object Detection [43.464223594166654]
Recent developments for Semi-Supervised Object Detection (SSOD) have shown the promise of leveraging unlabeled data to improve an object detector.
We consider a more practical yet challenging problem, Open-Set Semi-Supervised Object Detection (OSSOD).
Our proposed framework effectively addresses the semantic expansion issue and shows consistent improvements on many OSSOD benchmarks.
arXiv Detail & Related papers (2022-08-29T17:04:30Z)
- SIOD: Single Instance Annotated Per Category Per Image for Object Detection [67.64774488115299]
We propose the Single Instance annotated Object Detection (SIOD), requiring only one instance annotation for each existing category in an image.
Degraded from inter-task (WSOD) or inter-image (SSOD) discrepancies to the intra-image discrepancy, SIOD provides more reliable and rich prior knowledge for mining the rest of unlabeled instances.
Under the SIOD setting, we propose a simple yet effective framework, termed Dual-Mining (DMiner), which consists of a Similarity-based Pseudo Label Generating module (SPLG) and a Pixel-level Group Contrastive Learning module (PGCL).
arXiv Detail & Related papers (2022-03-29T08:49:51Z)
- Data-Efficient and Interpretable Tabular Anomaly Detection [54.15249463477813]
We propose a novel framework that adapts a white-box model class, Generalized Additive Models, to detect anomalies.
In addition, the proposed framework, DIAD, can incorporate a small amount of labeled data to further boost anomaly detection performance in semi-supervised settings.
arXiv Detail & Related papers (2022-03-03T22:02:56Z)
- WSSOD: A New Pipeline for Weakly- and Semi-Supervised Object Detection [75.80075054706079]
We propose a weakly- and semi-supervised object detection framework (WSSOD).
An agent detector is first trained on a joint dataset and then used to predict pseudo bounding boxes on weakly-annotated images.
The proposed framework demonstrates remarkable performance on the PASCAL-VOC and MSCOCO benchmarks, achieving performance comparable to that obtained in fully-supervised settings.
arXiv Detail & Related papers (2021-05-21T11:58:50Z)
- EHSOD: CAM-Guided End-to-end Hybrid-Supervised Object Detection with Cascade Refinement [53.69674636044927]
We present EHSOD, an end-to-end hybrid-supervised object detection system.
It can be trained in one shot on both fully and weakly-annotated data.
It achieves comparable results on multiple object detection benchmarks with only 30% fully-annotated data.
arXiv Detail & Related papers (2020-02-18T08:04:58Z)
- Task-Aware Variational Adversarial Active Learning [42.334671410592065]
We propose task-aware variational adversarial AL (TA-VAAL) that modifies task-agnostic VAAL.
Our proposed TA-VAAL outperforms state-of-the-art methods on various benchmark datasets for classification with balanced / imbalanced labels.
arXiv Detail & Related papers (2020-02-11T22:00:48Z)