Identifying Auxiliary or Adversarial Tasks Using Necessary Condition
Analysis for Adversarial Multi-task Video Understanding
- URL: http://arxiv.org/abs/2208.10077v1
- Date: Mon, 22 Aug 2022 06:26:11 GMT
- Title: Identifying Auxiliary or Adversarial Tasks Using Necessary Condition
Analysis for Adversarial Multi-task Video Understanding
- Authors: Stephen Su, Samuel Kwong, Qingyu Zhao, De-An Huang, Juan Carlos
Niebles, Ehsan Adeli
- Abstract summary: We propose a generalized notion of multi-task learning by incorporating both auxiliary tasks that the model should perform well on and adversarial tasks that the model should not perform well on.
Our proposed framework, Adversarial Multi-Task Neural Networks (AMT), penalizes adversarial tasks, which NCA identifies as scene recognition.
We show that our approach improves accuracy by ~3% and encourages the model to attend to action features instead of correlation-biasing scene features.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: There has been an increasing interest in multi-task learning for video
understanding in recent years. In this work, we propose a generalized notion of
multi-task learning by incorporating both auxiliary tasks that the model should
perform well on and adversarial tasks that the model should not perform well
on. We employ Necessary Condition Analysis (NCA) as a data-driven approach for
deciding which category each task should fall into. Our proposed framework,
Adversarial Multi-Task Neural Networks (AMT), penalizes adversarial
tasks, determined by NCA to be scene recognition in the Holistic Video
Understanding (HVU) dataset, to improve action recognition. This upends the
common assumption that the model should always be encouraged to do well on all
tasks in multi-task learning. Simultaneously, AMT still retains all the
benefits of multi-task learning as a generalization of existing methods and
uses object recognition as an auxiliary task to aid action recognition. We
introduce two challenging Scene-Invariant test splits of HVU, where the model
is evaluated on action-scene co-occurrences not encountered in training. We
show that our approach improves accuracy by ~3% and encourages the model to
attend to action features instead of correlation-biasing scene features.
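The combined objective described in the abstract can be illustrated with a minimal, framework-free sketch: the action loss and the auxiliary (object) loss are minimized as usual, while the adversarial (scene) loss enters with a negative sign so that training pushes scene recognition performance down. The function names and the lambda weights below are illustrative assumptions, not values from the paper.

```python
import math

def cross_entropy(logits, label):
    """Softmax cross-entropy for a single example (numerically stable)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[label]

def amt_loss(action_logits, object_logits, scene_logits,
             y_action, y_object, y_scene,
             lam_aux=0.3, lam_adv=0.1):
    # Action (target) and object (auxiliary) losses are minimized;
    # the scene (adversarial) loss is subtracted, so gradient descent
    # discourages the shared features from encoding scene cues.
    return (cross_entropy(action_logits, y_action)
            + lam_aux * cross_entropy(object_logits, y_object)
            - lam_adv * cross_entropy(scene_logits, y_scene))
```

In a full model, all three heads would share a backbone, so the subtracted scene term acts on the shared representation rather than on the scene head alone.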
Related papers
- Temporal2Seq: A Unified Framework for Temporal Video Understanding Tasks [26.007846170517055]
We propose a single unified framework, coined as Temporal2Seq, to formulate the output of temporal video understanding tasks as a sequence of discrete tokens.
With this unified token representation, Temporal2Seq can train a generalist model within a single architecture on different video understanding tasks.
We evaluate our Temporal2Seq generalist model on the corresponding test sets of three tasks, demonstrating that Temporal2Seq can produce reasonable results on various tasks.
arXiv Detail & Related papers (2024-09-27T06:37:47Z)
- Distribution Matching for Multi-Task Learning of Classification Tasks: a Large-Scale Study on Faces & Beyond [62.406687088097605]
Multi-Task Learning (MTL) is a framework, where multiple related tasks are learned jointly and benefit from a shared representation space.
We show that MTL can be successful with classification tasks with little, or non-overlapping annotations.
We propose a novel approach, where knowledge exchange is enabled between the tasks via distribution matching.
arXiv Detail & Related papers (2024-01-02T14:18:11Z)
- Composite Learning for Robust and Effective Dense Predictions [81.2055761433725]
Multi-task learning promises better model generalization on a target task by jointly optimizing it with an auxiliary task.
We find that jointly training a dense prediction (target) task with a self-supervised (auxiliary) task can consistently improve the performance of the target task, while eliminating the need for labeling auxiliary tasks.
arXiv Detail & Related papers (2022-10-13T17:59:16Z)
- Explaining the Effectiveness of Multi-Task Learning for Efficient Knowledge Extraction from Spine MRI Reports [2.5953185061765884]
We show that a single multi-tasking model can match the performance of task specific models.
We validate our observations on our internal radiologist-annotated datasets on the cervical and lumbar spine.
arXiv Detail & Related papers (2022-05-06T01:51:19Z)
- Human-Centered Prior-Guided and Task-Dependent Multi-Task Representation Learning for Action Recognition Pre-Training [8.571437792425417]
We propose a novel action recognition pre-training framework, which exploits human-centered prior knowledge that generates more informative representation.
Specifically, we distill knowledge from a human parsing model to enrich the semantic capability of representation.
In addition, we combine knowledge distillation with contrastive learning to constitute a task-dependent multi-task framework.
arXiv Detail & Related papers (2022-04-27T06:51:31Z)
- On Steering Multi-Annotations per Sample for Multi-Task Learning [79.98259057711044]
The study of multi-task learning has drawn great attention from the community.
Despite the remarkable progress, the challenge of optimally learning different tasks simultaneously remains to be explored.
Previous works attempt to modify the gradients from different tasks. Yet these methods rely on a subjective assumption about the relationship between tasks, and the modified gradients may be less accurate.
In this paper, we introduce Stochastic Task Allocation (STA), a mechanism that addresses this issue by randomly allocating each sample a subset of tasks.
For further progress, we propose Interleaved Stochastic Task Allocation (ISTA) to iteratively allocate all tasks.
arXiv Detail & Related papers (2022-03-06T11:57:18Z)
- Combining Modular Skills in Multitask Learning [149.8001096811708]
A modular design encourages neural models to disentangle and recombine different facets of knowledge to generalise more systematically to new tasks.
In this work, we assume each task is associated with a subset of latent discrete skills from a (potentially small) inventory.
We find that the modular design of a network significantly increases sample efficiency in reinforcement learning and few-shot generalisation in supervised learning.
arXiv Detail & Related papers (2022-02-28T16:07:19Z)
- Multi-View representation learning in Multi-Task Scene [4.509968166110557]
We propose a novel semi-supervised algorithm, termed Multi-Task Multi-View learning based on Common and Special Features (MTMVCSF).
An anti-noise multi-task multi-view algorithm called AN-MTMVCSF is proposed, which has a strong adaptability to noise labels.
The effectiveness of these algorithms is demonstrated by a series of carefully designed experiments on both real-world and synthetic data.
arXiv Detail & Related papers (2022-01-15T11:26:28Z)
- Learning to Relate Depth and Semantics for Unsupervised Domain Adaptation [87.1188556802942]
We present an approach for encoding visual task relationships to improve model performance in an Unsupervised Domain Adaptation (UDA) setting.
We propose a novel Cross-Task Relation Layer (CTRL), which encodes task dependencies between the semantic and depth predictions.
Furthermore, we propose an Iterative Self-Learning (ISL) training scheme, which exploits semantic pseudo-labels to provide extra supervision on the target domain.
arXiv Detail & Related papers (2021-05-17T13:42:09Z)
- Distribution Matching for Heterogeneous Multi-Task Learning: a Large-scale Face Study [75.42182503265056]
Multi-Task Learning has emerged as a methodology in which multiple tasks are jointly learned by a shared learning algorithm.
We deal with heterogeneous MTL, simultaneously addressing detection, classification & regression problems.
We build FaceBehaviorNet, the first framework for large-scale face analysis, by jointly learning all facial behavior tasks.
arXiv Detail & Related papers (2021-05-08T22:26:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.