UniNet: A Unified Scene Understanding Network and Exploring Multi-Task Relationships through the Lens of Adversarial Attacks
- URL: http://arxiv.org/abs/2108.04584v1
- Date: Tue, 10 Aug 2021 11:00:56 GMT
- Title: UniNet: A Unified Scene Understanding Network and Exploring Multi-Task Relationships through the Lens of Adversarial Attacks
- Authors: NareshKumar Gurulingan, Elahe Arani, and Bahram Zonooz
- Abstract summary: Single-task vision networks extract information based only on certain aspects of the scene.
In multi-task learning (MTL), single tasks are jointly learned, thereby providing an opportunity for tasks to share information.
We develop UniNet, a unified scene understanding network that accurately and efficiently infers vital vision tasks.
- Score: 1.1470070927586016
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Scene understanding is crucial for autonomous systems that intend
to operate in the real world. Single-task vision networks extract information
based only on certain aspects of the scene. In multi-task learning (MTL), on the other hand,
these single tasks are jointly learned, thereby providing an opportunity for
tasks to share information and obtain a more comprehensive understanding. To
this end, we develop UniNet, a unified scene understanding network that
accurately and efficiently infers vital vision tasks including object
detection, semantic segmentation, instance segmentation, monocular depth
estimation, and monocular instance depth prediction. As these tasks look at
different semantic and geometric information, they can either complement or
conflict with each other. Therefore, understanding inter-task relationships can
provide useful cues to enable complementary information sharing. We evaluate
the task relationships in UniNet through the lens of adversarial attacks based
on the notion that they can exploit learned biases and task interactions in the
neural network. Extensive experiments on the Cityscapes dataset using
untargeted and targeted attacks reveal that semantic tasks strongly interact
amongst themselves, and the same holds for geometric tasks. Additionally, we
show that the relationship between semantic and geometric tasks is asymmetric
and their interaction becomes weaker as we move towards higher-level
representations.
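
To make the attack-based probing concrete, below is a minimal sketch, assuming PyTorch, of how an untargeted attack on one task can expose its interaction with another. The two-head `TinyMultiTaskNet`, the helper `pgd_attack_on_segmentation`, and the PGD hyperparameters are all illustrative assumptions, not UniNet's architecture or the paper's attack settings: the perturbation maximizes only the segmentation loss, and any resulting increase in depth error indicates that the two tasks share representations the attack can exploit.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMultiTaskNet(nn.Module):
    """Toy stand-in for a unified scene-understanding network:
    one shared encoder feeding two task-specific heads."""
    def __init__(self, num_classes: int = 19):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
        )
        self.seg_head = nn.Conv2d(32, num_classes, 1)  # semantic segmentation logits
        self.depth_head = nn.Conv2d(32, 1, 1)          # monocular depth

    def forward(self, x):
        feats = self.encoder(x)  # features shared by both tasks
        return self.seg_head(feats), self.depth_head(feats)

def pgd_attack_on_segmentation(model, image, seg_target,
                               eps=8 / 255, alpha=2 / 255, steps=10):
    """Untargeted PGD that perturbs the input to maximize only the
    segmentation loss, leaving the depth task out of the objective."""
    adv = (image + torch.empty_like(image).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        adv.requires_grad_(True)
        seg_logits, _ = model(adv)
        loss = F.cross_entropy(seg_logits, seg_target)
        grad, = torch.autograd.grad(loss, adv)
        with torch.no_grad():
            adv = adv + alpha * grad.sign()               # ascend the seg loss
            adv = image + (adv - image).clamp(-eps, eps)  # project to eps-ball
            adv = adv.clamp(0, 1)                         # keep a valid image
    return adv.detach()

if __name__ == "__main__":
    torch.manual_seed(0)
    model = TinyMultiTaskNet().eval()
    image = torch.rand(1, 3, 64, 128)                # random stand-in input
    seg_target = torch.randint(0, 19, (1, 64, 128))  # per-pixel class labels
    depth_target = torch.rand(1, 1, 64, 128)

    adv = pgd_attack_on_segmentation(model, image, seg_target)
    with torch.no_grad():
        _, depth_clean = model(image)
        _, depth_adv = model(adv)

    # If attacking only the semantic head also degrades depth, the two
    # tasks share representations that the attack exploits.
    print(f"depth L1 error, clean image:  {F.l1_loss(depth_clean, depth_target).item():.4f}")
    print(f"depth L1 error, after attack: {F.l1_loss(depth_adv, depth_target).item():.4f}")
```

A targeted variant of the same sketch would descend toward an attacker-chosen output instead of ascending the loss; running attacks in both directions (perturb semantics and read out depth, and vice versa) is what surfaces the asymmetric semantic-geometric relationship reported in the abstract.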
Related papers
- Visual-Geometric Collaborative Guidance for Affordance Learning [63.038406948791454]
We propose a visual-geometric collaborative guided affordance learning network that incorporates visual and geometric cues.
Our method outperforms representative models in both objective metrics and visual quality.
arXiv Detail & Related papers (2024-10-15T07:35:51Z)
- Distribution Matching for Multi-Task Learning of Classification Tasks: a Large-Scale Study on Faces & Beyond [62.406687088097605]
Multi-Task Learning (MTL) is a framework where multiple related tasks are learned jointly and benefit from a shared representation space.
We show that MTL can succeed on classification tasks even with scarce or non-overlapping annotations.
We propose a novel approach, where knowledge exchange is enabled between the tasks via distribution matching.
arXiv Detail & Related papers (2024-01-02T14:18:11Z)
- Proto-Value Networks: Scaling Representation Learning with Auxiliary Tasks [33.98624423578388]
Auxiliary tasks improve representations learned by deep reinforcement learning agents.
We derive a new family of auxiliary tasks based on the successor measure.
We show that proto-value networks produce rich features that may be used to obtain performance comparable to established algorithms.
arXiv Detail & Related papers (2023-04-25T04:25:08Z)
- Learning Good Features to Transfer Across Tasks and Domains [16.05821129333396]
We first show that knowledge can be shared across tasks by learning a mapping between task-specific deep features in a given domain.
Then, we show that this mapping function, implemented by a neural network, is able to generalize to novel unseen domains.
arXiv Detail & Related papers (2023-01-26T18:49:39Z)
- Learning to Relate Depth and Semantics for Unsupervised Domain Adaptation [87.1188556802942]
We present an approach for encoding visual task relationships to improve model performance in an Unsupervised Domain Adaptation (UDA) setting.
We propose a novel Cross-Task Relation Layer (CTRL), which encodes task dependencies between the semantic and depth predictions.
Furthermore, we propose an Iterative Self-Learning (ISL) training scheme, which exploits semantic pseudo-labels to provide extra supervision on the target domain.
arXiv Detail & Related papers (2021-05-17T13:42:09Z)
- SOSD-Net: Joint Semantic Object Segmentation and Depth Estimation from Monocular Images [94.36401543589523]
We introduce the concept of semantic objectness to exploit the geometric relationship between these two tasks.
We then propose a Semantic Object and Depth Estimation Network (SOSD-Net) based on the objectness assumption.
To the best of our knowledge, SOSD-Net is the first network that exploits the geometry constraint for simultaneous monocular depth estimation and semantic segmentation.
arXiv Detail & Related papers (2021-01-19T02:41:03Z)
- Understanding the Role of Individual Units in a Deep Neural Network [85.23117441162772]
We present an analytic framework to systematically identify hidden units within image classification and image generation networks.
First, we analyze a convolutional neural network (CNN) trained on scene classification and discover units that match a diverse set of object concepts.
Second, we use a similar analytic method to analyze a generative adversarial network (GAN) model trained to generate scenes.
arXiv Detail & Related papers (2020-09-10T17:59:10Z)
- Taskology: Utilizing Task Relations at Scale [28.09712466727001]
We show that we can leverage the inherent relationships among collections of tasks when they are trained jointly.
Explicitly utilizing the relationships between tasks improves their performance while dramatically reducing the need for labeled data (a minimal cross-task consistency sketch follows this list).
We demonstrate our framework on subsets of the following collection of tasks: depth and normal prediction, semantic segmentation, 3D motion and ego-motion estimation, and object tracking and 3D detection in point clouds.
arXiv Detail & Related papers (2020-05-14T22:53:46Z)
- LSM: Learning Subspace Minimization for Low-level Vision [78.27774638569218]
We replace the regularization term with a learnable subspace constraint, and preserve the data term to exploit domain knowledge.
This learning subspace minimization (LSM) framework unifies the network structures and the parameters for many low-level vision tasks.
We demonstrate our LSM framework on four low-level tasks including interactive image segmentation, video segmentation, stereo matching, and optical flow, and validate the network on various datasets.
arXiv Detail & Related papers (2020-04-20T10:49:38Z)
- Context-Aware Multi-Task Learning for Traffic Scene Recognition in Autonomous Vehicles [10.475998113861895]
We propose an algorithm to jointly learn the task-specific and shared representations by adopting a multi-task learning network.
Experiments on the large-scale dataset HSD demonstrate the effectiveness and superiority of our network over state-of-the-art methods.
arXiv Detail & Related papers (2020-04-03T03:09:26Z)
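
As referenced in the Taskology entry above, several of these works couple tasks through cross-task consistency rather than shared weights. Below is a minimal sketch of that general idea, assuming PyTorch; `normals_from_depth` and the finite-difference normal computation are illustrative assumptions, not any of the papers' actual constraints: normals derived from a predicted depth map are pushed to agree with directly predicted normals, yielding a supervision signal that requires no labels.

```python
import torch
import torch.nn.functional as F

def normals_from_depth(depth: torch.Tensor) -> torch.Tensor:
    """Approximate unit surface normals from a (B, 1, H, W) depth map
    using finite differences: n is proportional to (-dz/dx, -dz/dy, 1)."""
    dz_dx = depth[:, :, :, 1:] - depth[:, :, :, :-1]  # horizontal gradient
    dz_dy = depth[:, :, 1:, :] - depth[:, :, :-1, :]  # vertical gradient
    dz_dx = dz_dx[:, :, :-1, :]                       # crop both to (H-1, W-1)
    dz_dy = dz_dy[:, :, :, :-1]
    n = torch.cat([-dz_dx, -dz_dy, torch.ones_like(dz_dx)], dim=1)
    return F.normalize(n, dim=1)

def consistency_loss(pred_depth, pred_normals):
    """Disagreement (1 - cosine similarity) between depth-derived normals
    and the normals predicted directly by a second task network."""
    n_from_depth = normals_from_depth(pred_depth)
    n_direct = F.normalize(pred_normals[:, :, :-1, :-1], dim=1)  # match crop
    return (1.0 - (n_from_depth * n_direct).sum(dim=1)).mean()

if __name__ == "__main__":
    # Random stand-ins for the outputs of a depth network and a normal network.
    depth = torch.rand(2, 1, 32, 32, requires_grad=True)
    normals = torch.rand(2, 3, 32, 32, requires_grad=True)
    loss = consistency_loss(depth, normals)
    loss.backward()  # gradients flow into both task predictions
    print(f"cross-task consistency loss: {loss.item():.4f}")
```

Because the loss is differentiable with respect to both predictions, it back-propagates into both task networks, which is how cross-task consistency lets unlabeled data improve each task.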
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.