Don't miss the Mismatch: Investigating the Objective Function Mismatch for Unsupervised Representation Learning
- URL: http://arxiv.org/abs/2009.02383v2
- Date: Mon, 28 Feb 2022 20:15:42 GMT
- Title: Don't miss the Mismatch: Investigating the Objective Function Mismatch for Unsupervised Representation Learning
- Authors: Bonifaz Stuhr, Jürgen Brauer
- Abstract summary: This work builds upon the widely used linear evaluation protocol to define new general evaluation metrics.
We discuss the usability and stability of our protocols on a variety of pretext and target tasks and study mismatches in a wide range of experiments.
In our experiments, we find that the objective function mismatch reduces performance by 0.1-5.0% for Cifar10, Cifar100 and PCam in many setups, and up to 25-59% in extreme cases for the 3dshapes dataset.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Finding general evaluation metrics for unsupervised representation learning
techniques is a challenging open research question, which recently has become
more and more necessary due to the increasing interest in unsupervised methods.
Even though these methods promise beneficial representation characteristics,
most approaches currently suffer from the objective function mismatch. This
mismatch states that the performance on a desired target task can decrease when
the unsupervised pretext task is learned too long - especially when both tasks
are ill-posed. In this work, we build upon the widely used linear evaluation
protocol and define new general evaluation metrics to quantitatively capture
the objective function mismatch and the more generic metrics mismatch. We
discuss the usability and stability of our protocols on a variety of pretext
and target tasks and study mismatches in a wide range of experiments. Thereby
we disclose dependencies of the objective function mismatch across several
pretext and target tasks with respect to the pretext model's representation
size, target model complexity, pretext and target augmentations as well as
pretext and target task types. In our experiments, we find that the objective
function mismatch reduces performance by ~0.1-5.0% for Cifar10, Cifar100 and
PCam in many setups, and up to ~25-59% in extreme cases for the 3dshapes
dataset.
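The mismatch metrics build on the linear evaluation protocol: a linear classifier is trained on frozen representations saved at successive pretext checkpoints, and the objective function mismatch shows up as the drop from the best checkpoint's target accuracy to the final one. A minimal sketch of this idea on synthetic features (using scikit-learn's `LogisticRegression`; this is not the authors' code, and the exact metric definitions in the paper differ):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def linear_eval(features, labels, seed=0):
    """Linear evaluation protocol: fit a linear classifier on frozen
    features and return its held-out accuracy."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        features, labels, test_size=0.5, random_state=seed)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return clf.score(X_te, y_te)

def objective_function_mismatch(checkpoint_features, labels):
    """Accuracy drop from the best checkpoint to the final one.
    A positive value means pretext training continued past the point
    that was optimal for the target task."""
    accs = [linear_eval(f, labels) for f in checkpoint_features]
    return max(accs) - accs[-1], accs

# Toy demo with synthetic "checkpoints": the second checkpoint carries
# a clean label signal, the last one has degraded (noisier) features.
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=400)
signal = labels[:, None].astype(float)
checkpoints = [
    signal + rng.normal(scale=3.0, size=(400, 8)),  # early: noisy
    signal + rng.normal(scale=0.5, size=(400, 8)),  # peak: clean
    signal + rng.normal(scale=2.0, size=(400, 8)),  # late: degraded
]
mismatch, accs = objective_function_mismatch(checkpoints, labels)
```

Tracking `accs` over pretext training time is what reveals the mismatch: the curve peaks and then declines, and the metric quantifies that decline.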
Related papers
- Neural Network Training and Non-Differentiable Objective Functions [2.3351527694849574]
This thesis makes four main contributions toward bridging the gap between the non-differentiable objective and the training loss function.
These contributions make the training of neural networks more scalable.
arXiv Detail & Related papers (2023-05-03T10:28:23Z)
- Exposing and Addressing Cross-Task Inconsistency in Unified Vision-Language Models [80.23791222509644]
Inconsistent AI models are considered brittle and untrustworthy by human users.
We find that state-of-the-art vision-language models suffer from a surprisingly high degree of inconsistent behavior across tasks.
We propose a rank correlation-based auxiliary training objective, computed over large automatically created cross-task contrast sets.
arXiv Detail & Related papers (2023-03-28T16:57:12Z)
- ForkMerge: Mitigating Negative Transfer in Auxiliary-Task Learning [59.08197876733052]
Auxiliary-Task Learning (ATL) aims to improve the performance of the target task by leveraging the knowledge obtained from related tasks.
Sometimes, learning multiple tasks simultaneously results in lower accuracy than learning only the target task, a phenomenon known as negative transfer.
ForkMerge is a novel approach that periodically forks the model into multiple branches and automatically searches over varying task weights.
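The fork-and-search step can be illustrated with a toy loop. This is a sketch under assumed simplifications (one gradient step per fork, and the best branch is simply kept rather than merged as in the paper); all names are hypothetical:

```python
import numpy as np

def fork_merge_step(params, grads_target, grads_aux, candidate_weights,
                    val_loss, lr=0.1):
    """One ForkMerge-style update (a sketch, not the paper's code):
    fork the parameters into one branch per candidate auxiliary-task
    weight, take a gradient step in each branch, then keep the branch
    with the lowest target validation loss."""
    branches = [params - lr * (grads_target + w * grads_aux)
                for w in candidate_weights]
    losses = [val_loss(b) for b in branches]
    best = int(np.argmin(losses))
    return branches[best], candidate_weights[best]

# Toy quadratic target loss; the auxiliary gradient points the wrong
# way here, so the search should prefer a zero auxiliary weight.
target_opt = np.array([1.0, 1.0])
val_loss = lambda p: float(np.sum((p - target_opt) ** 2))
params = np.zeros(2)
g_t = 2 * (params - target_opt)   # target-task gradient at params
g_a = -2 * (params - target_opt)  # conflicting auxiliary gradient
new_params, w = fork_merge_step(params, g_t, g_a, [0.0, 0.5, 1.0], val_loss)
```

Because the weight is re-selected at every fork point against the target validation loss, a harmful auxiliary task is automatically down-weighted, which is how negative transfer is mitigated.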
arXiv Detail & Related papers (2023-01-30T02:27:02Z)
- AdAUC: End-to-end Adversarial AUC Optimization Against Long-tail Problems [102.95119281306893]
We present an early trial to explore adversarial training methods to optimize AUC.
We reformulate the AUC optimization problem as a saddle point problem, where the objective becomes an instance-wise function.
Our analysis differs from the existing studies since the algorithm is asked to generate adversarial examples by calculating the gradient of a min-max problem.
arXiv Detail & Related papers (2022-06-24T09:13:39Z)
- Generative multitask learning mitigates target-causing confounding [61.21582323566118]
We propose a simple and scalable approach to causal representation learning for multitask learning.
The improvement comes from mitigating unobserved confounders that cause the targets, but not the input.
Our results on the Attributes of People and Taskonomy datasets reflect the conceptual improvement in robustness to prior probability shift.
arXiv Detail & Related papers (2022-02-08T20:42:14Z)
- Conflict-Averse Gradient Descent for Multi-task Learning [56.379937772617]
A major challenge in optimizing a multi-task model is the conflicting gradients.
We introduce Conflict-Averse Gradient descent (CAGrad), which minimizes the average loss while reducing conflict among the task gradients.
CAGrad balances the objectives automatically and still provably converges to a minimum over the average loss.
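The conflict in question is a negative dot product between per-task gradients. CAGrad's exact update solves a small inner optimization to maximize the worst-case task improvement near the average gradient; as a simpler stand-in, the sketch below uses a PCGrad-style projection (project a gradient onto the normal plane of any gradient it conflicts with) to show what resolving a conflict looks like:

```python
import numpy as np

def pcgrad_combine(grads):
    """Combine per-task gradients while removing pairwise conflicts.
    If g_i conflicts with g_j (negative dot product), project g_i onto
    the normal plane of g_j before averaging. This is a PCGrad-style
    heuristic, not CAGrad's optimization-based update."""
    projected = []
    for i, g in enumerate(grads):
        g = g.astype(float).copy()
        for j, h in enumerate(grads):
            if i != j and g @ h < 0:
                g = g - (g @ h) / (h @ h) * h
        projected.append(g)
    return np.mean(projected, axis=0)

# Two conflicting task gradients: naive averaging partly cancels them,
# while the projected combination still ascends both objectives.
g1 = np.array([1.0, 1.0])
g2 = np.array([1.0, -1.5])
naive = (g1 + g2) / 2
resolved = pcgrad_combine([g1, g2])
```

The resolved direction has a positive inner product with both task gradients, i.e. a single step makes progress on both tasks, which is the behavior the average-loss-only update cannot guarantee.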
arXiv Detail & Related papers (2021-10-26T22:03:51Z)
- CACTUS: Detecting and Resolving Conflicts in Objective Functions [16.784454432715712]
In multi-objective optimization, conflicting objectives and constraints are a major area of concern.
In this paper, we extend this line of work by prototyping a technique to visualize objective functions in multi-objective optimization.
We show that our technique helps users interactively specify meaningful objective functions by resolving potential conflicts for a classification task.
arXiv Detail & Related papers (2021-03-13T22:38:47Z)
- Lifelong Learning Without a Task Oracle [13.331659934508764]
Supervised deep neural networks are known to undergo a sharp decline in the accuracy of older tasks when new tasks are learned.
We propose and compare several candidate task-assigning mappers which require very little memory overhead.
Best-performing variants only impose an average cost of 1.7% parameter memory increase.
arXiv Detail & Related papers (2020-11-09T21:30:31Z)
- Multi-task Supervised Learning via Cross-learning [102.64082402388192]
We consider a problem known as multi-task learning, consisting of fitting a set of regression functions intended for solving different tasks.
In our novel formulation, we couple the parameters of these functions, so that they learn in their task specific domains while staying close to each other.
This facilitates cross-fertilization, in which data collected across different domains helps improve the learning performance on each task.
arXiv Detail & Related papers (2020-10-24T21:35:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.