Beyond Transfer Learning: Co-finetuning for Action Localisation
- URL: http://arxiv.org/abs/2207.03807v1
- Date: Fri, 8 Jul 2022 10:25:47 GMT
- Title: Beyond Transfer Learning: Co-finetuning for Action Localisation
- Authors: Anurag Arnab, Xuehan Xiong, Alexey Gritsenko, Rob Romijnders, Josip
Djolonga, Mostafa Dehghani, Chen Sun, Mario Lučić, Cordelia Schmid
- Abstract summary: We propose co-finetuning -- simultaneously training a single model on multiple ``upstream'' and ``downstream'' tasks.
We demonstrate that co-finetuning outperforms traditional transfer learning when using the same total amount of data.
We also show how we can easily extend our approach to multiple ``upstream'' datasets to further improve performance.
- Score: 64.07196901012153
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transfer learning is the predominant paradigm for training deep networks on
small target datasets. Models are typically pretrained on large ``upstream''
datasets for classification, as such labels are easy to collect, and then
finetuned on ``downstream'' tasks such as action localisation, which are
smaller due to their finer-grained annotations. In this paper, we question this
approach, and propose co-finetuning -- simultaneously training a single model
on multiple ``upstream'' and ``downstream'' tasks. We demonstrate that
co-finetuning outperforms traditional transfer learning when using the same
total amount of data, and also show how we can easily extend our approach to
multiple ``upstream'' datasets to further improve performance. In particular,
co-finetuning significantly improves the performance on rare classes in our
downstream task, as it has a regularising effect, and enables the network to
learn feature representations that transfer between different datasets.
Finally, we show that by co-finetuning with public video classification
datasets, we achieve state-of-the-art results for spatio-temporal action
localisation on the challenging AVA and AVA-Kinetics datasets, outperforming
recent works that develop intricate models.
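The core recipe is simple: keep one shared model, attach a classification head per dataset, and draw each training batch from one of the upstream or downstream datasets according to a mixing ratio. Below is a minimal PyTorch sketch of that loop; the per-task linear heads, the sampling weights, and the optimiser settings are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal co-finetuning sketch (PyTorch): one shared backbone, one head per
# task, and each step's batch sampled from one dataset according to mixing
# weights. All names and hyperparameters here are illustrative assumptions.
import torch
import torch.nn as nn

class CoFinetuneModel(nn.Module):
    def __init__(self, backbone, feat_dim, num_classes_per_task):
        super().__init__()
        self.backbone = backbone  # shared video encoder
        self.heads = nn.ModuleDict({
            task: nn.Linear(feat_dim, n)
            for task, n in num_classes_per_task.items()
        })

    def forward(self, x, task):
        return self.heads[task](self.backbone(x))

def co_finetune(model, loaders, weights, steps, lr=1e-4):
    """loaders: dict task -> DataLoader; weights: dict task -> sampling weight."""
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    iters = {t: iter(dl) for t, dl in loaders.items()}
    tasks = list(loaders)
    probs = torch.tensor([float(weights[t]) for t in tasks])
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        # Pick which dataset this step's batch comes from (mixed schedule).
        task = tasks[torch.multinomial(probs, 1).item()]
        try:
            x, y = next(iters[task])
        except StopIteration:  # restart an exhausted loader
            iters[task] = iter(loaders[task])
            x, y = next(iters[task])
        loss = loss_fn(model(x, task), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
```

Whether to sample tasks uniformly or in proportion to dataset size is a design choice; either way, the shared backbone receives gradients from every dataset, which is the source of the regularising effect the abstract describes.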
Related papers
- Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data.
One key challenge in federated learning is to handle non-identically distributed data across the clients.
We propose a novel federated learning framework with projected trajectory regularization (FedPTR) to tackle this data heterogeneity issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z) - Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training [44.790636524264]
Point Prompt Training is a novel framework for multi-dataset synergistic learning in the context of 3D representation learning.
It can overcome the negative transfer associated with synergistic learning and produce generalizable representations.
It achieves state-of-the-art performance on each dataset using a single weight-shared model with supervised multi-dataset training.
arXiv Detail & Related papers (2023-08-18T17:59:57Z) - ALP: Action-Aware Embodied Learning for Perception [60.64801970249279]
We introduce Action-Aware Embodied Learning for Perception (ALP).
ALP incorporates action information into representation learning through a combination of optimizing a reinforcement learning policy and an inverse dynamics prediction objective.
We show that ALP outperforms existing baselines in several downstream perception tasks.
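The inverse dynamics objective mentioned above is a standard self-supervised signal: embed two consecutive observations and predict the action taken between them. A minimal sketch follows; the encoder, head width, and discrete action count are illustrative assumptions, not ALP's exact architecture.

```python
# Inverse-dynamics objective sketch (PyTorch): predict action a_t from the
# embeddings of consecutive observations (o_t, o_{t+1}). Names and shapes
# are illustrative assumptions, not the paper's exact design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class InverseDynamicsHead(nn.Module):
    def __init__(self, encoder, feat_dim, num_actions):
        super().__init__()
        self.encoder = encoder  # shared visual encoder being trained
        self.head = nn.Sequential(
            nn.Linear(2 * feat_dim, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, num_actions),
        )

    def loss(self, obs_t, obs_tp1, actions):
        # Concatenate embeddings of both frames, classify the action taken.
        z = torch.cat([self.encoder(obs_t), self.encoder(obs_tp1)], dim=-1)
        return F.cross_entropy(self.head(z), actions)
```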
arXiv Detail & Related papers (2023-06-16T21:51:04Z) - AD-PT: Autonomous Driving Pre-Training with Large-scale Point Cloud
Dataset [25.935496432142976]
It is a long-term vision of the Autonomous Driving (AD) community that perception models can learn from large-scale point cloud datasets.
We formulate the point-cloud pre-training task as a semi-supervised problem, which leverages the few-shot labeled and massive unlabeled point-cloud data.
We achieve significant performance gains on a series of downstream perception benchmarks, including nuScenes and KITTI, under different baseline models.
arXiv Detail & Related papers (2023-06-01T12:32:52Z) - Continual Learning with Optimal Transport based Mixture Model [17.398605698033656]
We propose an online mixture model learning approach based on well-established properties of optimal transport theory (OT-MM).
Our proposed method can significantly outperform the current state-of-the-art baselines.
arXiv Detail & Related papers (2022-11-30T06:40:29Z) - CHALLENGER: Training with Attribution Maps [63.736435657236505]
We show that utilizing attribution maps for training neural networks can improve regularization of models and thus increase performance.
In particular, we show that our generic domain-independent approach yields state-of-the-art results in vision, natural language processing and on time series tasks.
arXiv Detail & Related papers (2022-05-30T13:34:46Z) - How Well Do Sparse Imagenet Models Transfer? [75.98123173154605]
Transfer learning is a classic paradigm by which models pretrained on large "upstream" datasets are adapted to yield good results on "downstream" datasets.
In this work, we perform an in-depth investigation of this phenomenon in the context of convolutional neural networks (CNNs) trained on the ImageNet dataset.
We show that sparse models can match or even outperform the transfer performance of dense models, even at high sparsities.
arXiv Detail & Related papers (2021-11-26T11:58:51Z) - Improving Semantic Segmentation via Self-Training [75.07114899941095]
We show that we can obtain state-of-the-art results using a semi-supervised approach, specifically a self-training paradigm.
We first train a teacher model on labeled data, and then generate pseudo labels on a large set of unlabeled data.
Our robust training framework jointly digests human-annotated and pseudo labels and achieves top performance on the Cityscapes, CamVid, and KITTI datasets.
arXiv Detail & Related papers (2020-04-30T17:09:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.