Related papers: Semi-Supervised Learning with Multi-Head Co-Training

Semi-Supervised Learning with Multi-Head Co-Training

URL: http://arxiv.org/abs/2107.04795v1
Date: Sat, 10 Jul 2021 08:53:14 GMT
Title: Semi-Supervised Learning with Multi-Head Co-Training
Authors: Mingcai Chen, Yuntao Du, Yi Zhang, Shuwei Qian, Chongjun Wang
Abstract summary: We present a simple and efficient co-training algorithm, named Multi-Head Co-Training, for semi-supervised image classification. Every classification head in the unified model interacts with its peers through a "Weak and Strong Augmentation" strategy. The effectiveness of Multi-Head Co-Training is demonstrated in an empirical study on standard semi-supervised learning benchmarks.
Score: 6.675682080298253
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Co-training, extended from self-training, is one of the frameworks for semi-supervised learning. It works at the cost of training extra classifiers, where the algorithm should be delicately designed to prevent individual classifiers from collapsing into each other. In this paper, we present a simple and efficient co-training algorithm, named Multi-Head Co-Training, for semi-supervised image classification. By integrating base learners into a multi-head structure, the model is in a minimal amount of extra parameters. Every classification head in the unified model interacts with its peers through a "Weak and Strong Augmentation" strategy, achieving single-view co-training without promoting diversity explicitly. The effectiveness of Multi-Head Co-Training is demonstrated in an empirical study on standard semi-supervised learning benchmarks.

Related papers

Graph Cut-guided Maximal Coding Rate Reduction for Learning Image Embedding and Clustering [2.4503870408262354]
We propose a unified framework, termed graph Cut-guided Maximal Coding Rate Reduction (CgMCR), for jointly learning the structured embeddings and the clustering. We conduct extensive experiments on both standard and out-of-domain image datasets and experimental results validate the effectiveness of our approach.
arXiv Detail & Related papers (2024-12-25T15:20:54Z)
Unified Multi-modal Unsupervised Representation Learning for Skeleton-based Action Understanding [62.70450216120704]
Unsupervised pre-training has shown great success in skeleton-based action understanding. We propose a Unified Multimodal Unsupervised Representation Learning framework, called UmURL. UmURL exploits an efficient early-fusion strategy to jointly encode the multi-modal features in a single-stream manner.
arXiv Detail & Related papers (2023-11-06T13:56:57Z)
FedCBO: Reaching Group Consensus in Clustered Federated Learning through Consensus-based Optimization [1.911678487931003]
Federated learning seeks to integrate the training learning models from multiple users, each user having their own data set, in a way that is sensitive to data privacy and to communication loss constraints. In this paper, we propose a novel solution to a global, clustered problem of federated learning that is inspired by ideas in consensus-based optimization (CBO) Our new CBO-type method is based on a system of interacting particles that is oblivious to group.
arXiv Detail & Related papers (2023-05-04T15:02:09Z)
FedIN: Federated Intermediate Layers Learning for Model Heterogeneity [7.781409257429762]
Federated learning (FL) facilitates edge devices to cooperatively train a global shared model while maintaining the training data locally and privately. In this study, we propose an FL method called Federated Intermediate Layers Learning (FedIN), supporting heterogeneous models without relying on any public dataset. Experiment results demonstrate the superior performance of FedIN in heterogeneous model environments compared to state-of-the-art algorithms.
arXiv Detail & Related papers (2023-04-03T07:20:43Z)
Personalizing Federated Learning with Over-the-Air Computations [84.8089761800994]
Federated edge learning is a promising technology to deploy intelligence at the edge of wireless networks in a privacy-preserving manner. Under such a setting, multiple clients collaboratively train a global generic model under the coordination of an edge server. This paper presents a distributed training paradigm that employs analog over-the-air computation to address the communication bottleneck.
arXiv Detail & Related papers (2023-02-24T08:41:19Z)
Multi-head Ensemble of Smoothed Classifiers for Certified Robustness [30.143629319940427]
Randomized Smoothing (RS) is a promising technique for certified robustness. Recent in RS the ensemble of multiple Deep Neural Networks (DNNs) has shown state-of-the-art performances. We consider a novel ensemble-based training way for a single DNN with multiple augmented heads, named as SmOothed Multi-head Ensemble (SOME)
arXiv Detail & Related papers (2022-11-20T06:31:53Z)
Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information [77.80071279597665]
We propose an all-in-one single-stage pre-training approach, named Maximizing Multi-modal Mutual Information Pre-training (M3I Pre-training) Our approach achieves better performance than previous pre-training methods on various vision benchmarks, including ImageNet classification, object detection, LVIS long-tailed object detection, and ADE20k semantic segmentation.
arXiv Detail & Related papers (2022-11-17T18:59:49Z)
Effective Adaptation in Multi-Task Co-Training for Unified Autonomous Driving [103.745551954983]
In this paper, we investigate the transfer performance of various types of self-supervised methods, including MoCo and SimCLR, on three downstream tasks. We find that their performances are sub-optimal or even lag far behind the single-task baseline. We propose a simple yet effective pretrain-adapt-finetune paradigm for general multi-task training.
arXiv Detail & Related papers (2022-09-19T12:15:31Z)
i-Code: An Integrative and Composable Multimodal Learning Framework [99.56065789066027]
i-Code is a self-supervised pretraining framework where users may flexibly combine the modalities of vision, speech, and language into unified and general-purpose vector representations. The entire system is pretrained end-to-end with new objectives including masked modality unit modeling and cross-modality contrastive learning. Experimental results demonstrate how i-Code can outperform state-of-the-art techniques on five video understanding tasks and the GLUE NLP benchmark, improving by as much as 11%.
arXiv Detail & Related papers (2022-05-03T23:38:50Z)
Efficient Diversity-Driven Ensemble for Deep Neural Networks [28.070540722925152]
We propose Efficient Diversity-Driven Ensemble (EDDE) to address both the diversity and the efficiency of an ensemble. Compared with other well-known ensemble methods, EDDE can get highest ensemble accuracy with the lowest training cost. We evaluate EDDE on Computer Vision (CV) and Natural Language Processing (NLP) tasks.
arXiv Detail & Related papers (2021-12-26T04:28:47Z)
Dual Path Structural Contrastive Embeddings for Learning Novel Objects [6.979491536753043]
Recent research shows that gaining information on a good feature space can be an effective solution to achieve favorable performance on few-shot tasks. We propose a simple but effective paradigm that decouples the tasks of learning feature representations and classifiers. Our method can still achieve promising results for both standard and generalized few-shot problems in either an inductive or transductive inference setting.
arXiv Detail & Related papers (2021-12-23T04:43:31Z)
Curriculum Meta-Learning for Few-shot Classification [1.5039745292757671]
We propose an adaptation of the curriculum training framework, applicable to state-of-the-art meta learning techniques for few-shot classification. Our experiments with the MAML algorithm on two few-shot image classification tasks show significant gains with the curriculum training framework.
arXiv Detail & Related papers (2021-12-06T10:29:23Z)
Learning Diverse Representations for Fast Adaptation to Distribution Shift [78.83747601814669]
We present a method for learning multiple models, incorporating an objective that pressures each to learn a distinct way to solve the task. We demonstrate our framework's ability to facilitate rapid adaptation to distribution shift.
arXiv Detail & Related papers (2020-06-12T12:23:50Z)

This list is automatically generated from the titles and abstracts of the papers in this site.