Related papers: Model Composition: Can Multiple Neural Networks Be Combined into a Single Network Using Only Unlabeled Data?

URL: http://arxiv.org/abs/2110.10369v1
Date: Wed, 20 Oct 2021 04:17:25 GMT
Title: Model Composition: Can Multiple Neural Networks Be Combined into a Single Network Using Only Unlabeled Data?
Authors: Amin Banitalebi-Dehkordi, Xinyu Kang, and Yong Zhang
Abstract summary: This paper investigates the idea of combining multiple trained neural networks using unlabeled data. To this end, the proposed method makes use of generation, filtering, and aggregation of reliable pseudo-labels collected from unlabeled data. Our method supports using an arbitrary number of input models with arbitrary architectures and categories.
Score: 6.0945220518329855
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The diversity of deep learning applications, datasets, and neural network architectures necessitates a careful selection of the architecture and data that best match a target application. As an attempt to mitigate this dilemma, this paper investigates the idea of combining multiple trained neural networks using unlabeled data. In addition, combining multiple models into one can speed up inference, result in stronger, more capable models, and allow us to select efficient device-friendly target network architectures. To this end, the proposed method makes use of generation, filtering, and aggregation of reliable pseudo-labels collected from unlabeled data. Our method supports using an arbitrary number of input models with arbitrary architectures and categories. Extensive performance evaluations demonstrated that our method is very effective. For example, for the task of object detection and without using any ground-truth labels, an EfficientDet-D0 trained on Pascal-VOC and an EfficientDet-D1 trained on COCO can be combined into a RetinaNet-ResNet50 model with an mAP similar to that of supervised training. If fine-tuned in a semi-supervised setting, the combined model achieves +18.6%, +12.6%, and +8.1% mAP improvements over supervised training with 1%, 5%, and 10% of labels.
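The abstract names three steps (generation, filtering, and aggregation of pseudo-labels) without giving details. The following is a minimal Python sketch of what such a pipeline could look like for object detection, not the authors' actual procedure. It assumes each input model has already produced detections on an unlabeled image, filters them with a fixed confidence threshold, and aggregates them by dropping same-class boxes that overlap a more confident one. The names `aggregate_pseudo_labels`, `min_score`, and `iou_thresh`, as well as the threshold values, are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Detection = Tuple[str, float, Box]       # (class name, confidence, box)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def aggregate_pseudo_labels(
    per_model: List[List[Detection]],
    min_score: float = 0.5,   # assumed confidence threshold
    iou_thresh: float = 0.5,  # assumed overlap threshold for duplicates
) -> List[Detection]:
    """Merge detections from several models into one pseudo-label set."""
    # Filtering: keep only detections the source model is confident about.
    kept = [d for dets in per_model for d in dets if d[1] >= min_score]
    # Aggregation: visit detections from most to least confident and drop
    # any same-class box that overlaps one that was already accepted.
    kept.sort(key=lambda d: d[1], reverse=True)
    merged: List[Detection] = []
    for det in kept:
        duplicate = any(
            det[0] == m[0] and iou(det[2], m[2]) >= iou_thresh for m in merged
        )
        if not duplicate:
            merged.append(det)
    return merged


# Generation is simulated here with hand-written outputs of two models
# that were trained on different category sets.
voc_model = [("dog", 0.9, (10, 10, 50, 50)), ("cat", 0.3, (60, 60, 90, 90))]
coco_model = [("dog", 0.8, (12, 11, 49, 52)), ("zebra", 0.7, (100, 100, 150, 160))]
labels = aggregate_pseudo_labels([voc_model, coco_model])
print([name for name, _, _ in labels])
```

In this toy input the low-confidence "cat" box is filtered out and the two overlapping "dog" boxes collapse into one, so the merged pseudo-labels cover categories from both models. A student network such as the RetinaNet-ResNet50 mentioned in the abstract would then be trained on labels of this kind.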

Related papers

Automated Label Unification for Multi-Dataset Semantic Segmentation with GNNs [48.406728896785296]
We propose a novel approach to automatically construct a unified label space across multiple datasets using graph neural networks. Unlike existing methods, our approach facilitates seamless training without the need for additional manual reannotation or taxonomy reconciliation.
arXiv Detail & Related papers (2024-07-15T08:42:10Z)
Dataset Quantization [72.61936019738076]
We present dataset quantization (DQ), a new framework to compress large-scale datasets into small subsets. DQ is the first method that can successfully distill large-scale datasets such as ImageNet-1k with a state-of-the-art compression ratio.
arXiv Detail & Related papers (2023-08-21T07:24:29Z)
Prompt Tuning for Parameter-efficient Medical Image Segmentation [79.09285179181225]
We propose and investigate several contributions to achieve a parameter-efficient but effective adaptation for semantic segmentation on two medical imaging datasets. We pre-train this architecture with a dedicated dense self-supervision scheme based on assignments to online generated prototypes. We demonstrate that the resulting neural network model is able to attenuate the gap between fully fine-tuned and parameter-efficiently adapted models.
arXiv Detail & Related papers (2022-11-16T21:55:05Z)
Infinite Recommendation Networks: A Data-Centric Approach [8.044430277912936]
We leverage the Neural Tangent Kernel to train infinitely-wide neural networks to devise $\infty$-AE: an autoencoder with infinitely-wide bottleneck layers. We also develop Distill-CF for synthesizing tiny, high-fidelity data summaries. We observe 96-105% of $\infty$-AE's performance on the full dataset with as little as 0.1% of the original dataset size.
arXiv Detail & Related papers (2022-06-03T00:34:13Z)
Multi network InfoMax: A pre-training method involving graph convolutional networks [0.0]
This paper presents a pre-training method involving graph convolutional/neural networks (GCNs/GNNs). The learned high-level graph latent representations help increase performance for downstream graph classification tasks. We apply our method to a neuroimaging dataset for classifying subjects into healthy control (HC) and schizophrenia (SZ) groups.
arXiv Detail & Related papers (2021-11-01T21:53:20Z)
Single-stream CNN with Learnable Architecture for Multi-source Remote Sensing Data [16.810239678639288]
We propose an efficient framework based on deep convolutional neural network (CNN) for multi-source remote sensing data joint classification. The proposed method can theoretically adjust any modern CNN models to any multi-source remote sensing data set. Experimental results demonstrate the effectiveness of the proposed single-stream CNNs.
arXiv Detail & Related papers (2021-09-13T16:10:41Z)
No Fear of Heterogeneity: Classifier Calibration for Federated Learning with Non-IID Data [78.69828864672978]
A central challenge in training classification models in a real-world federated system is learning with non-IID data. We propose a novel and simple algorithm called Classifier Calibration with Virtual Representations (CCVR), which adjusts the classifier using virtual representations sampled from an approximated Gaussian mixture model. Experimental results demonstrate that CCVR achieves state-of-the-art performance on popular federated learning benchmarks including CIFAR-10, CIFAR-100, and CINIC-10.
arXiv Detail & Related papers (2021-06-09T12:02:29Z)
Task-Adaptive Neural Network Retrieval with Meta-Contrastive Learning [34.27089256930098]
We propose a novel neural network retrieval method, which retrieves the optimal pre-trained network for a given task. We train this framework by meta-learning a cross-modal latent space with contrastive loss, to maximize the similarity between a dataset and a network. We validate the efficacy of our method on ten real-world datasets, against existing NAS baselines.
arXiv Detail & Related papers (2021-03-02T06:30:51Z)
Solving Mixed Integer Programs Using Neural Networks [57.683491412480635]
This paper applies learning to the two key sub-tasks of a MIP solver, generating a high-quality joint variable assignment, and bounding the gap in objective value between that assignment and an optimal one. Our approach constructs two corresponding neural network-based components, Neural Diving and Neural Branching, to use in a base MIP solver such as SCIP. We evaluate our approach on six diverse real-world datasets, including two Google production datasets and MIPLIB, by training separate neural networks on each.
arXiv Detail & Related papers (2020-12-23T09:33:11Z)
Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks. We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.