Efficient Multi-Model Fusion with Adversarial Complementary Representation Learning
- URL: http://arxiv.org/abs/2404.15704v1
- Date: Wed, 24 Apr 2024 07:47:55 GMT
- Title: Efficient Multi-Model Fusion with Adversarial Complementary Representation Learning
- Authors: Zuheng Kang, Yayun He, Jianzong Wang, Junqing Peng, Jing Xiao
- Abstract summary: Single-model systems often suffer from deficiencies in tasks such as speaker verification (SV) and image classification.
We propose an adversarial complementary representation learning (ACoRL) framework that enables newly trained models to avoid previously acquired knowledge.
- Score: 26.393644289860084
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Single-model systems often suffer from deficiencies in tasks such as speaker verification (SV) and image classification, relying heavily on partial prior knowledge during decision-making, which results in suboptimal performance. Although multi-model fusion (MMF) can mitigate some of these issues, redundancy in learned representations may limit improvements. To this end, we propose an adversarial complementary representation learning (ACoRL) framework that enables newly trained models to avoid previously acquired knowledge, allowing each individual component model to learn maximally distinct, complementary representations. We offer three detailed explanations of why this works, and experimental results demonstrate that our method improves performance more efficiently than traditional MMF. Furthermore, attribution analysis validates that the model trained under ACoRL acquires more complementary knowledge, highlighting the efficacy of our approach in enhancing efficiency and robustness across tasks.
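The redundancy point in the abstract can be illustrated with a toy score-level fusion. This is only a minimal sketch of plain probability averaging, not the ACoRL method. The function names `fuse` and `predict` and all probability values are hypothetical and chosen purely for illustration.

```python
def fuse(prob_lists):
    """Score-level fusion: average the per-class probabilities of several models."""
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    return [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]


def predict(probs):
    """Return the index of the highest-scoring class."""
    return max(range(len(probs)), key=lambda c: probs[c])


# Hypothetical outputs for one sample whose true class is 1.
model_a = [0.6, 0.4]       # leans toward the wrong class
model_a_copy = [0.6, 0.4]  # redundant model: same representation, same mistake
model_b = [0.1, 0.9]       # complementary model: confident where model_a is not

redundant_fusion = fuse([model_a, model_a_copy])
complementary_fusion = fuse([model_a, model_b])

print(predict(redundant_fusion))      # fusing redundant models repeats the error
print(predict(complementary_fusion))  # fusing complementary models corrects it
```

In this toy case, averaging two models that make the same mistake changes nothing, while averaging with a model that errs differently flips the decision to the correct class. This is the intuition behind preferring complementary component models in fusion.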
Related papers
- On Discriminative Probabilistic Modeling for Self-Supervised Representation Learning [85.75164588939185]
We study the discriminative probabilistic modeling problem on a continuous domain for (multimodal) self-supervised representation learning.
We conduct generalization error analysis to reveal the limitation of current InfoNCE-based contrastive loss for self-supervised representation learning.
arXiv Detail & Related papers (2024-10-11T18:02:46Z) - Using Part-based Representations for Explainable Deep Reinforcement Learning [30.566205347443113]
We propose a non-negative training approach for actor models in Deep Reinforcement Learning.
We demonstrate the effectiveness of the proposed approach using the well-known Cartpole benchmark.
arXiv Detail & Related papers (2024-08-21T09:21:59Z) - FactorLLM: Factorizing Knowledge via Mixture of Experts for Large Language Models [50.331708897857574]
We introduce FactorLLM, a novel approach that decomposes well-trained dense FFNs into sparse sub-networks without requiring any further modifications.
FactorLLM achieves performance comparable to the source model, securing up to 85% of model performance while obtaining over a 30% increase in inference speed.
arXiv Detail & Related papers (2024-08-15T16:45:16Z) - Enhancing Fairness and Performance in Machine Learning Models: A Multi-Task Learning Approach with Monte-Carlo Dropout and Pareto Optimality [1.5498930424110338]
This study introduces an approach to mitigate bias in machine learning by leveraging model uncertainty.
Our approach utilizes a multi-task learning (MTL) framework combined with Monte Carlo (MC) Dropout to assess and mitigate uncertainty in predictions related to protected labels.
arXiv Detail & Related papers (2024-04-12T04:17:50Z) - Revealing Multimodal Contrastive Representation Learning through Latent Partial Causal Models [85.67870425656368]
We introduce a unified causal model specifically designed for multimodal data.
We show that multimodal contrastive representation learning excels at identifying latent coupled variables.
Experiments demonstrate the robustness of our findings, even when the assumptions are violated.
arXiv Detail & Related papers (2024-02-09T07:18:06Z) - A Probabilistic Model Behind Self-Supervised Learning [53.64989127914936]
In self-supervised learning (SSL), representations are learned via an auxiliary task without annotated labels.
We present a generative latent variable model for self-supervised learning.
We show that several families of discriminative SSL, including contrastive methods, induce a comparable distribution over representations.
arXiv Detail & Related papers (2024-02-02T13:31:17Z) - Improving the Modality Representation with Multi-View Contrastive Learning for Multimodal Sentiment Analysis [15.623293264871181]
This study investigates approaches to improving modality representation with contrastive learning.
We devise a three-stage framework with multi-view contrastive learning to refine representations for the specific objectives.
We conduct experiments on three open datasets, and the results show the advantage of our model.
arXiv Detail & Related papers (2022-10-28T01:25:16Z) - Model Uncertainty-Aware Knowledge Amalgamation for Pre-Trained Language Models [37.88287077119201]
We propose a novel model reuse paradigm, Knowledge Amalgamation (KA), for PLMs.
Without human annotations available, KA aims to merge the knowledge from different teacher-PLMs, each of which specializes in a different classification problem, into a versatile student model.
Experimental results demonstrate that MUKA achieves substantial improvements over baselines on benchmark datasets.
arXiv Detail & Related papers (2021-12-14T12:26:24Z) - Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z) - Task-Feature Collaborative Learning with Application to Personalized Attribute Prediction [166.87111665908333]
We propose a novel multi-task learning method called Task-Feature Collaborative Learning (TFCL).
Specifically, we first propose a base model with a heterogeneous block-diagonal structure regularizer to leverage the collaborative grouping of features and tasks.
As a practical extension, we extend the base model by allowing overlapping features and differentiating the hard tasks.
arXiv Detail & Related papers (2020-04-29T02:32:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.