A Semi-supervised Generative Model for Incomplete Multi-view Data Integration with Missing Labels
- URL: http://arxiv.org/abs/2508.11180v1
- Date: Fri, 15 Aug 2025 03:10:18 GMT
- Title: A Semi-supervised Generative Model for Incomplete Multi-view Data Integration with Missing Labels
- Authors: Yiyang Shen, Weiran Wang
- Abstract summary: We propose a semi-supervised generative model that utilizes both labeled and unlabeled samples in a unified framework. Compared to existing approaches, our model achieves better predictive and imputation performance on both image and multi-omics data with missing views and limited labeled samples.
- Score: 12.79532395630597
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-view learning is widely applied to real-life datasets, such as multiple omics biological data, but it often suffers from both missing views and missing labels. Prior probabilistic approaches addressed the missing view problem by using a product-of-experts scheme to aggregate representations from present views and achieved superior performance over deterministic classifiers, using the information bottleneck (IB) principle. However, the IB framework is inherently fully supervised and cannot leverage unlabeled data. In this work, we propose a semi-supervised generative model that utilizes both labeled and unlabeled samples in a unified framework. Our method maximizes the likelihood of unlabeled samples to learn a latent space shared with the IB on labeled data. We also perform cross-view mutual information maximization in the latent space to enhance the extraction of shared information across views. Compared to existing approaches, our model achieves better predictive and imputation performance on both image and multi-omics data with missing views and limited labeled samples.
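The product-of-experts aggregation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each view's encoder outputs a diagonal Gaussian, that a standard normal prior acts as one extra expert, and the names `poe_gaussian`, `mu`, `logvar`, and `present` are hypothetical. Missing views are given zero precision, so they drop out of the product.

```python
import numpy as np

def poe_gaussian(mu, logvar, present):
    """Fuse per-view diagonal Gaussians with a product of experts.

    mu, logvar: arrays of shape (n_views, batch, latent_dim)
    present:    boolean array of shape (n_views, batch); False marks a missing view
    Returns the mean and log-variance of the fused Gaussian, each (batch, latent_dim).
    """
    mask = np.asarray(present, dtype=bool)[..., None]      # (n_views, batch, 1)
    # Replace entries of missing views with neutral values so NaNs cannot leak through.
    mu = np.where(mask, np.asarray(mu, dtype=float), 0.0)
    logvar = np.where(mask, np.asarray(logvar, dtype=float), 0.0)

    # Precision (inverse variance) per view; missing views contribute zero.
    precision = np.exp(-logvar) * mask

    # Assumption: a prior expert N(0, I) with precision 1 and mean 0.
    total_precision = 1.0 + precision.sum(axis=0)
    fused_mu = (precision * mu).sum(axis=0) / total_precision
    fused_logvar = -np.log(total_precision)
    return fused_mu, fused_logvar

# Two views, one sample, one latent dimension; the second view is missing.
mu = np.array([[[2.0]], [[4.0]]])
logvar = np.zeros((2, 1, 1))
present = np.array([[True], [False]])
fused_mu, fused_logvar = poe_gaussian(mu, logvar, present)
print(fused_mu, np.exp(fused_logvar))
```

Because the product of Gaussians sums precisions, the fused posterior sharpens as more views are observed and falls back toward the prior when views are absent, which is why this scheme handles arbitrary missing-view patterns without a separate encoder per pattern.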
Related papers
- Virtual Category Learning: A Semi-Supervised Learning Method for Dense Prediction with Extremely Limited Labels [63.16824565919966]
This paper proposes to use confusing samples proactively without label correction.
A Virtual Category (VC) is assigned to each confusing sample in such a way that it can safely contribute to the model optimisation.
Our intriguing findings highlight the usage of VC learning in dense vision tasks.
arXiv Detail & Related papers (2023-12-02T16:23:52Z)
- Deep Incomplete Multi-view Clustering with Cross-view Partial Sample and Prototype Alignment [50.82982601256481]
We propose a Cross-view Partial Sample and Prototype Alignment Network (CPSPAN) for Deep Incomplete Multi-view Clustering.
Unlike existing contrastive-based methods, we adopt pair-observed data alignment as 'proxy supervised signals' to guide instance-to-instance correspondence construction.
arXiv Detail & Related papers (2023-03-28T02:31:57Z)
- A Distinct Unsupervised Reference Model From The Environment Helps Continual Learning [5.332329421663282]
Open-Set Semi-Supervised Continual Learning (OSSCL) is a more realistic semi-supervised continual learning setting.
We present a model with two distinct parts: (i) the reference network captures general-purpose and task-agnostic knowledge in the environment by using a broad spectrum of unlabeled samples, and (ii) the learner network is designed to learn task-specific representations by exploiting supervised samples.
arXiv Detail & Related papers (2023-01-11T15:05:36Z)
- Self-supervised Image Clustering from Multiple Incomplete Views via Constrastive Complementary Generation [5.314364096882052]
We propose Contrastive Incomplete Multi-View Image Clustering with Generative Adversarial Networks (CIMIC-GAN).
We incorporate autoencoding representation of complete and incomplete data into double contrastive learning to achieve learning consistency.
Experiments conducted on four extensively-used datasets show that CIMIC-GAN outperforms state-of-the-art incomplete multi-view clustering methods.
arXiv Detail & Related papers (2022-09-24T05:08:34Z)
- Improving Contrastive Learning on Imbalanced Seed Data via Open-World Sampling [96.8742582581744]
We present an open-world unlabeled data sampling framework called Model-Aware K-center (MAK).
MAK follows three simple principles: tailness, proximity, and diversity.
We demonstrate that MAK can consistently improve both the overall representation quality and the class balancedness of the learned features.
arXiv Detail & Related papers (2021-11-01T15:09:41Z)
- TSK Fuzzy System Towards Few Labeled Incomplete Multi-View Data Classification [24.01191516774655]
A transductive semi-supervised incomplete multi-view TSK fuzzy system modeling method (SSIMV_TSK) is proposed to address these challenges.
The proposed method integrates missing view imputation, pseudo label learning of unlabeled data, and fuzzy system modeling into a single process to yield a model with interpretable fuzzy rules.
Experimental results on real datasets show that the proposed method significantly outperforms the state-of-the-art methods.
arXiv Detail & Related papers (2021-10-08T11:41:06Z)
- Label-Assemble: Leveraging Multiple Datasets with Partial Labels [68.46767639240564]
"Label-Assemble" aims to unleash the full potential of partial labels from an assembly of public datasets.
We discovered that learning from negative examples facilitates both computer-aided disease diagnosis and detection.
arXiv Detail & Related papers (2021-09-25T02:48:17Z)
- Error-Robust Multi-View Clustering: Progress, Challenges and Opportunities [67.54503077766171]
Since label information is often expensive to acquire, multi-view clustering has gained growing interest.
Error-robust multi-view clustering approaches with explicit error removal formulation can be structured into five broad research categories.
This survey summarizes and reviews recent advances in error-robust clustering for multi-view data.
arXiv Detail & Related papers (2021-05-07T04:03:02Z)
- Uncorrelated Semi-paired Subspace Learning [7.20500993803316]
We propose a generalized uncorrelated multi-view subspace learning framework.
To showcase the flexibility of the framework, we instantiate five new semi-paired models for both unsupervised and semi-supervised learning.
Our proposed models perform competitively to or better than the baselines.
arXiv Detail & Related papers (2020-11-22T22:14:20Z)
- Semi-Automatic Data Annotation guided by Feature Space Projection [117.9296191012968]
We present a semi-automatic data annotation approach based on suitable feature space projection and semi-supervised label estimation.
We validate our method on the popular MNIST dataset and on images of human intestinal parasites with and without fecal impurities.
Our results demonstrate the added-value of visual analytics tools that combine complementary abilities of humans and machines for more effective machine learning.
arXiv Detail & Related papers (2020-07-27T17:03:50Z)
- Generative Partial Multi-View Clustering [133.36721417531734]
We propose a generative partial multi-view clustering model, named as GP-MVC, to address the incomplete multi-view problem.
First, multi-view encoder networks are trained to learn common low-dimensional representations, followed by a clustering layer to capture the consistent cluster structure across multiple views.
Second, view-specific generative adversarial networks are developed to generate the missing data of one view conditioning on the shared representation given by other views.
arXiv Detail & Related papers (2020-03-29T17:48:27Z)