Related papers: Adaptive Transfer Clustering: A Unified Framework

URL: http://arxiv.org/abs/2410.21263v3
Date: Fri, 15 Nov 2024 04:32:55 GMT
Title: Adaptive Transfer Clustering: A Unified Framework
Authors: Yuqi Gu, Zhongyuan Lyu, Kaizheng Wang
Abstract summary: We propose an adaptive transfer clustering (ATC) algorithm that automatically leverages the commonality in the presence of unknown discrepancy. It applies to a broad class of statistical models including Gaussian mixture models, stochastic block models, and latent class models.
Score: 2.3144964550307496
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We propose a general transfer learning framework for clustering given a main dataset and an auxiliary one about the same subjects. The two datasets may reflect similar but different latent grouping structures of the subjects. We propose an adaptive transfer clustering (ATC) algorithm that automatically leverages the commonality in the presence of unknown discrepancy, by optimizing an estimated bias-variance decomposition. It applies to a broad class of statistical models including Gaussian mixture models, stochastic block models, and latent class models. A theoretical analysis proves the optimality of ATC under the Gaussian mixture model and explicitly quantifies the benefit of transfer. Extensive simulations and real data experiments confirm our method's effectiveness in various scenarios.

Related papers

Cluster-Based Generalized Additive Models Informed by Random Fourier Features [19.409397281817288]
This work introduces a mixture of generalized additive models (GAMs) in which random Fourier feature (RFF) representations are leveraged to uncover locally adaptive structure in the data. Numerical experiments on real-world regression benchmarks, including the California Housing, NASA Air Self-Noise, and Bike Sharing datasets, demonstrate improved predictive performance.
arXiv Detail & Related papers (2025-12-22T13:15:52Z)
A Deterministic Information Bottleneck Method for Clustering Mixed-Type Data [0.0]
We present an information-theoretic method for clustering mixed-type data, that is, data consisting of both continuous and categorical variables. The proposed approach extends the Information Bottleneck principle to heterogeneous data through generalised product kernels. We demonstrate that the proposed method, named DIBmix, achieves superior performance compared to four established methods.
arXiv Detail & Related papers (2024-07-03T09:06:19Z)
Task Groupings Regularization: Data-Free Meta-Learning with Heterogeneous Pre-trained Models [83.02797560769285]
Data-Free Meta-Learning (DFML) aims to derive knowledge from a collection of pre-trained models without accessing their original data. Current methods often overlook the heterogeneity among pre-trained models, which leads to performance degradation due to task conflicts. We propose Task Groupings Regularization, a novel approach that benefits from model heterogeneity by grouping and aligning conflicting tasks.
arXiv Detail & Related papers (2024-05-26T13:11:55Z)
Self Supervised Correlation-based Permutations for Multi-View Clustering [7.093692674858257]
We propose an end-to-end deep learning-based multi-view clustering framework for general data types. Our approach involves generating meaningful fused representations using a novel permutation-based canonical correlation objective.
arXiv Detail & Related papers (2024-02-26T08:08:30Z)
Lp-Norm Constrained One-Class Classifier Combination [18.27510863075184]
We consider the one-class classification problem by modelling the sparsity/uniformity of the ensemble. We present an approach that solves the formulated convex constrained problem efficiently.
arXiv Detail & Related papers (2023-12-25T16:32:34Z)
Finite Mixtures of Multivariate Poisson-Log Normal Factor Analyzers for Clustering Count Data [0.8499685241219366]
A class of eight parsimonious mixture models based on the mixtures of factor analyzers model is introduced. The proposed models are explored in the context of clustering discrete data arising from RNA sequencing studies.
arXiv Detail & Related papers (2023-11-13T21:23:15Z)
Consistency Regularization for Generalizable Source-free Domain Adaptation [62.654883736925456]
Source-free domain adaptation (SFDA) aims to adapt a well-trained source model to an unlabelled target domain without accessing the source dataset. Existing SFDA methods only assess their adapted models on the target training set, neglecting the data from unseen but identically distributed testing sets. We propose a consistency regularization framework to develop a more generalizable SFDA method.
arXiv Detail & Related papers (2023-08-03T07:45:53Z)
Accuracy on the Line: On the Strong Correlation Between Out-of-Distribution and In-Distribution Generalization [89.73665256847858]
We show that out-of-distribution performance is strongly correlated with in-distribution performance for a wide range of models and distribution shifts. Specifically, we demonstrate strong correlations between in-distribution and out-of-distribution performance on variants of CIFAR-10 & ImageNet. We also investigate cases where the correlation is weaker, for instance some synthetic distribution shifts from CIFAR-10-C and the tissue classification dataset Camelyon17-WILDS.
arXiv Detail & Related papers (2021-07-09T19:48:23Z)
Deep Conditional Gaussian Mixture Model for Constrained Clustering [7.070883800886882]
Constrained clustering can leverage prior information on a growing amount of only partially labeled data. We propose a novel framework for constrained clustering that is intuitive, interpretable, and can be trained efficiently in the framework of stochastic gradient variational inference.
arXiv Detail & Related papers (2021-06-11T13:38:09Z)
Robust Finite Mixture Regression for Heterogeneous Targets [70.19798470463378]
We propose an FMR model that finds sample clusters and jointly models multiple incomplete mixed-type targets simultaneously. We provide non-asymptotic oracle performance bounds for our model under a high-dimensional learning framework. The results show that our model can achieve state-of-the-art performance.
arXiv Detail & Related papers (2020-10-12T03:27:07Z)
Control as Hybrid Inference [62.997667081978825]
We present an implementation of CHI which naturally mediates the balance between iterative and amortised inference. We verify the scalability of our algorithm on a continuous control benchmark, demonstrating that it outperforms strong model-free and model-based baselines.
arXiv Detail & Related papers (2020-07-11T19:44:09Z)
Semi-nonparametric Latent Class Choice Model with a Flexible Class Membership Component: A Mixture Model Approach [6.509758931804479]
The proposed model formulates the latent classes using mixture models as an alternative approach to the traditional random utility specification. Results show that mixture models improve the overall performance of latent class choice models.
arXiv Detail & Related papers (2020-07-06T13:19:26Z)
Ensemble Model with Batch Spectral Regularization and Data Blending for Cross-Domain Few-Shot Learning with Unlabeled Data [75.94147344921355]
We build a multi-branch ensemble framework by using diverse feature transformation matrices. We propose a data blending method to exploit the unlabeled data and augment the sparse support set in the target domain.
arXiv Detail & Related papers (2020-06-08T02:27:34Z)

This list is automatically generated from the titles and abstracts of the papers on this site.