Related papers: Clustering-Based Validation Splits for Domain Generalisation

Clustering-Based Validation Splits for Domain Generalisation

URL: http://arxiv.org/abs/2405.19461v1
Date: Wed, 29 May 2024 19:21:17 GMT
Title: Clustering-Based Validation Splits for Domain Generalisation
Authors: Andrea Napoli, Paul White,
Abstract summary: It is proposed that a high maximum mean discrepancy (MMD) between the training and validation sets increases the generalisability of selected models. A data splitting algorithm based on kernel k-means clustering is presented.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: This paper considers the problem of model selection under domain shift. In this setting, it is proposed that a high maximum mean discrepancy (MMD) between the training and validation sets increases the generalisability of selected models. A data splitting algorithm based on kernel k-means clustering, which maximises this objective, is presented. The algorithm leverages linear programming to control the size, label, and (optionally) group distributions of the splits, and comes with convergence guarantees. The technique consistently outperforms alternative splitting strategies across a range of datasets and training algorithms, for both domain generalisation (DG) and unsupervised domain adaptation (UDA) tasks. Analysis also shows the MMD between the training and validation sets to be strongly rank-correlated ($\rho=0.63$) with test domain accuracy, further substantiating the validity of this approach.

Related papers

Distributed Linear Regression with Compositional Covariates [5.085889377571319]
We focus on the distributed sparse penalized linear log-contrast model in massive compositional data. Two distributed optimization techniques are proposed for solving the two different constrained convex optimization problems. In the decentralized topology, we introduce a distributed coordinate-wise descent algorithm for obtaining a communication-efficient regularized estimation.
arXiv Detail & Related papers (2023-10-21T11:09:37Z)
Semi-supervised Domain Adaptive Structure Learning [72.01544419893628]
Semi-supervised domain adaptation (SSDA) is a challenging problem requiring methods to overcome both 1) overfitting towards poorly annotated data and 2) distribution shift across domains. We introduce an adaptive structure learning method to regularize the cooperation of SSL and DA.
arXiv Detail & Related papers (2021-12-12T06:11:16Z)
Instance Level Affinity-Based Transfer for Unsupervised Domain Adaptation [74.71931918541748]
We propose an instance affinity based criterion for source to target transfer during adaptation, called ILA-DA. We first propose a reliable and efficient method to extract similar and dissimilar samples across source and target, and utilize a multi-sample contrastive loss to drive the domain alignment process. We verify the effectiveness of ILA-DA by observing consistent improvements in accuracy over popular domain adaptation approaches on a variety of benchmark datasets.
arXiv Detail & Related papers (2021-04-03T01:33:14Z)
Mixup Regularized Adversarial Networks for Multi-Domain Text Classification [16.229317527580072]
Using the shared-private paradigm and adversarial training has significantly improved the performances of multi-domain text classification (MDTC) models. However, there are two issues for the existing methods. We propose a mixup regularized adversarial network (MRAN) to address these two issues.
arXiv Detail & Related papers (2021-01-31T15:24:05Z)
Cross-Domain Grouping and Alignment for Domain Adaptive Semantic Segmentation [74.3349233035632]
Existing techniques to adapt semantic segmentation networks across the source and target domains within deep convolutional neural networks (CNNs) do not consider an inter-class variation within the target domain itself or estimated category. We introduce a learnable clustering module, and a novel domain adaptation framework called cross-domain grouping and alignment. Our method consistently boosts the adaptation performance in semantic segmentation, outperforming the state-of-the-arts on various domain adaptation settings.
arXiv Detail & Related papers (2020-12-15T11:36:21Z)
Towards Uncovering the Intrinsic Data Structures for Unsupervised Domain Adaptation using Structurally Regularized Deep Clustering [119.88565565454378]
Unsupervised domain adaptation (UDA) is to learn classification models that make predictions for unlabeled data on a target domain. We propose a hybrid model of Structurally Regularized Deep Clustering, which integrates the regularized discriminative clustering of target data with a generative one. Our proposed H-SRDC outperforms all the existing methods under both the inductive and transductive settings.
arXiv Detail & Related papers (2020-12-08T08:52:00Z)
Learning Invariant Representations and Risks for Semi-supervised Domain Adaptation [109.73983088432364]
We propose the first method that aims to simultaneously learn invariant representations and risks under the setting of semi-supervised domain adaptation (Semi-DA) We introduce the LIRR algorithm for jointly textbfLearning textbfInvariant textbfRepresentations and textbfRisks.
arXiv Detail & Related papers (2020-10-09T15:42:35Z)
Contradictory Structure Learning for Semi-supervised Domain Adaptation [67.89665267469053]
Current adversarial adaptation methods attempt to align the cross-domain features. Two challenges remain unsolved: 1) the conditional distribution mismatch and 2) the bias of the decision boundary towards the source domain. We propose a novel framework for semi-supervised domain adaptation by unifying the learning of opposite structures.
arXiv Detail & Related papers (2020-02-06T22:58:20Z)

This list is automatically generated from the titles and abstracts of the papers in this site.