Nearest Neighborhood-Based Deep Clustering for Source Data-absent
Unsupervised Domain Adaptation
- URL: http://arxiv.org/abs/2107.12585v1
- Date: Tue, 27 Jul 2021 04:13:59 GMT
- Title: Nearest Neighborhood-Based Deep Clustering for Source Data-absent
Unsupervised Domain Adaptation
- Authors: Song Tang, Yan Yang, Zhiyuan Ma, Norman Hendrich, Fanyu Zeng, Shuzhi
Sam Ge, Changshui Zhang, Jianwei Zhang
- Abstract summary: In the classic setting of unsupervised domain adaptation (UDA), the labeled source data are available in the training phase.
In many real-world scenarios, the source data is inaccessible, and only a model trained on the source domain is available.
This paper proposes a novel deep clustering method for this challenging task.
- Score: 33.394228127643494
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the classic setting of unsupervised domain adaptation (UDA), the labeled
source data are available in the training phase. However, in many real-world
scenarios, for reasons such as privacy protection and information
security, the source data are inaccessible, and only a model trained on the
source domain is available. This paper proposes a novel deep clustering method
for this challenging task. To support dynamic clustering at the feature level,
we introduce extra constraints hidden in the geometric structure of the data
to assist the process. Concretely, we propose a geometry-based constraint,
named semantic consistency on the nearest neighborhood (SCNNH), and use it to
encourage robust clustering. To reach this goal, we construct the nearest
neighborhood for every target sample and take it as the fundamental clustering
unit by building our objective on the geometry. Also, we develop a more
SCNNH-compliant structure with an additional semantic credibility constraint,
named semantic hyper-nearest neighborhood (SHNNH). After that, we extend our
method to this new geometry. Extensive experiments on three challenging UDA
datasets indicate that our method achieves state-of-the-art results. The
proposed method improves significantly on all datasets (when we adopt SHNNH,
the average accuracy increases by over 3.0% on the large-scale dataset). Code
is available at https://github.com/tntek/N2DCX.
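The abstract's idea of treating each target sample's nearest neighborhood as the clustering unit can be illustrated with a small sketch. This is not the authors' N2DCX implementation: the function names, the cosine-similarity choice, the toy features, and the pseudo-labels are all assumptions made for illustration. It only shows how one might build neighborhoods over target features and measure how consistently a neighborhood agrees with a sample's predicted label.

```python
import numpy as np


def nearest_neighbors(features: np.ndarray, k: int = 1) -> np.ndarray:
    """Indices of the k most cosine-similar rows for each row, excluding itself."""
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)  # a sample is not its own neighbor
    return np.argsort(-sim, axis=1)[:, :k]


def neighborhood_agreement(labels: np.ndarray, nbr_idx: np.ndarray) -> np.ndarray:
    """Fraction of each sample's neighbors that share its predicted label."""
    return (labels[nbr_idx] == labels[:, None]).mean(axis=1)


# Toy target features: two well-separated groups in 2-D (made-up data).
features = np.array([[1.0, 0.1], [0.9, 0.2], [1.0, 0.0],
                     [0.1, 1.0], [0.2, 0.9], [0.0, 1.0]])
# Hypothetical pseudo-labels from a source-trained model; sample 2 is mislabeled.
pseudo_labels = np.array([0, 0, 1, 1, 1, 1])

nbr_idx = nearest_neighbors(features, k=2)
agreement = neighborhood_agreement(pseudo_labels, nbr_idx)
print(agreement)
```

In this toy case the mislabeled sample is the one whose neighbors all disagree with it, which is the kind of signal a neighborhood-level consistency constraint could exploit. How the paper actually turns this into a training objective is not specified in the abstract.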
Related papers
- Unimodal Strategies in Density-Based Clustering [15.581610184349731]
We reveal a key property intrinsic to density-based clustering methods regarding the relation between the number of clusters and the neighborhood radius of core points.
We leverage this property to devise new strategies for finding appropriate values for the radius more efficiently based on the Ternary Search algorithm.
We validate our methodology through extensive applications across a range of high-dimensional, large-scale NLP, Audio, and Computer Vision tasks.
arXiv Detail & Related papers (2025-06-26T18:25:14Z)
- Topology-Aware Modeling for Unsupervised Simulation-to-Reality Point Cloud Recognition [63.55828203989405]
We introduce a novel Topology-Aware Modeling (TAM) framework for Sim2Real UDA on object point clouds.
Our approach mitigates the domain gap by leveraging global spatial topology, characterized by low-level, high-frequency 3D structures.
We propose an advanced self-training strategy that combines cross-domain contrastive learning with self-training.
arXiv Detail & Related papers (2025-05-07T11:37:23Z)
- Adaptive and Robust DBSCAN with Multi-agent Reinforcement Learning [53.527506374566485]
We propose a novel Adaptive and Robust DBSCAN with Multi-agent Reinforcement Learning cluster framework, namely AR-DBSCAN.
We show that AR-DBSCAN not only improves clustering accuracy by up to 144.1% and 175.3% in the NMI and ARI metrics, respectively, but also is capable of robustly finding dominant parameters.
arXiv Detail & Related papers (2025-03-10T23:41:44Z)
- Hyperoctant Search Clustering: A Method for Clustering Data in High-Dimensional Hyperspheres [0.0]
We propose a new clustering method based on a topological approach applied to regions of space defined by signs of coordinates (hyperoctants).
According to a density criterion, the method builds clusters of data points based on the partitioning of a graph.
We choose the application of topic detection, which is an important task in text mining.
arXiv Detail & Related papers (2025-03-10T23:41:44Z)
- Trust your Good Friends: Source-free Domain Adaptation by Reciprocal
Neighborhood Clustering [50.46892302138662]
We address the source-free domain adaptation problem, where the source pretrained model is adapted to the target domain in the absence of source data.
Our method is based on the observation that target data, which might not align with the source domain classifier, still forms clear clusters.
We demonstrate that this local structure can be efficiently captured by considering the local neighbors, the reciprocal neighbors, and the expanded neighborhood.
arXiv Detail & Related papers (2023-09-01T15:31:18Z)
- Hard Regularization to Prevent Deep Online Clustering Collapse without
Data Augmentation [65.268245109828]
Online deep clustering refers to the joint use of a feature extraction network and a clustering model to assign cluster labels to each new data point or batch as it is processed.
While faster and more versatile than offline methods, online clustering can easily reach the collapsed solution where the encoder maps all inputs to the same point and all are put into a single cluster.
We propose a method that does not require data augmentation, and that, differently from existing methods, regularizes the hard assignments.
arXiv Detail & Related papers (2023-03-29T08:23:26Z)
- Nearest Neighbor-Based Contrastive Learning for Hyperspectral and LiDAR
Data Classification [45.026868970899514]
We propose a Nearest Neighbor-based Contrastive Learning Network (NNCNet) to learn discriminative feature representations.
Specifically, we propose a nearest neighbor-based data augmentation scheme to use enhanced semantic relationships among nearby regions.
In addition, we design a bilinear attention module to exploit the second-order and even high-order feature interactions between the HSI and LiDAR data.
arXiv Detail & Related papers (2023-01-09T13:43:54Z)
- Bi-level Alignment for Cross-Domain Crowd Counting [113.78303285148041]
Current methods rely on external data for training an auxiliary task or apply an expensive coarse-to-fine estimation.
We develop a new adversarial learning based method, which is simple and efficient to apply.
We evaluate our approach on five real-world crowd counting benchmarks, where we outperform existing approaches by a large margin.
arXiv Detail & Related papers (2022-05-12T02:23:25Z)
- Exploiting the Intrinsic Neighborhood Structure for Source-free Domain
Adaptation [47.907168218249694]
We address the source-free domain adaptation problem, where the source pretrained model is adapted to the target domain in the absence of source data.
We capture this intrinsic structure by defining local affinity of the target data, and encourage label consistency among data with high local affinity.
We demonstrate that this local structure can be efficiently captured by considering the local neighbors, the reciprocal neighbors, and the expanded neighborhood.
arXiv Detail & Related papers (2021-10-08T15:40:18Z)
- Index $t$-SNE: Tracking Dynamics of High-Dimensional Datasets with
Coherent Embeddings [1.7188280334580195]
This paper presents a methodology to reuse an embedding to create a new one, where cluster positions are preserved.
The proposed algorithm has the same complexity as the original $t$-SNE to embed new items, and a lower one when considering the embedding of a dataset sliced into sub-pieces.
arXiv Detail & Related papers (2021-09-22T06:45:37Z)
- Towards Uncovering the Intrinsic Data Structures for Unsupervised Domain
Adaptation using Structurally Regularized Deep Clustering [119.88565565454378]
Unsupervised domain adaptation (UDA) is to learn classification models that make predictions for unlabeled data on a target domain.
We propose a hybrid model of Structurally Regularized Deep Clustering, which integrates the regularized discriminative clustering of target data with a generative one.
Our proposed H-SRDC outperforms all the existing methods under both the inductive and transductive settings.
arXiv Detail & Related papers (2020-12-08T08:52:00Z)
- Overcomplete Deep Subspace Clustering Networks [80.16644725886968]
Experimental results on four benchmark datasets show the effectiveness of the proposed method over DSC and other clustering methods in terms of clustering error.
Our method is also not as dependent as DSC is on where pre-training should be stopped to get the best performance and is also more robust to noise.
arXiv Detail & Related papers (2020-11-16T22:07:18Z)
- CycleCluster: Modernising Clustering Regularisation for Deep
Semi-Supervised Classification [0.0]
We propose a novel framework, CycleCluster, for deep semi-supervised classification.
Our core optimisation is driven by a new clustering based regularisation along with a graph based pseudo-labels and a shared deep network.
arXiv Detail & Related papers (2020-01-15T13:34:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.