Local Clustering with Mean Teacher for Semi-supervised Learning
- URL: http://arxiv.org/abs/2004.09665v2
- Date: Fri, 24 Jul 2020 00:47:01 GMT
- Title: Local Clustering with Mean Teacher for Semi-supervised Learning
- Authors: Zexi Chen, Benjamin Dutton, Bharathkumar Ramachandra, Tianfu Wu, Ranga Raju Vatsavai
- Abstract summary: Mean Teacher (MT) model of Tarvainen and Valpola has shown favorable performance on several semi-supervised benchmark datasets.
We propose a simple yet effective method called Local Clustering (LC) to mitigate the effect of confirmation bias.
- Score: 8.54739245813914
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Mean Teacher (MT) model of Tarvainen and Valpola has shown favorable
performance on several semi-supervised benchmark datasets. MT maintains a
teacher model's weights as the exponential moving average of a student model's
weights and minimizes the divergence between their probability predictions
under diverse perturbations of the inputs. However, MT is known to suffer from
confirmation bias, that is, reinforcing incorrect teacher model predictions. In
this work, we propose a simple yet effective method called Local Clustering
(LC) to mitigate the effect of confirmation bias. In MT, each data point is
considered independent of other points during training; however, data points
are likely to be close to each other in feature space if they share similar
features. Motivated by this, we cluster data points locally by minimizing the
pairwise distance between neighboring data points in feature space. Combined
with a standard classification cross-entropy objective on labeled data points,
this loss pulls misclassified unlabeled data points towards high-density regions
of their correct class with the help of their neighbors, thus improving model
performance. We demonstrate on semi-supervised benchmark datasets SVHN and
CIFAR-10 that adding our LC loss to MT yields significant improvements compared
to MT and performance comparable to the state of the art in semi-supervised
learning.
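
To make the recipe above concrete, here is a minimal PyTorch sketch of how the pieces could fit together: the EMA teacher update, a consistency term between student and teacher predictions (MSE over softmax outputs, as in the original MT paper), and a k-nearest-neighbor reading of the LC pairwise-distance loss combined with supervised cross-entropy. The (logits, features) model interface, the choice of k, and the loss weights w_cons and w_lc are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


@torch.no_grad()
def update_teacher(student: nn.Module, teacher: nn.Module, ema_decay: float = 0.99) -> None:
    # MT: teacher weights are an exponential moving average of student weights.
    for t_p, s_p in zip(teacher.parameters(), student.parameters()):
        t_p.mul_(ema_decay).add_(s_p, alpha=1.0 - ema_decay)


def consistency_loss(student_logits: torch.Tensor, teacher_logits: torch.Tensor) -> torch.Tensor:
    # Minimize divergence between student and teacher probability predictions.
    return F.mse_loss(F.softmax(student_logits, dim=1),
                      F.softmax(teacher_logits, dim=1).detach())


def local_clustering_loss(features: torch.Tensor, k: int = 5) -> torch.Tensor:
    # One reading of the LC objective: pull each point toward its k nearest
    # neighbors by minimizing pairwise distances in feature space.
    dists = torch.cdist(features, features)                              # (B, B) L2 distances
    dists = dists + 1e9 * torch.eye(dists.size(0), device=dists.device)  # mask self-pairs
    knn_dists, _ = dists.topk(k, dim=1, largest=False)                   # k smallest per row
    return knn_dists.mean()


def training_step(student, teacher, optimizer,
                  x_labeled, y_labeled, x_unlabeled,
                  w_cons: float = 1.0, w_lc: float = 0.1) -> float:
    # Assumes (hypothetically) a model whose forward pass returns (logits, features).
    logits_l, feats_l = student(x_labeled)
    loss = F.cross_entropy(logits_l, y_labeled)              # supervised term

    logits_u, feats_u = student(x_unlabeled)
    with torch.no_grad():
        teacher_logits_u, _ = teacher(x_unlabeled)
    loss = loss + w_cons * consistency_loss(logits_u, teacher_logits_u)

    feats = torch.cat([feats_l, feats_u], dim=0)             # cluster all batch features
    loss = loss + w_lc * local_clustering_loss(feats)        # LC term

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    update_teacher(student, teacher)
    return loss.item()
```

In the full Mean Teacher setup, the student and teacher would receive differently perturbed views of each unlabeled batch, and the consistency weight is typically ramped up over training; both details are omitted here for brevity.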
Related papers
- Merlin: Multi-View Representation Learning for Robust Multivariate Time Series Forecasting with Unfixed Missing Rates [27.758665158820598]
We propose Multi-View Representation Learning (Merlin), which helps existing models achieve semantic alignment between incomplete observations with different missing rates.
Merlin effectively enhances the robustness of existing models against unfixed missing rates while preserving forecasting accuracy.
arXiv Detail & Related papers (2025-06-14T11:55:18Z)
- Optimal Transport-based Domain Alignment as a Preprocessing Step for Federated Learning [0.48342038441006796]
Federated learning (FL) is a subfield of machine learning that avoids sharing local data with a central server.
In FL, fusing locally-trained models with unbalanced datasets may deteriorate the performance of global model aggregation.
We introduce an Optimal Transport-based preprocessing algorithm that aligns the datasets by minimizing the distributional discrepancy of data across the edge devices.
arXiv Detail & Related papers (2025-06-04T15:35:55Z)
- Unbiased Max-Min Embedding Classification for Transductive Few-Shot Learning: Clustering and Classification Are All You Need [83.10178754323955]
Few-shot learning enables models to generalize from only a few labeled examples.
We propose the Unbiased Max-Min Embedding Classification (UMMEC) Method, which addresses the key challenges in few-shot learning.
Our method significantly improves classification performance with minimal labeled data, advancing the state of the art in transductive few-shot learning.
arXiv Detail & Related papers (2025-03-28T07:23:07Z)
- Interaction-Aware Gaussian Weighting for Clustered Federated Learning [58.92159838586751]
Federated Learning (FL) emerged as a decentralized paradigm to train models while preserving privacy.
We propose a novel clustered FL method, FedGWC (Federated Gaussian Weighting Clustering), which groups clients based on their data distribution.
Our experiments on benchmark datasets show that FedGWC outperforms existing FL algorithms in cluster quality and classification accuracy.
arXiv Detail & Related papers (2025-02-05T16:33:36Z)
- FedDistill: Global Model Distillation for Local Model De-Biasing in Non-IID Federated Learning [10.641875933652647]
Federated Learning (FL) is an approach that enables collaborative machine learning across clients without centralizing their data.
FL faces challenges due to non-uniformly distributed (non-iid) data across clients.
This paper introduces FedDistill, a framework enhancing the knowledge transfer from the global model to local models.
arXiv Detail & Related papers (2024-04-14T10:23:30Z)
- Class-Imbalanced Semi-Supervised Learning for Large-Scale Point Cloud Semantic Segmentation via Decoupling Optimization [64.36097398869774]
Semi-supervised learning (SSL) has been an active research topic for large-scale 3D scene understanding.
The existing SSL-based methods suffer from severe training bias due to class imbalance and long-tail distributions of the point cloud data.
We introduce a new decoupling optimization framework, which disentangles the learning of feature representations and the classifier in an alternating optimization manner to shift the biased decision boundary effectively.
arXiv Detail & Related papers (2024-01-13T04:16:40Z)
- Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious to collect in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z)
- Federated Skewed Label Learning with Logits Fusion [23.062650578266837]
Federated learning (FL) aims to collaboratively train a shared model across multiple clients without transmitting their local data.
We propose FedBalance, which corrects the optimization bias among local models by calibrating their logits.
Our method can gain 13% higher average accuracy compared with state-of-the-art methods.
arXiv Detail & Related papers (2023-11-14T14:37:33Z)
- Integrating Local Real Data with Global Gradient Prototypes for Classifier Re-Balancing in Federated Long-Tailed Learning [60.41501515192088]
Federated Learning (FL) has become a popular distributed learning paradigm that involves multiple clients training a global model collaboratively.
The data samples usually follow a long-tailed distribution in the real world, and FL on the decentralized and long-tailed data yields a poorly-behaved global model.
In this work, we integrate the local real data with the global gradient prototypes to form the local balanced datasets.
arXiv Detail & Related papers (2023-01-25T03:18:10Z)
- Federated Learning with Label Distribution Skew via Logits Calibration [26.98248192651355]
In this paper, we investigate the label distribution skew in FL, where the distribution of labels varies across clients.
We propose FedLC, which calibrates the logits before softmax cross-entropy according to the probability of occurrence of each class.
Experiments on federated datasets and real-world datasets demonstrate that FedLC leads to a more accurate global model.
arXiv Detail & Related papers (2022-09-01T02:56:39Z)
- A semi-supervised Teacher-Student framework for surgical tool detection and localization [2.41710192205034]
We introduce a semi-supervised learning (SSL) framework for the surgical tool detection paradigm.
In the proposed work, we train a model on labeled data, which initialises Teacher-Student joint learning.
Our results on m2cai16-tool-locations dataset indicate the superiority of our approach on different supervised data settings.
arXiv Detail & Related papers (2022-08-21T17:21:31Z)
- Contrastive Neighborhood Alignment [81.65103777329874]
We present Contrastive Neighborhood Alignment (CNA), a manifold learning approach to maintain the topology of learned features.
The target model aims to mimic the local structure of the source representation space using a contrastive loss.
CNA is illustrated in three scenarios: manifold learning, where the model maintains the local topology of the original data in a dimension-reduced space; model distillation, where a small student model is trained to mimic a larger teacher; and legacy model update, where an older model is replaced by a more powerful one.
arXiv Detail & Related papers (2022-01-06T04:58:31Z)
- Unsupervised Feature Learning by Cross-Level Instance-Group Discrimination [68.83098015578874]
We integrate between-instance similarity into contrastive learning, not directly by instance grouping, but by cross-level discrimination.
CLD effectively brings unsupervised learning closer to natural data and real-world applications.
CLD sets a new state of the art on self-supervision, semi-supervision, and transfer learning benchmarks, beating MoCo v2 and SimCLR on every reported metric.
arXiv Detail & Related papers (2020-08-09T21:13:13Z)
- Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy, under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)
- Deep Active Learning for Biased Datasets via Fisher Kernel Self-Supervision [5.352699766206807]
Active learning (AL) aims to minimize labeling efforts for data-demanding deep neural networks (DNNs).
We propose a low-complexity method for feature density matching using a self-supervised Fisher kernel (FK).
Our method outperforms state-of-the-art methods on MNIST, SVHN, and ImageNet classification while requiring only 1/10th of the processing.
arXiv Detail & Related papers (2020-03-01T03:56:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.