Improving the Robustness of Federated Learning for Severely Imbalanced
Datasets
- URL: http://arxiv.org/abs/2204.13414v1
- Date: Thu, 28 Apr 2022 11:23:42 GMT
- Title: Improving the Robustness of Federated Learning for Severely Imbalanced
Datasets
- Authors: Debasrita Chakraborty and Ashish Ghosh
- Abstract summary: Two common approaches to achieve this distributed learning is synchronous and asynchronous weight update.
It has been seen that with an increasing number of worker nodes, the performance degrades drastically.
This effect has been studied in the context of extreme imbalanced classification.
- Score: 11.498089180181365
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the ever-increasing data deluge and the success of deep neural networks,
research on distributed deep learning has become prominent. Two common
approaches to achieving such distributed learning are synchronous and asynchronous
weight updates. In this manuscript, we have explored very simple synchronous
weight update mechanisms. It has been seen that, with an increasing number of
worker nodes, performance degrades drastically. This effect has been
studied in the context of extremely imbalanced classification (e.g. outlier
detection). In practical cases, the assumed i.i.d. conditions may not hold.
Global class imbalance may also arise, as in outlier detection, where the
local servers receive severely imbalanced data and may not get any samples
from the minority class. In that case, the DNNs in the local servers become
completely biased towards the majority class they receive. This severely
impacts the learning at the parameter server (which practically does not see
any data). It has been observed that, in a parallel setting, if one uses the
existing federated weight update mechanisms at the parameter server,
performance degrades drastically as the number of worker nodes increases. This
is mainly because, with more nodes, there is a high chance that a worker node
receives only a very small portion of the data, either too little to train the
model without overfitting or with a highly imbalanced class distribution. The
chapter therefore proposes a workaround to this problem by introducing the
concept of adaptive cost-sensitive momentum averaging. For the proposed
system, there was little to no degradation in performance, while most of the
other methods reached their worst performance well before that.
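The abstract does not spell out the aggregation rule, so the following is a minimal sketch of what adaptive cost-sensitive momentum averaging at the parameter server could look like. The per-worker weighting based on local class balance, the momentum form, and all function names are illustrative assumptions, not the authors' published algorithm.

```python
import numpy as np

# Hypothetical sketch of adaptive cost-sensitive momentum averaging at the
# parameter server. The per-worker weighting below is an illustrative
# assumption, not the authors' published update rule.

def cost_sensitive_weights(class_counts, eps=1e-8):
    """Weight each worker by how balanced (and how large) its local data is.

    class_counts: one 1-D array of per-class sample counts per worker.
    Returns per-worker weights that sum to 1.
    """
    scores = []
    for counts in class_counts:
        total = counts.sum()
        # balance is 1 for perfectly balanced local data and close to 0 when
        # a worker saw almost no minority-class samples.
        balance = counts.min() / (counts.max() + eps)
        scores.append(balance * total)
    scores = np.asarray(scores, dtype=np.float64)
    return scores / (scores.sum() + eps)

def server_update(global_w, worker_ws, class_counts, velocity, beta=0.9, lr=1.0):
    """One synchronous round of cost-sensitive, momentum-averaged aggregation."""
    weights = cost_sensitive_weights(class_counts)
    # Cost-sensitive weighted average of the worker models
    # (each element of worker_ws is a flat parameter vector).
    avg_w = sum(a * w for a, w in zip(weights, worker_ws))
    delta = avg_w - global_w                  # aggregated update direction
    velocity = beta * velocity + (1.0 - beta) * delta
    return global_w + lr * velocity, velocity
```

Under this sketch, a worker that holds only majority-class samples receives a near-zero aggregation weight, which is one plausible way to keep the global model from collapsing onto the majority class as the number of workers grows.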
Related papers
- Open-World Semi-Supervised Learning for Node Classification [53.07866559269709]
Open-world semi-supervised learning (Open-world SSL) for node classification is a practical but under-explored problem in the graph community.
We propose an IMbalance-Aware method named OpenIMA for Open-world semi-supervised node classification.
arXiv Detail & Related papers (2024-03-18T05:12:54Z) - Enhancing Consistency and Mitigating Bias: A Data Replay Approach for Incremental Learning [93.90047628101155]
Deep learning systems are prone to catastrophic forgetting when learning from a sequence of tasks.
To address this, some methods propose replaying data from previous tasks during new task learning.
However, this is often infeasible in practice due to memory constraints and data privacy issues.
arXiv Detail & Related papers (2024-01-12T12:51:12Z) - Inducing Neural Collapse in Deep Long-tailed Learning [13.242721780822848]
We propose two explicit feature regularization terms to learn high-quality representation for class-imbalanced data.
With the proposed regularization, Neural Collapse phenomena will appear under the class-imbalanced distribution.
Our method is easily implemented, highly effective, and can be plugged into most existing methods.
arXiv Detail & Related papers (2023-02-24T05:07:05Z) - ReGrAt: Regularization in Graphs using Attention to handle class
imbalance [14.322295231579073]
In this work, we study how attention networks can help tackle imbalance in node classification.
We also observe that using a regularizer to assign larger weights to minority nodes helps to mitigate this imbalance.
We achieve state-of-the-art results, outperforming existing methods on several standard citation benchmark datasets.
arXiv Detail & Related papers (2022-11-27T09:04:29Z) - An Embarrassingly Simple Baseline for Imbalanced Semi-Supervised
Learning [103.65758569417702]
Semi-supervised learning (SSL) has shown great promise in leveraging unlabeled data to improve model performance.
We consider a more realistic and challenging setting called imbalanced SSL, where imbalanced class distributions occur in both labeled and unlabeled data.
We study a simple yet overlooked baseline -- SimiS -- which tackles data imbalance by simply supplementing labeled data with pseudo-labels.
arXiv Detail & Related papers (2022-11-20T21:18:41Z) - Synthetic Over-sampling for Imbalanced Node Classification with Graph
Neural Networks [34.81248024048974]
Graph neural networks (GNNs) have achieved state-of-the-art performance for node classification.
In many real-world scenarios, node classes are imbalanced, with a few majority classes making up most of the graph.
In this work, we seek to address this problem by generating pseudo instances of minority classes to balance the training data.
arXiv Detail & Related papers (2022-06-10T19:47:05Z) - Distributionally Robust Semi-Supervised Learning Over Graphs [68.29280230284712]
Semi-supervised learning (SSL) over graph-structured data emerges in many network science applications.
To efficiently manage learning over graphs, variants of graph neural networks (GNNs) have been developed recently.
Despite their success in practice, most of existing methods are unable to handle graphs with uncertain nodal attributes.
Challenges also arise due to distributional uncertainties associated with data acquired by noisy measurements.
A distributionally robust learning framework is developed, where the objective is to train models that exhibit quantifiable robustness against perturbations.
arXiv Detail & Related papers (2021-10-20T14:23:54Z) - Class Balancing GAN with a Classifier in the Loop [58.29090045399214]
We introduce a novel theoretically motivated Class Balancing regularizer for training GANs.
Our regularizer makes use of the knowledge from a pre-trained classifier to ensure balanced learning of all the classes in the dataset.
We demonstrate the utility of our regularizer in learning representations for long-tailed distributions via achieving better performance than existing approaches over multiple datasets.
arXiv Detail & Related papers (2021-06-17T11:41:30Z) - Fed-Focal Loss for imbalanced data classification in Federated Learning [2.2172881631608456]
Federated Learning has a central server coordinating the training of a model on a network of devices.
One of the challenges is variable training performance when the dataset has a class imbalance.
We propose to address the class imbalance by reshaping the cross-entropy loss so that it down-weights the loss assigned to well-classified examples, along the lines of focal loss (a sketch of this reshaping is given after this list).
arXiv Detail & Related papers (2020-11-12T09:52:14Z) - Superiority of Simplicity: A Lightweight Model for Network Device
Workload Prediction [58.98112070128482]
We propose a lightweight solution for series prediction based on historic observations.
It consists of a heterogeneous ensemble method composed of two models - a neural network and a mean predictor.
It achieves an overall $R^2$ score of 0.10 on the available FedCSIS 2020 challenge dataset.
arXiv Detail & Related papers (2020-07-07T15:44:16Z) - Imbalanced Data Learning by Minority Class Augmentation using Capsule
Adversarial Networks [31.073558420480964]
We propose a method to restore the balance in imbalanced image data by coalescing two concurrent methods.
In our model, generative and discriminative networks play a novel competitive game.
The coalescing of capsule-GAN is effective at recognizing highly overlapping classes with much fewer parameters compared with the convolutional-GAN.
arXiv Detail & Related papers (2020-04-05T12:36:06Z) - Identifying and Compensating for Feature Deviation in Imbalanced Deep
Learning [59.65752299209042]
We investigate learning a ConvNet under such a scenario.
We found that a ConvNet significantly over-fits the minor classes.
We propose to incorporate class-dependent temperatures (CDT) when training the ConvNet.
arXiv Detail & Related papers (2020-01-06T03:52:11Z)
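For the Fed-Focal Loss entry above, which reshapes cross-entropy along the lines of focal loss, the following is a minimal sketch of that reshaping for the binary case. The gamma and alpha values are common defaults from the focal-loss literature, not necessarily the values used in that paper, and the federated wiring is left out.

```python
import numpy as np

# Minimal sketch of a focal-style loss for binary classification, as described
# in the Fed-Focal Loss entry above. gamma and alpha are common defaults from
# the focal-loss literature, not necessarily the values used in that paper.

def focal_loss(probs, labels, gamma=2.0, alpha=0.25, eps=1e-8):
    """Cross-entropy reshaped to down-weight well-classified examples.

    probs:  predicted probability of the positive class, shape (N,)
    labels: ground-truth labels in {0, 1}, shape (N,)
    """
    probs = np.clip(probs, eps, 1.0 - eps)
    # p_t is the probability the model assigns to the true class.
    p_t = np.where(labels == 1, probs, 1.0 - probs)
    alpha_t = np.where(labels == 1, alpha, 1.0 - alpha)
    # (1 - p_t)**gamma shrinks the loss on examples the model already handles
    # well, so hard (typically minority-class) examples dominate the gradient.
    return float(np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)))
```

In a federated setting, each device would minimise such a loss locally before its weights are sent to the central server for aggregation.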
This list is automatically generated from the titles and abstracts of the papers on this site.