Abstract: Federated Learning (FL) coordinates with numerous heterogeneous devices to
collaboratively train a shared model while preserving user privacy. Despite its
multiple advantages, FL faces new challenges. One challenge arises when devices
drop out of the training process beyond the control of the central server. In
this case, the convergence of popular FL algorithms such as FedAvg is severely
influenced by the straggling devices. To tackle this challenge, we study
federated learning algorithms under arbitrary device unavailability and propose
an algorithm named Memory-augmented Impatient Federated Averaging (MIFA). Our
algorithm efficiently avoids excessive latency induced by inactive devices, and
corrects the gradient bias using the memorized latest updates from the devices.
We prove that MIFA achieves minimax optimal convergence rates on non-i.i.d.
data for both strongly convex and non-convex smooth functions. We also provide
an explicit characterization of the improvement over baseline algorithms
through a case study, and validate the results by numerical experiments on