RS-DGC: Exploring Neighborhood Statistics for Dynamic Gradient
Compression on Remote Sensing Image Interpretation
- URL: http://arxiv.org/abs/2312.17530v1
- Date: Fri, 29 Dec 2023 09:24:26 GMT
- Title: RS-DGC: Exploring Neighborhood Statistics for Dynamic Gradient
Compression on Remote Sensing Image Interpretation
- Authors: Weiying Xie, Zixuan Wang, Jitao Ma, Daixun Li, Yunsong Li
- Abstract summary: Gradient sparsification has been validated as an effective gradient compression (GC) technique for reducing communication costs.
We propose RS-DGC, a simple yet effective dynamic gradient compression scheme for RS image interpretation that leverages a neighborhood statistics indicator.
We achieve an accuracy improvement of 0.51% with more than 50-fold communication compression on the NWPU-RESISC45 dataset.
- Score: 23.649838489244917
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Distributed deep learning has recently attracted increasing
attention in remote sensing (RS) applications due to the challenges posed by
the growing volume of open data produced daily by Earth observation programs.
However, the high communication costs of sending model updates among multiple
nodes are a significant bottleneck for scalable distributed learning. Gradient
sparsification has been validated as an effective gradient compression (GC)
technique for reducing communication costs and thus accelerating the training
speed. Existing state-of-the-art gradient sparsification methods are mostly
based on the "larger-absolute-more-important" criterion, ignoring the
importance of small gradients, which is generally observed to degrade
performance. Inspired by the informative representation of manifold structures
obtained from neighborhood information, we propose a simple yet effective
dynamic gradient compression scheme for RS image interpretation, termed
RS-DGC, which leverages a neighborhood statistics indicator. We first enhance
the interdependence between
gradients by introducing the gradient neighborhood to reduce the effect of
random noise. The key component of RS-DGC is a Neighborhood Statistical
Indicator (NSI), which can quantify the importance of gradients within a
specified neighborhood on each node to sparsify the local gradients before
gradient transmission in each iteration. Further, a layer-wise dynamic
compression scheme is proposed to track the importance changes of each layer in
real time. Extensive experiments on downstream tasks validate the superiority
of our method for intelligent interpretation of RS images. For example, we
achieve an accuracy improvement of 0.51% with more than 50-fold communication
compression on the NWPU-RESISC45 dataset using the VGG-19 network.
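To make the idea concrete, below is a minimal sketch of neighborhood-statistics sparsification, assuming fixed-size neighborhoods over the flattened gradient and mean absolute value as the statistic; the paper's actual NSI, neighborhood construction, and layer-wise dynamic ratios may differ.

```python
import torch

def nsi_sparsify(grad: torch.Tensor, neighborhood: int = 16, keep_ratio: float = 0.02):
    """Hypothetical sketch of neighborhood-statistics sparsification.

    Groups the flattened gradient into fixed-size neighborhoods, scores each
    neighborhood by a simple statistic (mean absolute value here; the paper's
    NSI may differ), and keeps only the highest-scoring neighborhoods.
    """
    flat = grad.flatten()
    pad = (-flat.numel()) % neighborhood           # pad so it splits evenly
    flat = torch.nn.functional.pad(flat, (0, pad))
    groups = flat.view(-1, neighborhood)
    scores = groups.abs().mean(dim=1)              # one score per neighborhood
    k = max(1, int(scores.numel() * keep_ratio))
    topk = torch.topk(scores, k).indices
    mask = torch.zeros_like(scores, dtype=torch.bool)
    mask[topk] = True
    sparse = torch.where(mask.unsqueeze(1), groups, torch.zeros_like(groups))
    return sparse.flatten()[: grad.numel()].view_as(grad)
```

In a distributed run, each node would apply this to its local gradient before transmission and send only the surviving values.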
Related papers
- DGNet: Dynamic Gradient-Guided Network for Water-Related Optics Image Enhancement [77.0360085530701]
Underwater image enhancement (UIE) is a challenging task due to the complex degradation caused by underwater environments.
Previous methods often idealize the degradation process and neglect the impact of medium noise and object motion on the distribution of image features.
Our approach utilizes predicted images to dynamically update pseudo-labels, adding a dynamic gradient to optimize the network's gradient space.
arXiv Detail & Related papers (2023-12-12T06:07:21Z)
- Low-Dimensional Gradient Helps Out-of-Distribution Detection [26.237034426573523]
We conduct a comprehensive investigation into leveraging the entirety of gradient information for OOD detection.
The primary challenge arises from the high dimensionality of gradients due to the large number of network parameters.
We propose performing linear dimension reduction on the gradient using a designated subspace.
This innovative technique enables us to obtain a low-dimensional representation of the gradient with minimal information loss.
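The dimension-reduction step can be sketched as a projection onto a fixed orthonormal basis; the random QR basis below is an illustrative stand-in for the paper's designated subspace, not its actual construction.

```python
import torch

def project_gradient(grad: torch.Tensor, basis: torch.Tensor) -> torch.Tensor:
    """Project a flattened gradient onto a low-dimensional subspace.

    `basis` is a (d, k) matrix with orthonormal columns; how the subspace is
    chosen is the paper's contribution and is not reproduced here.
    """
    return basis.T @ grad.flatten()

# Example: reduce a 100k-dimensional gradient to 64 dimensions.
d, k = 100_000, 64
basis, _ = torch.linalg.qr(torch.randn(d, k))  # orthonormal columns
g = torch.randn(d)
g_low = project_gradient(g, basis)             # shape: (64,)
```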
arXiv Detail & Related papers (2023-10-26T05:28:32Z)
- PREM: A Simple Yet Effective Approach for Node-Level Graph Anomaly Detection [65.24854366973794]
Node-level graph anomaly detection (GAD) plays a critical role in identifying anomalous nodes from graph-structured data in domains such as medicine, social networks, and e-commerce.
We introduce a simple method termed PREprocessing and Matching (PREM for short) to improve the efficiency of GAD.
Our approach streamlines GAD, reducing time and memory consumption while maintaining powerful anomaly detection capabilities.
arXiv Detail & Related papers (2023-10-18T02:59:57Z)
- GIFD: A Generative Gradient Inversion Method with Feature Domain Optimization [52.55628139825667]
Federated Learning (FL) has emerged as a promising distributed machine learning framework to preserve clients' privacy.
Recent studies find that an attacker can invert the shared gradients and recover sensitive data from an FL system by leveraging pre-trained generative adversarial networks (GANs) as prior knowledge.
We propose Gradient Inversion over Feature Domains (GIFD), which disassembles the GAN model and searches the feature domains of the intermediate layers.
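For context, a toy gradient-matching inversion loop is sketched below; GIFD goes further by disassembling the GAN and searching its intermediate feature domains, which this sketch omits, and the model, generator, and known label here are hypothetical inputs.

```python
import torch

def gradient_matching_attack(target_grads, model, generator, z_dim=128, steps=500):
    """Toy gradient-inversion loop: optimize a GAN latent code so that the
    gradients produced by the generated image match the observed ones.
    Illustrative only; not GIFD's feature-domain search."""
    z = torch.randn(1, z_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=0.01)
    label = torch.zeros(1, dtype=torch.long)       # assumed known label
    for _ in range(steps):
        x = generator(z)
        loss = torch.nn.functional.cross_entropy(model(x), label)
        grads = torch.autograd.grad(loss, model.parameters(), create_graph=True)
        match = sum(((g - t) ** 2).sum() for g, t in zip(grads, target_grads))
        opt.zero_grad()
        match.backward()
        opt.step()
    return generator(z).detach()
```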
arXiv Detail & Related papers (2023-08-09T04:34:21Z)
- Scaling Private Deep Learning with Low-Rank and Sparse Gradients [5.14780936727027]
We propose a framework that exploits the low-rank and sparse structure of neural networks to reduce the dimension of gradient updates.
A novel strategy is utilized to sparsify the gradients, resulting in low-dimensional, less noisy updates.
Empirical evaluation on natural language processing and computer vision tasks shows that our method outperforms other state-of-the-art baselines.
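A rough sketch of the low-rank-plus-sparse decomposition of a gradient matrix, using an SVD for the low-rank part and top-k selection on the residual; this illustrates the structural idea only, not the paper's privacy mechanism.

```python
import torch

def lowrank_plus_sparse(G: torch.Tensor, rank: int = 4, keep: int = 100):
    """Split a gradient matrix into a rank-`rank` SVD approximation plus the
    `keep` largest entries of the residual (illustrative sketch)."""
    U, S, Vh = torch.linalg.svd(G, full_matrices=False)
    low_rank = U[:, :rank] @ torch.diag(S[:rank]) @ Vh[:rank]
    residual = (G - low_rank).flatten()
    keep = min(keep, residual.numel())
    idx = residual.abs().topk(keep).indices
    sparse = torch.zeros_like(residual)
    sparse[idx] = residual[idx]
    return low_rank, sparse.view_as(G)
```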
arXiv Detail & Related papers (2022-07-06T14:09:47Z)
- Communication-Efficient Distributed SGD with Compressed Sensing [24.33697801661053]
We consider large-scale distributed optimization over a set of edge devices connected to a central server.
Inspired by recent advances in federated learning, we propose a distributed stochastic gradient descent (SGD)-type algorithm that exploits the sparsity of the gradient, when possible, to reduce the communication burden.
We conduct theoretical analysis on the convergence of our algorithm in the presence of noise perturbation incurred by the communication channels, and also conduct numerical experiments to corroborate its effectiveness.
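A self-contained toy of the compressed sensing round trip: the worker sends m << d random measurements of a sparse gradient and the server reconstructs it, here with iterative hard thresholding as a generic decoder; the paper's algorithm and convergence analysis differ.

```python
import torch

def iht_recover(y, A, sparsity, iters=50, step=0.5):
    """Recover a sparse gradient from measurements y = A @ g via iterative
    hard thresholding, one classic compressed sensing decoder."""
    g = torch.zeros(A.shape[1])
    for _ in range(iters):
        g = g + step * A.T @ (y - A @ g)   # gradient step on ||y - A g||^2
        thresh = g.abs().topk(sparsity).values[-1]
        g = torch.where(g.abs() >= thresh, g, torch.zeros_like(g))
    return g

d, m, s = 1000, 200, 10                    # dimension, measurements, sparsity
A = torch.randn(m, d) / m ** 0.5           # random measurement matrix
g = torch.zeros(d)
g[:s] = torch.randn(s)                     # a sparse "gradient"
y = A @ g                                  # worker sends m numbers, not d
g_hat = iht_recover(y, A, s)               # server-side reconstruction
```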
arXiv Detail & Related papers (2021-12-15T02:10:45Z)
- Communication-Efficient Federated Learning via Quantized Compressed Sensing [82.10695943017907]
The presented framework consists of gradient compression for wireless devices and gradient reconstruction for a parameter server.
Thanks to gradient sparsification and quantization, our strategy can achieve a higher compression ratio than one-bit gradient compression.
We demonstrate that the framework achieves almost identical performance with the case that performs no compression.
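A toy uniform scalar quantizer of the kind that could be applied to compressed measurements before transmission; the paper's quantizer design and reconstruction are more sophisticated.

```python
import torch

def quantize(x: torch.Tensor, bits: int = 2):
    """Uniform scalar quantizer -- a toy stand-in for the quantization stage
    of a quantized compressed sensing pipeline."""
    levels = 2 ** bits
    lo, hi = x.min(), x.max()
    scale = (hi - lo).clamp_min(1e-12)             # avoid divide-by-zero
    q = torch.round((x - lo) / scale * (levels - 1))
    return q.to(torch.uint8), lo, hi               # send q plus two floats

def dequantize(q: torch.Tensor, lo, hi, bits: int = 2):
    levels = 2 ** bits
    return q.float() / (levels - 1) * (hi - lo) + lo
```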
arXiv Detail & Related papers (2021-11-30T02:13:54Z)
- An Efficient Statistical-based Gradient Compression Technique for Distributed Training Systems [77.88178159830905]
Sparsity-Inducing Distribution-based Compression (SIDCo) is a threshold-based sparsification scheme that enjoys similar threshold estimation quality to deep gradient compression (DGC).
Our evaluation shows SIDCo speeds up training by up to 41.7%, 7.6%, and 1.9% compared to the no-compression baseline, Top-k, and DGC compressors, respectively.
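The central trick, deriving a top-k threshold from a fitted distribution instead of sorting all gradients, can be sketched with a single-stage exponential fit (SIDCo fits multiple sparsity-inducing distributions in stages).

```python
import torch

def exp_fit_threshold(grad: torch.Tensor, keep_ratio: float = 0.001) -> torch.Tensor:
    """Estimate a sparsification threshold by fitting an exponential
    distribution to |g| -- a simplified single-stage sketch of
    statistical threshold estimation, not SIDCo's full procedure."""
    mean_abs = grad.abs().mean()           # MLE for the exponential scale
    # Solve P(|g| > t) = exp(-t / mean_abs) = keep_ratio for t.
    return -mean_abs * torch.log(torch.tensor(keep_ratio))

g = torch.randn(1_000_000) * 1e-3
t = exp_fit_threshold(g, keep_ratio=0.001)
mask = g.abs() > t                         # threshold without a full sort
# Note: for Gaussian-like gradients the exponential fit is only approximate,
# so the realized keep ratio deviates from the target.
```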
arXiv Detail & Related papers (2021-01-26T13:06:00Z)
- Sparse Communication for Training Deep Networks [56.441077560085475]
Synchronous stochastic gradient descent (SGD) is the most common method used for distributed training of deep learning models.
In this algorithm, each worker shares its local gradients with others and updates the parameters using the average gradients of all workers.
We study several compression schemes and identify how three key parameters affect the performance.
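The uncompressed baseline being studied looks roughly like this in PyTorch: each worker sums gradients across peers with an all-reduce and divides by the world size before the optimizer step (a generic sketch, assuming torch.distributed is already initialized).

```python
import torch
import torch.distributed as dist

def average_gradients(model: torch.nn.Module):
    """Synchronous SGD step: average local gradients across all workers
    before the optimizer update."""
    world = dist.get_world_size()
    for p in model.parameters():
        if p.grad is not None:
            dist.all_reduce(p.grad, op=dist.ReduceOp.SUM)
            p.grad /= world
```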
arXiv Detail & Related papers (2020-09-19T17:28:11Z)