Condensing Graphs via One-Step Gradient Matching
- URL: http://arxiv.org/abs/2206.07746v1
- Date: Wed, 15 Jun 2022 18:20:01 GMT
- Title: Condensing Graphs via One-Step Gradient Matching
- Authors: Wei Jin, Xianfeng Tang, Haoming Jiang, Zheng Li, Danqing Zhang,
Jiliang Tang, Bing Yin
- Abstract summary: We propose a one-step gradient matching scheme, which performs gradient matching for only one single step without training the network weights.
Our theoretical analysis shows this strategy can generate synthetic graphs that lead to lower classification loss on real graphs.
In particular, we are able to reduce the dataset size by 90% while approximating up to 98% of the original performance.
- Score: 50.07587238142548
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As training deep learning models on large datasets takes a lot of time and
resources, it is desirable to construct a small synthetic dataset with which we
can train deep learning models sufficiently. There are recent works that have
explored solutions on condensing image datasets through complex bi-level
optimization. For instance, dataset condensation (DC) matches network gradients
w.r.t. large-real data and small-synthetic data, where the network weights are
optimized for multiple steps at each outer iteration. However, existing
approaches have their inherent limitations: (1) they are not directly
applicable to graphs where the data is discrete; and (2) the condensation
process is computationally expensive due to the involved nested optimization.
To bridge the gap, we investigate efficient dataset condensation tailored for
graph datasets where we model the discrete graph structure as a probabilistic
model. We further propose a one-step gradient matching scheme, which performs
gradient matching for only one single step without training the network
weights. Our theoretical analysis shows this strategy can generate synthetic
graphs that lead to lower classification loss on real graphs. Extensive
experiments on various graph datasets demonstrate the effectiveness and
efficiency of the proposed method. In particular, we are able to reduce the
dataset size by 90% while approximating up to 98% of the original performance,
and our method is significantly faster than multi-step gradient matching (e.g.,
15x faster on CIFAR10 for synthesizing 500 graphs).
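The abstract's two ingredients, a probabilistic (Bernoulli-style) model of the discrete synthetic adjacency and gradient matching performed for a single step on freshly initialized weights, can be illustrated with a minimal PyTorch sketch. Everything below (the tiny GCN, the cosine gradient distance, the sizes) is an illustrative assumption, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F

class TinyGCN(torch.nn.Module):
    """Small dense GCN used only to produce gradients for matching."""
    def __init__(self, in_dim, hid_dim, n_classes):
        super().__init__()
        self.lin1 = torch.nn.Linear(in_dim, hid_dim)
        self.lin2 = torch.nn.Linear(hid_dim, n_classes)

    def forward(self, x, adj):
        # Symmetrically normalized adjacency with self-loops.
        a = adj + torch.eye(adj.size(0), device=adj.device)
        d = a.sum(1).clamp(min=1e-6).pow(-0.5)
        a = d.unsqueeze(1) * a * d.unsqueeze(0)
        h = F.relu(a @ self.lin1(x))
        return a @ self.lin2(h)

def grad_distance(gs, gr):
    # Layer-wise cosine distance between synthetic-data and real-data gradients.
    return sum(1 - F.cosine_similarity(a.flatten(), b.flatten(), dim=0)
               for a, b in zip(gs, gr))

# Learnable synthetic graph: node features plus Bernoulli edge parameters (logits).
n_syn, in_dim, hid_dim, n_classes = 50, 64, 32, 4
x_syn = torch.randn(n_syn, in_dim, requires_grad=True)
edge_logits = torch.zeros(n_syn, n_syn, requires_grad=True)
y_syn = torch.randint(0, n_classes, (n_syn,))
opt = torch.optim.Adam([x_syn, edge_logits], lr=0.01)

# Placeholders for the real graph; in practice these come from the original dataset.
x_real = torch.randn(500, in_dim)
adj_real = (torch.rand(500, 500) < 0.01).float()
y_real = torch.randint(0, n_classes, (500,))

for _ in range(100):
    model = TinyGCN(in_dim, hid_dim, n_classes)     # fresh random weights every iteration
    params = list(model.parameters())

    # One-step matching: gradients are taken at the initial weights only;
    # there is no inner loop that trains the network.
    g_real = torch.autograd.grad(
        F.cross_entropy(model(x_real, adj_real), y_real), params)
    adj_syn = torch.sigmoid(edge_logits)            # relaxed Bernoulli edge probabilities
    adj_syn = (adj_syn + adj_syn.t()) / 2           # keep the synthetic graph undirected
    g_syn = torch.autograd.grad(
        F.cross_entropy(model(x_syn, adj_syn), y_syn), params, create_graph=True)

    loss = grad_distance(g_syn, g_real)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The sigmoid here is just one way to keep the Bernoulli edge parameters differentiable; sampling discrete edges from the learned probabilities would be the natural way to produce the final condensed graph.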
Related papers
- Teddy: Efficient Large-Scale Dataset Distillation via Taylor-Approximated Matching [74.75248610868685]
Teddy is a Taylor-approximated dataset distillation framework designed to handle large-scale datasets.
Teddy attains state-of-the-art efficiency and performance on the Tiny-ImageNet and original-sized ImageNet-1K datasets.
arXiv Detail & Related papers (2024-10-10T03:28:46Z)
- Improved Distribution Matching for Dataset Condensation [91.55972945798531]
We propose a novel dataset condensation method based on distribution matching.
Our simple yet effective method outperforms most previous optimization-oriented methods with much fewer computational resources.
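As a rough, hedged sketch of the distribution-matching idea (matching class-wise feature statistics of real and synthetic data under randomly initialized embedders, rather than matching gradients), here is a toy version; the embedder, sizes, and loss are illustrative assumptions, not the paper's setup.

```python
import torch

def dm_loss(embed, x_real, y_real, x_syn, y_syn, n_classes):
    # Match per-class mean embeddings of real and synthetic samples.
    loss = 0.0
    for c in range(n_classes):
        mu_real = embed(x_real[y_real == c]).mean(0)
        mu_syn = embed(x_syn[y_syn == c]).mean(0)
        loss = loss + ((mu_real - mu_syn) ** 2).sum()
    return loss

x_syn = torch.randn(40, 32, requires_grad=True)           # learnable synthetic features
y_syn = torch.arange(4).repeat_interleave(10)             # balanced synthetic labels
opt = torch.optim.Adam([x_syn], lr=0.01)

x_real, y_real = torch.randn(1000, 32), torch.randint(0, 4, (1000,))
for _ in range(200):
    embed = torch.nn.Sequential(torch.nn.Linear(32, 64), torch.nn.ReLU(),
                                torch.nn.Linear(64, 64))  # fresh random embedder each step
    opt.zero_grad()
    dm_loss(embed, x_real, y_real, x_syn, y_syn, 4).backward()
    opt.step()
```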
arXiv Detail & Related papers (2023-07-19T04:07:33Z)
- Efficiently Learning the Graph for Semi-supervised Learning [4.518012967046983]
We show how to learn the best graphs from the sparse families efficiently using the conjugate gradient method.
Our approach can also be used to learn the graph efficiently online with sub-linear regret, under mild smoothness assumptions.
We implement our approach and demonstrate significant (~10-100x) speedups over prior work on semi-supervised learning with learned graphs on benchmark datasets.
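The role of conjugate gradient can be loosely illustrated as follows (a generic graph-SSL sketch under assumed details, not the paper's algorithm): for each candidate graph in a simple parametric family, label inference reduces to a sparse Laplacian linear system that CG solves cheaply, so many candidate graphs can be scored quickly.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.csgraph import laplacian
from scipy.sparse.linalg import cg
from sklearn.neighbors import kneighbors_graph

def ssl_scores(X, y_obs, labeled_mask, sigma, lam=1.0, k=10):
    """Regularized harmonic solution on a kNN graph with RBF weights of bandwidth sigma."""
    D = kneighbors_graph(X, k, mode="distance", include_self=False)
    W = D.copy()
    W.data = np.exp(-(W.data ** 2) / (2 * sigma ** 2))
    W = (W + W.T) / 2                                   # symmetrize the graph
    L = laplacian(W)
    A = (L + lam * sp.diags(labeled_mask)).tocsr()      # sparse system matrix
    f, _ = cg(A, lam * labeled_mask * y_obs)            # conjugate gradient solve
    return f

# Toy selection over a small family of bandwidths, scored on validation labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = (X[:, 0] > 0).astype(float)
labeled_mask = np.zeros(300)
labeled_mask[:30] = 1.0
val_idx = np.arange(30, 60)

best_acc, best_sigma = -1.0, None
for sigma in [0.5, 1.0, 2.0, 4.0]:
    f = ssl_scores(X, y * labeled_mask, labeled_mask, sigma)
    acc = ((f[val_idx] > 0.5) == (y[val_idx] > 0.5)).mean()
    if acc > best_acc:
        best_acc, best_sigma = acc, sigma
print("selected bandwidth:", best_sigma, "validation accuracy:", best_acc)
```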
arXiv Detail & Related papers (2023-06-12T13:22:06Z)
- Learnable Graph Matching: A Practical Paradigm for Data Association [74.28753343714858]
We propose a general learnable graph matching method to address these issues.
Our method achieves state-of-the-art performance on several MOT datasets.
For image matching, our method outperforms state-of-the-art methods on a popular indoor dataset, ScanNet.
arXiv Detail & Related papers (2023-03-27T17:39:00Z)
- Delving into Effective Gradient Matching for Dataset Condensation [13.75957901381024]
The gradient matching method directly targets the training dynamics by matching the gradients obtained when training on the original and synthetic datasets.
We propose to match the multi-level gradients to involve both intra-class and inter-class gradient information.
An overfitting-aware adaptive learning step strategy is also proposed to trim unnecessary optimization steps for algorithmic efficiency improvement.
arXiv Detail & Related papers (2022-07-30T21:31:10Z)
- Optimal Propagation for Graph Neural Networks [51.08426265813481]
We propose a bi-level optimization approach for learning the optimal graph structure.
We also explore a low-rank approximation model for further reducing the time complexity.
arXiv Detail & Related papers (2022-05-06T03:37:00Z)
- Convergent Boosted Smoothing for Modeling Graph Data with Tabular Node Features [46.052312251801]
We propose a framework for iterating boosting with graph propagation steps.
Our approach is anchored in a principled meta loss function.
Across a variety of non-iid graph datasets, our method achieves comparable or superior performance.
arXiv Detail & Related papers (2021-10-26T04:53:12Z)
- Learnable Graph Matching: Incorporating Graph Partitioning with Deep Feature Learning for Multiple Object Tracking [58.30147362745852]
Data association across frames is at the core of the Multiple Object Tracking (MOT) task.
Existing methods mostly ignore the context information among tracklets and intra-frame detections.
We propose a novel learnable graph matching method to address these issues.
arXiv Detail & Related papers (2021-03-30T08:58:45Z)
- Quantizing data for distributed learning [24.46948464551684]
We consider machine learning applications that train a model by leveraging data over a network, where communication constraints can create a performance bottleneck.
A number of recent approaches propose to overcome this bottleneck through compression of updates, but as models become larger, so does the size of the dataset.
In this paper, we propose an approach that quantizes data instead of gradient updates, and that can support a broad range of learning applications.
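A generic sketch of the data-quantization idea (hedged: a plain uniform quantizer for illustration, not the specific scheme proposed in the paper): each worker quantizes its raw features to a few bits per entry and ships the codes, rather than compressing gradient updates.

```python
import numpy as np

def quantize(X, bits=4):
    """Per-feature uniform quantization to 2**bits levels; the codes are what gets transmitted."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    scale = np.where(hi > lo, (hi - lo) / (2 ** bits - 1), 1.0)
    codes = np.round((X - lo) / scale).astype(np.uint8)
    return codes, lo, scale

def dequantize(codes, lo, scale):
    return codes.astype(np.float64) * scale + lo

# A worker compresses its local data once; the server trains on the reconstruction.
X_local = np.random.randn(1000, 20)
codes, lo, scale = quantize(X_local, bits=4)
X_hat = dequantize(codes, lo, scale)
print("bytes sent:", codes.nbytes, "vs raw:", X_local.nbytes)
print("max reconstruction error:", np.abs(X_hat - X_local).max())
```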
arXiv Detail & Related papers (2020-12-14T19:54:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.