Learning to Learn Weight Generation via Local Consistency Diffusion
- URL: http://arxiv.org/abs/2502.01117v3
- Date: Mon, 19 May 2025 05:44:58 GMT
- Title: Learning to Learn Weight Generation via Local Consistency Diffusion
- Authors: Yunchuan Guan, Yu Liu, Ke Zhou, Zhiqi Shen, Jenq-Neng Hwang, Lei Li,
- Abstract summary: We propose Mc-Di, which integrates the diffusion algorithm with meta-learning for better generalizability.<n>Our theory and experiments demonstrate that it can learn from local targets while maintaining consistency with the global optima.<n>We validate Mc-Di's superior accuracy and inference efficiency in tasks that require frequent weight updates.
- Score: 30.945131775280792
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diffusion-based algorithms have emerged as promising techniques for weight generation. However, existing solutions are limited by two challenges: generalizability and local target assignment. The former arises from the inherent lack of cross-task transferability in existing single-level optimization methods, limiting the model's performance on new tasks. The latter lies in existing research modeling only global optimal weights, neglecting the supervision signals in local target weights. Moreover, naively assigning local target weights causes local-global inconsistency. To address these issues, we propose Mc-Di, which integrates the diffusion algorithm with meta-learning for better generalizability. Furthermore, we extend the vanilla diffusion into a local consistency diffusion algorithm. Our theory and experiments demonstrate that it can learn from local targets while maintaining consistency with the global optima. We validate Mc-Di's superior accuracy and inference efficiency in tasks that require frequent weight updates, including transfer learning, few-shot learning, domain generalization, and large language model adaptation.
Related papers
- Morphology-optimized Multi-Scale Fusion: Combining Local Artifacts and Mesoscopic Semantics for Deepfake Detection and Localization [30.871239863769404]
A common strategy is to incorporate forged region annotations during model training alongside manipulated images.<n>We propose a novel approach that independently predicts manipulated regions using both local and global perspectives.
arXiv Detail & Related papers (2025-09-17T07:46:07Z) - A Novel Local Focusing Mechanism for Deepfake Detection Generalization [10.223643897131192]
Deepfake generation techniques have intensified the need for robust and generalizable detection methods.<n>We propose a novel Local Focus Mechanism (LFM) that explicitly attends to discriminative local features for differentiating fake from real images.<n>LFM achieves a 3.7 improvement in accuracy and a 2.8 increase in average precision over the state-of-the-art Neighboring Pixel Relationships (NPR) method.
arXiv Detail & Related papers (2025-08-23T14:06:30Z) - DiP-GO: A Diffusion Pruner via Few-step Gradient Optimization [22.546989373687655]
We propose a novel pruning method that derives an efficient diffusion model via a more intelligent and differentiable pruner.
Our approach achieves 4.4 x speedup for SD-1.5 without any loss of accuracy, significantly outperforming the previous state-of-the-art methods.
arXiv Detail & Related papers (2024-10-22T12:18:24Z) - Interactive incremental learning of generalizable skills with local trajectory modulation [14.416251854298409]
We propose an interactive imitation learning framework that simultaneously leverages local and global modulations of trajectory distributions.<n>Our approach exploits the concept of via-points to incrementally and interactively 1) improve the model accuracy locally, 2) add new objects to the task during execution and 3) extend the skill into regions where demonstrations were not provided.
arXiv Detail & Related papers (2024-09-09T14:22:19Z) - DiffSG: A Generative Solver for Network Optimization with Diffusion Model [75.27274046562806]
Generative diffusion models are popular in various cross-domain applications.<n>These models hold promise in tackling complex network optimization problems.<n>We propose a new framework for generative diffusion models called Diffusion Model-based Solution Generation.
arXiv Detail & Related papers (2024-08-13T07:56:21Z) - Understanding Reinforcement Learning-Based Fine-Tuning of Diffusion Models: A Tutorial and Review [63.31328039424469]
This tutorial provides a comprehensive survey of methods for fine-tuning diffusion models to optimize downstream reward functions.
We explain the application of various RL algorithms, including PPO, differentiable optimization, reward-weighted MLE, value-weighted sampling, and path consistency learning.
arXiv Detail & Related papers (2024-07-18T17:35:32Z) - Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z) - Training Artificial Neural Networks by Coordinate Search Algorithm [0.20971479389679332]
We propose an efficient version of the gradient-free Coordinate Search (CS) algorithm for training neural networks.
The proposed algorithm can be used with non-differentiable activation functions and tailored to multi-objective/multi-loss problems.
Finding the optimal values for weights of ANNs is a large-scale optimization problem.
arXiv Detail & Related papers (2024-02-20T01:47:25Z) - Universal Neural Functionals [67.80283995795985]
A challenging problem in many modern machine learning tasks is to process weight-space features.
Recent works have developed promising weight-space models that are equivariant to the permutation symmetries of simple feedforward networks.
This work proposes an algorithm that automatically constructs permutation equivariant models for any weight space.
arXiv Detail & Related papers (2024-02-07T20:12:27Z) - Post-Training Quantization for Re-parameterization via Coarse & Fine
Weight Splitting [13.270381125055275]
We propose a coarse & fine weight splitting (CFWS) method to reduce quantization error of weight.
We develop an improved KL metric to determine optimal quantization scales for activation.
For example, the quantized RepVGG-A1 model exhibits a mere 0.3% accuracy loss.
arXiv Detail & Related papers (2023-12-17T02:31:20Z) - NUPES : Non-Uniform Post-Training Quantization via Power Exponent Search [7.971065005161565]
quantization is a technique to convert floating point representations to low bit-width fixed point representations.
We show how to learn new quantized weights over the entire quantized space.
We show the ability of the method to achieve state-of-the-art compression rates in both, data-free and data-driven configurations.
arXiv Detail & Related papers (2023-08-10T14:19:58Z) - Recursive Euclidean Distance Based Robust Aggregation Technique For
Federated Learning [4.848016645393023]
Federated learning is a solution to data availability and privacy challenges in machine learning.
Malicious users aim to sabotage the collaborative learning process by training the local model with malicious data.
We propose a novel robust aggregation approach based on Euclidean distance calculation.
arXiv Detail & Related papers (2023-03-20T06:48:43Z) - Learning to Optimize Permutation Flow Shop Scheduling via Graph-based
Imitation Learning [70.65666982566655]
Permutation flow shop scheduling (PFSS) is widely used in manufacturing systems.
We propose to train the model via expert-driven imitation learning, which accelerates convergence more stably and accurately.
Our model's network parameters are reduced to only 37% of theirs, and the solution gap of our model towards the expert solutions decreases from 6.8% to 1.3% on average.
arXiv Detail & Related papers (2022-10-31T09:46:26Z) - Deep face recognition with clustering based domain adaptation [57.29464116557734]
We propose a new clustering-based domain adaptation method designed for face recognition task in which the source and target domain do not share any classes.
Our method effectively learns the discriminative target feature by aligning the feature domain globally, and, at the meantime, distinguishing the target clusters locally.
arXiv Detail & Related papers (2022-05-27T12:29:11Z) - Global-Local Regularization Via Distributional Robustness [26.983769514262736]
Deep neural networks are often vulnerable to adversarial examples and distribution shifts.
Recent approaches leverage distributional robustness optimization (DRO) to find the most challenging distribution.
We propose a novel regularization technique, following the veins of Wasserstein-based DRO framework.
arXiv Detail & Related papers (2022-03-01T15:36:12Z) - Region-Based Semantic Factorization in GANs [67.90498535507106]
We present a highly efficient algorithm to factorize the latent semantics learned by Generative Adversarial Networks (GANs) concerning an arbitrary image region.
Through an appropriately defined generalized Rayleigh quotient, we solve such a problem without any annotations or training.
Experimental results on various state-of-the-art GAN models demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2022-02-19T17:46:02Z) - HarmoFL: Harmonizing Local and Global Drifts in Federated Learning on
Heterogeneous Medical Images [19.62267284815759]
We introduce a new framework called HarmoFL to handle both local and global drifts.
HarmoFL mitigates the local update drift by normalizing amplitudes of images transformed into the frequency domain.
We show that HarmoFL outperforms a set of recent state-of-the-art methods with promising convergence behavior.
arXiv Detail & Related papers (2021-12-20T13:25:48Z) - Semi-supervised Domain Adaptive Structure Learning [72.01544419893628]
Semi-supervised domain adaptation (SSDA) is a challenging problem requiring methods to overcome both 1) overfitting towards poorly annotated data and 2) distribution shift across domains.
We introduce an adaptive structure learning method to regularize the cooperation of SSL and DA.
arXiv Detail & Related papers (2021-12-12T06:11:16Z) - Coarse to Fine: Domain Adaptive Crowd Counting via Adversarial Scoring
Network [58.05473757538834]
This paper proposes a novel adversarial scoring network (ASNet) to bridge the gap across domains from coarse to fine granularity.
Three sets of migration experiments show that the proposed methods achieve state-of-the-art counting performance.
arXiv Detail & Related papers (2021-07-27T14:47:24Z) - Effective Model Sparsification by Scheduled Grow-and-Prune Methods [73.03533268740605]
We propose a novel scheduled grow-and-prune (GaP) methodology without pre-training the dense models.
Experiments have shown that such models can match or beat the quality of highly optimized dense models at 80% sparsity on a variety of tasks.
arXiv Detail & Related papers (2021-06-18T01:03:13Z) - Towards Accurate Knowledge Transfer via Target-awareness Representation
Disentanglement [56.40587594647692]
We propose a novel transfer learning algorithm, introducing the idea of Target-awareness REpresentation Disentanglement (TRED)
TRED disentangles the relevant knowledge with respect to the target task from the original source model and used as a regularizer during fine-tuning the target model.
Experiments on various real world datasets show that our method stably improves the standard fine-tuning by more than 2% in average.
arXiv Detail & Related papers (2020-10-16T17:45:08Z) - Communication-Efficient Distributed Stochastic AUC Maximization with
Deep Neural Networks [50.42141893913188]
We study a distributed variable for large-scale AUC for a neural network as with a deep neural network.
Our model requires a much less number of communication rounds and still a number of communication rounds in theory.
Our experiments on several datasets show the effectiveness of our theory and also confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.