Related papers: Merge-Friendly Post-Training Quantization for Multi-Target Domain Adaptation

URL: http://arxiv.org/abs/2505.23651v1
Date: Thu, 29 May 2025 17:00:56 GMT
Title: Merge-Friendly Post-Training Quantization for Multi-Target Domain Adaptation
Authors: Juncheol Shin, Minsang Seok, Seonggon Kim, Eunhyeok Park
Abstract summary: In this study, we analyze the impact of quantization on model merging through the lens of error barriers. We propose a novel post-training quantization, HDRQ - Hessian and distant regularizing quantization, that is designed to consider model merging for multi-target domain adaptation. Our approach ensures that the quantization process incurs minimal deviation from the source pre-trained model while flattening the loss surface to facilitate smooth model merging.
Score: 7.193483612237862
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Model merging has emerged as a powerful technique for combining task-specific weights, achieving superior performance in multi-target domain adaptation. However, when applied to practical scenarios, such as quantized models, new challenges arise. In practical scenarios, quantization is often applied to target-specific data, but this process restricts the domain of interest and introduces discretization effects, making model merging highly non-trivial. In this study, we analyze the impact of quantization on model merging through the lens of error barriers. Leveraging these insights, we propose a novel post-training quantization, HDRQ - Hessian and distant regularizing quantization - that is designed to consider model merging for multi-target domain adaptation. Our approach ensures that the quantization process incurs minimal deviation from the source pre-trained model while flattening the loss surface to facilitate smooth model merging. To our knowledge, this is the first study on this challenge, and extensive experiments confirm its effectiveness.
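The "error barrier" view mentioned in the abstract can be illustrated with a small sketch. The code below is not HDRQ and is not taken from the paper. It assumes the commonly used definition of a barrier along a linear interpolation path (the largest gap between the loss of the interpolated weights and the interpolated endpoint losses), and it uses a plain uniform rounding quantizer as a stand-in for post-training quantization. The names `error_barrier`, `quantize`, `double_well`, and `sum_of_squares` are hypothetical and exist only for this illustration.

```python
from typing import Callable, List, Sequence

Loss = Callable[[Sequence[float]], float]


def interpolate(a: Sequence[float], b: Sequence[float], t: float) -> List[float]:
    """Linear merge of two weight vectors: (1 - t) * a + t * b."""
    return [(1 - t) * x + t * y for x, y in zip(a, b)]


def error_barrier(loss: Loss, a: Sequence[float], b: Sequence[float],
                  steps: int = 11) -> float:
    """Largest excess loss along the straight path between a and b.

    Assumed definition (not confirmed by the source):
    max over t of loss((1-t)a + tb) - ((1-t)loss(a) + t*loss(b)).
    """
    loss_a, loss_b = loss(a), loss(b)
    worst = 0.0
    for i in range(steps):
        t = i / (steps - 1)
        baseline = (1 - t) * loss_a + t * loss_b
        worst = max(worst, loss(interpolate(a, b, t)) - baseline)
    return worst


def quantize(weights: Sequence[float], step: float) -> List[float]:
    """Toy uniform quantizer: snap each weight to a multiple of `step`."""
    return [round(w / step) * step for w in weights]


def sum_of_squares(w: Sequence[float]) -> float:
    """Convex toy loss: merging never costs extra, so the barrier is zero."""
    return sum(x * x for x in w)


def double_well(w: Sequence[float]) -> float:
    """Non-convex toy loss with minima at w[0] = -1 and w[0] = +1."""
    return min((w[0] - 1.0) ** 2, (w[0] + 1.0) ** 2)


if __name__ == "__main__":
    # Two "fine-tuned" toy models sitting in different minima.
    model_a, model_b = [-1.0], [1.0]
    print("barrier (double well):", error_barrier(double_well, model_a, model_b))
    print("barrier (convex):", error_barrier(sum_of_squares, [1.0, 0.0], [0.0, 1.0]))
    # Quantizing the endpoints moves them on a grid before merging.
    q_a, q_b = quantize([-0.9], 0.5), quantize([1.1], 0.5)
    print("barrier after quantization:", error_barrier(double_well, q_a, q_b))
```

A low barrier means the averaged weights perform about as well as the endpoints, which is the property the abstract describes as "smooth model merging". The toy losses here only demonstrate the measurement and say nothing about the paper's experimental results.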

Related papers

Quantization-Aware Collaborative Inference for Large Embodied AI Models [67.66340659245186]
Large artificial intelligence models (LAIMs) are increasingly regarded as a core intelligence engine for embodied AI applications. To address this issue, we investigate quantization-aware collaborative inference (co-inference) for embodied AI systems.
arXiv Detail & Related papers (2026-02-13T16:08:19Z)
LSGQuant: Layer-Sensitivity Guided Quantization for One-Step Diffusion Real-World Video Super-Resolution [52.627063566555194]
We introduce LSGQuant, a layer-sensitivity guided quantization approach for one-step diffusion-based real-world VSR. Our method incorporates a Dynamic Range Adaptive Quantizer (DRAQ) to fit video token activations. Our method performs nearly on par with the original full-precision model and significantly exceeds existing quantization techniques.
arXiv Detail & Related papers (2026-02-03T06:53:19Z)
Post-Training Non-Uniform Quantization for Convolutional Neural Networks [0.0]
Quantization is a technique that aims to alleviate large storage requirements and speed up the inference process. In this paper, we introduce a novel post-training quantization method for model weights. Our method finds optimal clipping thresholds and scaling factors along with mathematical guarantees that our method minimizes quantization noise.
arXiv Detail & Related papers (2024-12-10T10:33:58Z)
QT-DoG: Quantization-aware Training for Domain Generalization [58.439816306817306]
We propose Quantization-aware Training for Domain Generalization (QT-DoG). QT-DoG exploits quantization as an implicit regularizer by inducing noise in model weights. We demonstrate that QT-DoG generalizes across various datasets, architectures, and quantization algorithms.
arXiv Detail & Related papers (2024-10-08T13:21:48Z)
Investigating the Impact of Quantization on Adversarial Robustness [22.637585106574722]
Quantization is a technique for reducing the bit-width of deep models to improve their runtime performance and storage efficiency. In real-world scenarios, quantized models are often faced with adversarial attacks which cause the model to make incorrect inferences. We conduct a first-time analysis of the impact of the quantization pipeline components that can incorporate robust optimization.
arXiv Detail & Related papers (2024-04-08T16:20:15Z)
QuEST: Low-bit Diffusion Model Quantization via Efficient Selective Finetuning [52.157939524815866]
In this paper, we identify imbalanced activation distributions as a primary source of quantization difficulty. We propose to adjust these distributions through weight finetuning to be more quantization-friendly. Our method demonstrates its efficacy across three high-resolution image generation tasks.
arXiv Detail & Related papers (2024-02-06T03:39:44Z)
AdaMerging: Adaptive Model Merging for Multi-Task Learning [68.75885518081357]
This paper introduces an innovative technique called Adaptive Model Merging (AdaMerging). It aims to autonomously learn the coefficients for model merging, either in a task-wise or layer-wise manner, without relying on the original training data. Compared to the current state-of-the-art task arithmetic merging scheme, AdaMerging showcases a remarkable 11% improvement in performance.
arXiv Detail & Related papers (2023-10-04T04:26:33Z)
Phasic Content Fusing Diffusion Model with Directional Distribution Consistency for Few-Shot Model Adaption [73.98706049140098]
We propose a novel phasic content fusing few-shot diffusion model with directional distribution consistency loss. Specifically, we design a phasic training strategy with phasic content fusion to help our model learn content and style information when t is large. Finally, we propose a cross-domain structure guidance strategy that enhances structure consistency during domain adaptation.
arXiv Detail & Related papers (2023-09-07T14:14:11Z)
PreQuant: A Task-agnostic Quantization Approach for Pre-trained Language Models [52.09865918265002]
We propose a novel "quantize before fine-tuning" framework, PreQuant. PreQuant is compatible with various quantization strategies, with outlier-aware fine-tuning incorporated to correct the induced quantization error. We demonstrate the effectiveness of PreQuant on the GLUE benchmark using BERT, RoBERTa, and T5.
arXiv Detail & Related papers (2023-05-30T08:41:33Z)
When to Update Your Model: Constrained Model-based Reinforcement Learning [50.74369835934703]
We propose a novel and general theoretical scheme for a non-decreasing performance guarantee of model-based RL (MBRL). Our follow-up derived bounds reveal the relationship between model shifts and performance improvement. A further example demonstrates that learning models from a dynamically-varying number of explorations benefits the eventual returns.
arXiv Detail & Related papers (2022-10-15T17:57:43Z)
Toward Certified Robustness Against Real-World Distribution Shifts [65.66374339500025]
We train a generative model to learn perturbations from data and define specifications with respect to the output of the learned model. A unique challenge arising from this setting is that existing verifiers cannot tightly approximate sigmoid activations. We propose a general meta-algorithm for handling sigmoid activations which leverages classical notions of counter-example-guided abstraction refinement.
arXiv Detail & Related papers (2022-06-08T04:09:13Z)
Learning Multi-Task Gaussian Process Over Heterogeneous Input Domains [27.197576157695096]
Multi-task Gaussian process (MTGP) is a well-known non-parametric Bayesian model for learning correlated tasks. This paper presents a novel heterogeneous variational linear model of coregionalization (HSVLMC) for simultaneously learning the tasks with varied input domains.
arXiv Detail & Related papers (2022-02-25T11:55:09Z)
Scalable Multi-Task Gaussian Processes with Neural Embedding of Coregionalization [9.873139480223367]
Multi-task regression attempts to exploit the task similarity in order to achieve knowledge transfer across related tasks for performance improvement. The linear model of coregionalization (LMC) is a well-known MTGP paradigm which exploits the dependency of tasks through linear combination of several independent and diverse GPs. We develop the neural embedding of coregionalization that transforms the latent GPs into a high-dimensional latent space to induce rich yet diverse behaviors.
arXiv Detail & Related papers (2021-09-20T01:28:14Z)
Zero-shot Adversarial Quantization [11.722728148523366]
We propose a zero-shot adversarial quantization (ZAQ) framework, facilitating effective discrepancy estimation and knowledge transfer. This is achieved by a novel two-level discrepancy modeling to drive a generator to synthesize informative and diverse data examples. We conduct extensive experiments on three fundamental vision tasks, demonstrating the superiority of ZAQ over the strong zero-shot baselines.
arXiv Detail & Related papers (2021-03-29T01:33:34Z)
Flexible Model Aggregation for Quantile Regression [92.63075261170302]
Quantile regression is a fundamental problem in statistical learning motivated by a need to quantify uncertainty in predictions. We investigate methods for aggregating any number of conditional quantile models. All of the models we consider in this paper can be fit using modern deep learning toolkits.
arXiv Detail & Related papers (2021-02-26T23:21:16Z)

This list is automatically generated from the titles and abstracts of the papers in this site.