Related papers: GAPrune: Gradient-Alignment Pruning for Domain-Aware Embeddings

URL: http://arxiv.org/abs/2509.10844v1
Date: Sat, 13 Sep 2025 15:03:37 GMT
Title: GAPrune: Gradient-Alignment Pruning for Domain-Aware Embeddings
Authors: Yixuan Tang, Yi Yang
Abstract summary: Domain-specific embedding models have shown promise for applications that require specialized semantic understanding. Model compression through pruning offers a promising solution, but existing pruning methods treat all parameters uniformly. We propose GAPrune, a pruning framework that considers both domain importance and preserving general linguistic foundation.
Score: 12.949322198287417
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Domain-specific embedding models have shown promise for applications that require specialized semantic understanding, such as coding agents and financial retrieval systems, often achieving higher performance gains than general models. However, state-of-the-art embedding models are typically based on LLMs, which contain billions of parameters, making deployment challenging in resource-constrained environments. Model compression through pruning offers a promising solution, but existing pruning methods treat all parameters uniformly, failing to distinguish between general semantic representations and domain-specific patterns, leading to suboptimal pruning decisions. Thus, we propose GAPrune, a pruning framework that addresses this challenge by considering both domain importance and preserving general linguistic foundation. Our method uses Fisher Information to measure importance and general-domain gradient alignment to assess parameter behavior, then combines these signals using our Domain Alignment Importance (DAI) scoring. Lower DAI scores indicate that the parameter is either less important for the domain task or creates conflicts between domain and general objectives. Experiments on two domain benchmarks, FinMTEB and ChemTEB, show that GAPrune maintains performance within 2.5% of dense models in one-shot pruning at 50% sparsity, while outperforming all baselines. With retraining in 100 steps, GAPrune achieves +4.51% improvement on FinMTEB and +1.73% on ChemTEB, demonstrating that our pruning strategy not only preserves but enhances domain-specific capabilities. Our findings demonstrate that principled pruning strategies can achieve model compression and enhanced domain specialization, providing the research community with a new approach for development.
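
The scoring idea in the abstract can be illustrated with a toy sketch. The abstract does not give the DAI formula, so everything below is an assumption made for illustration: the diagonal Fisher estimate (mean of squared per-sample gradients), the sign-agreement stand-in for gradient alignment, the combination rule in `toy_importance_score`, the weight `beta`, and all function names are hypothetical and are not the authors' method.

```python
import numpy as np

def diagonal_fisher(per_sample_grads):
    # Empirical diagonal Fisher estimate: mean of squared per-sample gradients.
    return np.mean(np.square(per_sample_grads), axis=0)

def sign_alignment(domain_grads, general_grads):
    # Stand-in for gradient alignment (assumption): +1 where the mean domain
    # and mean general gradients agree in sign, -1 where they conflict.
    return np.sign(domain_grads.mean(axis=0) * general_grads.mean(axis=0))

def toy_importance_score(fisher_domain, alignment, beta=1.0):
    # Hypothetical combination, NOT the paper's DAI formula: domain importance
    # is scaled up when gradients agree and down when they conflict.
    return fisher_domain * (1.0 + beta * alignment)

def prune_mask(scores, sparsity=0.5):
    # Keep-mask that drops the lowest-scoring fraction of parameters.
    k = int(scores.size * sparsity)
    order = np.argsort(scores, kind="stable")
    mask = np.ones(scores.size, dtype=bool)
    mask[order[:k]] = False
    return mask

# Toy data: 2 samples x 4 parameters of per-sample gradients.
domain_grads = np.array([[1.0, 0.1, 2.0, 0.5],
                         [1.0, 0.1, 2.0, 0.5]])
general_grads = np.array([[1.0, 0.1, -2.0, 0.5],
                          [1.0, 0.1, -2.0, 0.5]])

scores = toy_importance_score(diagonal_fisher(domain_grads),
                              sign_alignment(domain_grads, general_grads))
mask = prune_mask(scores, sparsity=0.5)
```

In this toy setup the third parameter has the largest domain Fisher value but conflicting gradients, so it receives the lowest score and is pruned together with the low-importance second parameter. This mirrors the abstract's statement that low scores mark parameters that are either unimportant for the domain or in conflict with the general objective.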

Related papers

DSP-Reg: Domain-Sensitive Parameter Regularization for Robust Domain Generalization [21.0252973774713]
Domain Generalization is a critical area that focuses on developing models capable of performing well on data from unseen distributions. Existing approaches primarily concentrate on learning domain-invariant features, which assume that a model robust to variations in the source domains will generalize well to unseen target domains. We propose Domain-Sensitive Regularization (DSP-Reg), a principled framework that guides model optimization by a soft regularization technique.
arXiv Detail & Related papers (2026-01-27T09:24:51Z)
Gradient-Guided Annealing for Domain Generalization [5.124256074746721]
Gradient-Guided Annealing (GGA) algorithm is proposed to improve domain generalization effectiveness. The efficacy of GGA is evaluated on five widely accepted and challenging image classification domain generalization benchmarks.
arXiv Detail & Related papers (2025-02-27T15:01:55Z)
Federated Domain Generalization with Data-free On-server Matching Gradient [6.817783565501387]
Domain Generalization (DG) aims to learn from multiple known source domains a model that can generalize well to unknown target domains. In this paper, we introduce a novel approach, dubbed Federated Learning via On-server Matching Gradient (FedOMG), which can efficiently leverage domain information from distributed domains.
arXiv Detail & Related papers (2025-01-24T17:20:22Z)
Exploiting Aggregation and Segregation of Representations for Domain Adaptive Human Pose Estimation [50.31351006532924]
Human pose estimation (HPE) has received increasing attention recently due to its wide application in motion analysis, virtual reality, healthcare, etc. It suffers from the lack of labeled diverse real-world datasets due to the time- and labor-intensive annotation. We introduce a novel framework that capitalizes on both representation aggregation and segregation for domain adaptive human pose estimation.
arXiv Detail & Related papers (2024-12-29T17:59:45Z)
StyDeSty: Min-Max Stylization and Destylization for Single Domain Generalization [85.18995948334592]
Single domain generalization (single DG) aims at learning a robust model generalizable to unseen domains from only one training domain. State-of-the-art approaches have mostly relied on data augmentations, such as adversarial perturbation and style enhancement, to synthesize new data. We propose StyDeSty, which explicitly accounts for the alignment of the source and pseudo domains in the process of data augmentation.
arXiv Detail & Related papers (2024-06-01T02:41:34Z)
DoGE: Domain Reweighting with Generalization Estimation [42.32000165235568]
We propose DOmain reweighting with Generalization Estimation (DoGE). In our experiments, we extensively show how DoGE improves the generalization of the base model to any target data mixture. DoGE can effectively identify inter-domain dependencies, and consistently achieves better test perplexity on the target domain.
arXiv Detail & Related papers (2023-10-23T22:51:58Z)
Domain-Expanded ASTE: Rethinking Generalization in Aspect Sentiment Triplet Extraction [67.54420015049732]
Aspect Sentiment Triplet Extraction (ASTE) is a challenging task in sentiment analysis, aiming to provide fine-grained insights into human sentiments. Existing benchmarks are limited to two domains and do not evaluate model performance on unseen domains. We introduce a domain-expanded benchmark by annotating samples from diverse domains, enabling evaluation of models in both in-domain and out-of-domain settings.
arXiv Detail & Related papers (2023-05-23T18:01:49Z)
Compound Domain Generalization via Meta-Knowledge Encoding [55.22920476224671]
We introduce Style-induced Domain-specific Normalization (SDNorm) to re-normalize the multi-modal underlying distributions. We harness the prototype representations, the centroids of classes, to perform relational modeling in the embedding space. Experiments on four standard Domain Generalization benchmarks reveal that COMEN exceeds the state-of-the-art performance without the need of domain supervision.
arXiv Detail & Related papers (2022-03-24T11:54:59Z)
Domain Generalisation for Object Detection under Covariate and Concept Shift [10.32461766065764]
Domain generalisation aims to promote the learning of domain-invariant features while suppressing domain-specific features. An approach to domain generalisation for object detection is proposed, the first such approach applicable to any object detection architecture.
arXiv Detail & Related papers (2022-03-10T11:14:18Z)
META: Mimicking Embedding via oThers' Aggregation for Generalizable Person Re-identification [68.39849081353704]
Domain generalizable (DG) person re-identification (ReID) aims to test across unseen domains without access to the target domain data at training time. This paper presents a new approach called Mimicking Embedding via oThers' Aggregation (META) for DG ReID.
arXiv Detail & Related papers (2021-12-16T08:06:50Z)
Cluster, Split, Fuse, and Update: Meta-Learning for Open Compound Domain Adaptive Semantic Segmentation [102.42638795864178]
We propose a principled meta-learning based approach to OCDA for semantic segmentation. We cluster target domain into multiple sub-target domains by image styles, extracted in an unsupervised manner. A meta-learner is thereafter deployed to learn to fuse sub-target domain-specific predictions, conditioned upon the style code. We learn to online update the model by model-agnostic meta-learning (MAML) algorithm, thus to further improve generalization.
arXiv Detail & Related papers (2020-12-15T13:21:54Z)

This list is automatically generated from the titles and abstracts of the papers on this site.