Domain-adaptation of spherical embeddings
- URL: http://arxiv.org/abs/2111.00677v1
- Date: Mon, 1 Nov 2021 03:29:36 GMT
- Title: Domain-adaptation of spherical embeddings
- Authors: Mihalis Gongolidis, Jeremy Minton, Ronin Wu, Valentin Stauber, Jason
Hoelscher-Obermaier and Viktor Botev
- Abstract summary: We develop methods to counter the global rotation of the embedding space and propose strategies to update words and documents during domain specific training.
We show that our strategies are able to reduce the performance cost of domain adaptation to a level similar to Word2Vec.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Domain adaptation of embedding models, updating a generic embedding to the
language of a specific domain, is a proven technique for domains that have
insufficient data to train an effective model from scratch. Chemistry
publications are one such domain, where scientific jargon and overloaded
terminology inhibit the performance of a general language model. The recent
spherical embedding model (JoSE) proposed in arXiv:1911.01196 jointly learns
word and document embeddings during training on the multi-dimensional unit
sphere, which performs well for document classification and word correlation
tasks. However, we show that non-convergence caused by global rotations during
its training prevents domain adaptation. In this work, we develop methods
to counter the global rotation of the embedding space and propose strategies to
update words and documents during domain specific training. Two new document
classification data-sets are collated from general and chemistry scientific
journals to compare the proposed update training strategies with benchmark
models. We show that our strategies are able to reduce the performance cost of
domain adaptation to a level similar to Word2Vec.
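To make the rotation problem concrete: if continued training applies an approximately global rotation to the embedding space, the adapted embeddings can be rotated back into the frame of the original model by solving an orthogonal Procrustes problem over words shared by both vocabularies. The sketch below illustrates this idea only; it is not the authors' released implementation, and the NumPy-only interface and matrix shapes are assumptions.

```python
import numpy as np

def undo_global_rotation(E_old, E_new):
    """Align a domain-adapted spherical embedding matrix with the
    original one by solving the orthogonal Procrustes problem.

    E_old, E_new: (n_shared_words, dim) arrays of unit-norm rows for
    the words common to both models. Returns E_new rotated into the
    coordinate frame of E_old; rotations preserve norms, so the
    result stays on the unit sphere.
    """
    # Cross-covariance between the two embedding sets.
    M = E_new.T @ E_old
    # The SVD of M yields the orthogonal R minimising
    # ||E_new @ R - E_old||_F (the Procrustes solution).
    U, _, Vt = np.linalg.svd(M)
    return E_new @ (U @ Vt)
```

In practice the shared rows would be anchor words that occur frequently in both the general and the domain corpus, so that the estimated rotation is not dominated by words whose meaning genuinely shifted.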
Related papers
- StylePrompter: Enhancing Domain Generalization with Test-Time Style Priors [39.695604434738186]
In real-world applications, the sample distribution at the inference stage often differs from the one at the training stage.
This paper introduces the style prompt in the language modality to adapt the trained model dynamically.
In particular, we train a style prompter to extract style information of the current image into an embedding in the token embedding space.
Our open space partition of the style token embedding space and the hand-crafted style regularization enable the trained style prompter to handle data from unknown domains effectively.
arXiv Detail & Related papers (2024-08-17T08:35:43Z) - A Unified Data Augmentation Framework for Low-Resource Multi-Domain Dialogue Generation [52.0964459842176]
Current state-of-the-art dialogue systems heavily rely on extensive training datasets.
We propose a novel data Augmentation framework for Multi-Domain Dialogue Generation, referred to as AMD$^2$G.
The AMD$^2$G framework consists of a data augmentation process and a two-stage training approach: domain-agnostic training and domain adaptation training.
arXiv Detail & Related papers (2024-06-14T09:52:27Z) - AHAM: Adapt, Help, Ask, Model -- Harvesting LLMs for literature mining [3.8384235322772864]
We present the 'AHAM' methodology and a metric that guides the domain-specific adaptation of the BERTopic topic modeling framework.
By utilizing the LLaMa2 generative language model, we generate topic definitions via one-shot learning.
For inter-topic similarity evaluation, we leverage metrics from language generation and translation processes.
arXiv Detail & Related papers (2023-12-25T18:23:03Z) - Improving Domain Generalization with Domain Relations [77.63345406973097]
This paper focuses on domain shifts, which occur when the model is applied to new domains that are different from the ones it was trained on.
We propose a new approach called D$^3$G to learn domain-specific models.
Our results show that D$^3$G consistently outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-02-06T08:11:16Z) - Domain Generalization via Gradient Surgery [5.38147998080533]
In real-life applications, machine learning models often face scenarios where there is a change in data distribution between training and test domains.
In this work, we characterize the conflicting gradients emerging in domain shift scenarios and devise novel gradient agreement strategies.
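As context for what a gradient conflict means here: per-domain gradients conflict when their inner product is negative. A well-known agreement strategy of this general family is a PCGrad-style projection, shown below as an illustrative stand-in rather than the specific strategy devised in this paper.

```python
import numpy as np

def project_out_conflict(g_a, g_b):
    """If the gradients from two domains conflict (negative dot
    product), remove from g_a its component along g_b.

    PCGrad-style projection; illustrative only, not necessarily the
    agreement strategy proposed in the paper above.
    """
    dot = np.dot(g_a, g_b)
    if dot < 0:
        g_a = g_a - (dot / np.dot(g_b, g_b)) * g_b
    return g_a
```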
arXiv Detail & Related papers (2021-08-03T16:49:25Z) - f-Domain-Adversarial Learning: Theory and Algorithms [82.97698406515667]
Unsupervised domain adaptation is used in many machine learning applications where, during training, a model has access to unlabeled data in the target domain.
We derive a novel generalization bound for domain adaptation that exploits a new measure of discrepancy between distributions based on a variational characterization of f-divergences.
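For reference, the variational characterization of an f-divergence that such discrepancy measures build on (with $f^{*}$ the convex conjugate of $f$, and $T$ ranging over measurable functions) takes the general form below; the paper's own measure restricts $T$ to a hypothesis-dependent class, so this is the generic identity rather than the paper's bound verbatim.

```latex
D_f(P \,\|\, Q) \;=\; \sup_{T} \; \mathbb{E}_{x \sim P}\!\left[ T(x) \right] \;-\; \mathbb{E}_{x \sim Q}\!\left[ f^{*}\!\big(T(x)\big) \right]
```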
arXiv Detail & Related papers (2021-06-21T18:21:09Z) - Iterative Domain-Repaired Back-Translation [50.32925322697343]
In this paper, we focus on the domain-specific translation with low resources, where in-domain parallel corpora are scarce or nonexistent.
We propose a novel iterative domain-repaired back-translation framework, which introduces the Domain-Repair model to refine translations in synthetic bilingual data.
Experiments on adapting NMT models between specific domains and from the general domain to specific domains demonstrate the effectiveness of our proposed approach.
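Schematically, the loop described above might look as follows; the Translator/Repairer interfaces and all method names are hypothetical placeholders, not the paper's released code.

```python
from typing import List, Protocol, Tuple

class Translator(Protocol):
    def translate(self, text: str) -> str: ...
    def train(self, pairs: List[Tuple[str, str]]) -> None: ...

class Repairer(Protocol):
    def repair(self, text: str) -> str: ...

def domain_repaired_back_translation(fwd: Translator, bwd: Translator,
                                     repairer: Repairer,
                                     mono_target: List[str],
                                     rounds: int = 3) -> Translator:
    """One possible shape of an iterative domain-repaired
    back-translation loop (a schematic, not the paper's algorithm)."""
    for _ in range(rounds):
        # Back-translate in-domain target-side monolingual text into
        # synthetic source sentences.
        synthetic_src = [bwd.translate(t) for t in mono_target]
        # Repair the noisy synthetic sources before pairing them with
        # the genuine target sentences.
        repaired_src = [repairer.repair(s) for s in synthetic_src]
        # Fine-tune the forward model on the repaired synthetic pairs.
        fwd.train(list(zip(repaired_src, mono_target)))
    return fwd
```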
arXiv Detail & Related papers (2020-10-06T04:38:09Z) - Coupling Distant Annotation and Adversarial Training for Cross-Domain
Chinese Word Segmentation [40.27961925319402]
This paper proposes to couple distant annotation and adversarial training for cross-domain Chinese word segmentation.
For distant annotation, we design an automatic distant annotation mechanism that does not need any supervision or pre-defined dictionaries from the target domain.
For adversarial training, we develop a sentence-level training procedure to perform noise reduction and maximum utilization of the source domain information.
arXiv Detail & Related papers (2020-07-16T08:54:17Z) - Domain Adaptation for Semantic Parsing [68.81787666086554]
We propose a novel semantic parser for domain adaptation, where we have much fewer annotated data in the target domain compared to the source domain.
Our semantic parser benefits from a two-stage coarse-to-fine framework, which provides different and accurate treatments for the two stages.
Experiments on a benchmark dataset show that our method consistently outperforms several popular domain adaptation strategies.
arXiv Detail & Related papers (2020-06-23T14:47:41Z) - Dynamic Data Selection and Weighting for Iterative Back-Translation [116.14378571769045]
We propose a curriculum learning strategy for iterative back-translation models.
We evaluate our models on domain adaptation, low-resource, and high-resource MT settings.
Experimental results demonstrate that our methods achieve improvements of up to 1.8 BLEU points over competitive baselines.
arXiv Detail & Related papers (2020-04-07T19:49:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.