Domain-adaptation of spherical embeddings
- URL: http://arxiv.org/abs/2111.00677v1
- Date: Mon, 1 Nov 2021 03:29:36 GMT
- Title: Domain-adaptation of spherical embeddings
- Authors: Mihalis Gongolidis, Jeremy Minton, Ronin Wu, Valentin Stauber, Jason
Hoelscher-Obermaier and Viktor Botev
- Abstract summary: We develop methods to counter the global rotation of the embedding space and propose strategies to update words and documents during domain specific training.
We show that our strategies are able to reduce the performance cost of domain adaptation to a level similar to Word2Vec.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Domain adaptation of embedding models, updating a generic embedding to the
language of a specific domain, is a proven technique for domains that have
insufficient data to train an effective model from scratch. Chemistry
publications are one such domain, where scientific jargon and overloaded
terminology inhibit the performance of a general language model. The recent
spherical embedding model (JoSE) proposed in arXiv:1911.01196 jointly learns
word and document embeddings during training on the multi-dimensional unit
sphere, which performs well for document classification and word correlation
tasks. However, we show that non-convergence caused by global rotations during
its training prevents domain adaptation. In this work, we develop methods
to counter the global rotation of the embedding space and propose strategies to
update words and documents during domain specific training. Two new document
classification data-sets are collated from general and chemistry scientific
journals to compare the proposed update training strategies with benchmark
models. We show that our strategies are able to reduce the performance cost of
domain adaptation to a level similar to Word2Vec.
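To make the rotation problem concrete: if continued training applies an approximately global rotation to the embedding space, the adapted embeddings can be rotated back into the frame of the original model by solving an orthogonal Procrustes problem over words shared by both vocabularies. The sketch below illustrates this idea only; it is not the authors' released implementation, and the NumPy-only interface and matrix shapes are assumptions.

```python
import numpy as np

def undo_global_rotation(E_old, E_new):
    """Align a domain-adapted spherical embedding matrix with the
    original one by solving the orthogonal Procrustes problem.

    E_old, E_new: (n_shared_words, dim) arrays of unit-norm rows for
    the words common to both models. Returns E_new rotated into the
    coordinate frame of E_old; rotations preserve norms, so the
    result stays on the unit sphere.
    """
    # Cross-covariance between the two embedding sets.
    M = E_new.T @ E_old
    # The SVD of M yields the orthogonal R minimising
    # ||E_new @ R - E_old||_F (the Procrustes solution).
    U, _, Vt = np.linalg.svd(M)
    return E_new @ (U @ Vt)
```

In practice the shared rows would be anchor words that occur frequently in both the general and the domain corpus, so that the estimated rotation is not dominated by words whose meaning genuinely shifted.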
Related papers
- StylePrompter: Enhancing Domain Generalization with Test-Time Style Priors [39.695604434738186]
In real-world applications, the sample distribution at the inference stage often differs from the one at the training stage.
This paper introduces the style prompt in the language modality to adapt the trained model dynamically.
In particular, we train a style prompter to extract style information of the current image into an embedding in the token embedding space.
Our open space partition of the style token embedding space and the hand-crafted style regularization enable the trained style prompter to handle data from unknown domains effectively.
arXiv Detail & Related papers (2024-08-17T08:35:43Z) - A Unified Data Augmentation Framework for Low-Resource Multi-Domain Dialogue Generation [52.0964459842176]
Current state-of-the-art dialogue systems heavily rely on extensive training datasets.
We propose a novel data Augmentation framework for Multi-Domain Dialogue Generation, referred to as AMD$^2$G.
The AMD$^2$G framework consists of a data augmentation process and a two-stage training approach: domain-agnostic training and domain adaptation training.
arXiv Detail & Related papers (2024-06-14T09:52:27Z) - AHAM: Adapt, Help, Ask, Model -- Harvesting LLMs for literature mining [3.8384235322772864]
We present the 'AHAM' methodology and a metric that guides the domain-specific adaptation of the BERTopic topic modeling framework.
By utilizing the LLaMa2 generative language model, we generate topic definitions via one-shot learning.
For inter-topic similarity evaluation, we leverage metrics from language generation and translation processes.
arXiv Detail & Related papers (2023-12-25T18:23:03Z) - Improving Domain Generalization with Domain Relations [77.63345406973097]
This paper focuses on domain shifts, which occur when the model is applied to new domains that are different from the ones it was trained on.
We propose a new approach called D$^3$G to learn domain-specific models.
Our results show that D$^3$G consistently outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-02-06T08:11:16Z) - Domain Generalization via Gradient Surgery [5.38147998080533]
In real-life applications, machine learning models often face scenarios where there is a change in data distribution between training and test domains.
In this work, we characterize the conflicting gradients emerging in domain shift scenarios and devise novel gradient agreement strategies.
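As context for what a gradient conflict means here: per-domain gradients conflict when their inner product is negative. A well-known agreement strategy of this general family is a PCGrad-style projection, shown below as an illustrative stand-in rather than the specific strategy devised in this paper.

```python
import numpy as np

def project_out_conflict(g_a, g_b):
    """If the gradients from two domains conflict (negative dot
    product), remove from g_a its component along g_b.

    PCGrad-style projection; illustrative only, not necessarily the
    agreement strategy proposed in the paper above.
    """
    dot = np.dot(g_a, g_b)
    if dot < 0:
        g_a = g_a - (dot / np.dot(g_b, g_b)) * g_b
    return g_a
```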
arXiv Detail & Related papers (2021-08-03T16:49:25Z) - f-Domain-Adversarial Learning: Theory and Algorithms [82.97698406515667]
Unsupervised domain adaptation is used in many machine learning applications where, during training, a model has access to unlabeled data in the target domain.
We derive a novel generalization bound for domain adaptation that exploits a new measure of discrepancy between distributions based on a variational characterization of f-divergences.
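For reference, the variational characterization of an f-divergence that such discrepancy measures build on (with $f^{*}$ the convex conjugate of $f$, and $T$ ranging over measurable functions) takes the general form below; the paper's own measure restricts $T$ to a hypothesis-dependent class, so this is the generic identity rather than the paper's bound verbatim.

```latex
D_f(P \,\|\, Q) \;=\; \sup_{T} \; \mathbb{E}_{x \sim P}\!\left[ T(x) \right] \;-\; \mathbb{E}_{x \sim Q}\!\left[ f^{*}\!\big(T(x)\big) \right]
```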
arXiv Detail & Related papers (2021-06-21T18:21:09Z) - Iterative Domain-Repaired Back-Translation [50.32925322697343]
In this paper, we focus on the domain-specific translation with low resources, where in-domain parallel corpora are scarce or nonexistent.
We propose a novel iterative domain-repaired back-translation framework, which introduces the Domain-Repair model to refine translations in synthetic bilingual data.
Experiments on adapting NMT models between specific domains and from the general domain to specific domains demonstrate the effectiveness of our proposed approach.
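Schematically, the loop described above might look as follows; the Translator/Repairer interfaces and all method names are hypothetical placeholders, not the paper's released code.

```python
from typing import List, Protocol, Tuple

class Translator(Protocol):
    def translate(self, text: str) -> str: ...
    def train(self, pairs: List[Tuple[str, str]]) -> None: ...

class Repairer(Protocol):
    def repair(self, text: str) -> str: ...

def domain_repaired_back_translation(fwd: Translator, bwd: Translator,
                                     repairer: Repairer,
                                     mono_target: List[str],
                                     rounds: int = 3) -> Translator:
    """One possible shape of an iterative domain-repaired
    back-translation loop (a schematic, not the paper's algorithm)."""
    for _ in range(rounds):
        # Back-translate in-domain target-side monolingual text into
        # synthetic source sentences.
        synthetic_src = [bwd.translate(t) for t in mono_target]
        # Repair the noisy synthetic sources before pairing them with
        # the genuine target sentences.
        repaired_src = [repairer.repair(s) for s in synthetic_src]
        # Fine-tune the forward model on the repaired synthetic pairs.
        fwd.train(list(zip(repaired_src, mono_target)))
    return fwd
```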
arXiv Detail & Related papers (2020-10-06T04:38:09Z) - Coupling Distant Annotation and Adversarial Training for Cross-Domain
Chinese Word Segmentation [40.27961925319402]
This paper proposes to couple distant annotation and adversarial training for cross-domain Chinese word segmentation.
For distant annotation, we design an automatic distant annotation mechanism that does not need any supervision or pre-defined dictionaries from the target domain.
For adversarial training, we develop a sentence-level training procedure to perform noise reduction and maximum utilization of the source domain information.
arXiv Detail & Related papers (2020-07-16T08:54:17Z) - Domain Adaptation for Semantic Parsing [68.81787666086554]
We propose a novel semantic parser for domain adaptation, where we have much fewer annotated data in the target domain compared to the source domain.
Our semantic parser benefits from a two-stage coarse-to-fine framework, which provides different and accurate treatments for the two stages.
Experiments on a benchmark dataset show that our method consistently outperforms several popular domain adaptation strategies.
arXiv Detail & Related papers (2020-06-23T14:47:41Z) - Dynamic Data Selection and Weighting for Iterative Back-Translation [116.14378571769045]
We propose a curriculum learning strategy for iterative back-translation models.
We evaluate our models on domain adaptation, low-resource, and high-resource MT settings.
Experimental results demonstrate that our methods achieve improvements of up to 1.8 BLEU points over competitive baselines.
arXiv Detail & Related papers (2020-04-07T19:49:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.