Multi-Source (Pre-)Training for Cross-Domain Measurement, Unit and
Context Extraction
- URL: http://arxiv.org/abs/2308.02951v1
- Date: Sat, 5 Aug 2023 20:33:39 GMT
- Title: Multi-Source (Pre-)Training for Cross-Domain Measurement, Unit and
Context Extraction
- Authors: Yueling Li, Sebastian Martschat, Simone Paolo Ponzetto
- Abstract summary: We present a cross-domain approach for automated measurement and context extraction based on pre-trained language models.
We construct a multi-source, multi-domain corpus and train an end-to-end extraction pipeline.
Our results suggest that multi-source training leads to the best overall results, while single-source training yields the best results for the respective individual domain.
- Score: 15.177664715250046
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a cross-domain approach for automated measurement and context
extraction based on pre-trained language models. We construct a multi-source,
multi-domain corpus and train an end-to-end extraction pipeline. We then apply
multi-source task-adaptive pre-training and fine-tuning to benchmark the
cross-domain generalization capability of our model. Further, we conceptualize
and apply a task-specific error analysis and derive insights for future work.
Our results suggest that multi-source training leads to the best overall
results, while single-source training yields the best results for the
respective individual domain. While our setup is successful at extracting
quantity values and units, more research is needed to improve the extraction of
contextual entities. We make the cross-domain corpus used in this work
available online.
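The abstract gives no implementation details, but the described pipeline maps naturally onto a two-stage setup: multi-source task-adaptive pre-training (continued masked language modeling on a pooled multi-domain corpus) followed by fine-tuning for token-level extraction of quantities, units and contextual entities. The sketch below is only an illustration of that idea under my own assumptions; the bert-base-cased backbone, the corpus file name and the BIO label set are placeholders, not the authors' actual configuration.

```python
# Hedged sketch: multi-source task-adaptive pre-training (TAPT) followed by
# token-classification fine-tuning. Backbone, file names and labels are
# illustrative assumptions, not taken from the paper.
from transformers import (
    AutoModelForMaskedLM,
    AutoModelForTokenClassification,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from datasets import load_dataset

base = "bert-base-cased"  # assumed backbone
tok = AutoTokenizer.from_pretrained(base)

# Stage 1: continue masked-language-model training on the pooled multi-domain corpus.
corpus = load_dataset("text", data_files={"train": "multi_domain_corpus.txt"})
corpus = corpus.map(
    lambda ex: tok(ex["text"], truncation=True, max_length=256),
    batched=True,
    remove_columns=["text"],
)
mlm_model = AutoModelForMaskedLM.from_pretrained(base)
Trainer(
    model=mlm_model,
    args=TrainingArguments(output_dir="tapt", num_train_epochs=3),
    train_dataset=corpus["train"],
    data_collator=DataCollatorForLanguageModeling(tok, mlm_probability=0.15),
).train()
mlm_model.save_pretrained("tapt")

# Stage 2: fine-tune the adapted encoder as a token classifier over
# quantity/unit/context spans (hypothetical BIO label set).
labels = ["O", "B-QUANT", "I-QUANT", "B-UNIT", "I-UNIT", "B-CONTEXT", "I-CONTEXT"]
extractor = AutoModelForTokenClassification.from_pretrained("tapt", num_labels=len(labels))
# Fine-tuning on the annotated measurement corpus then proceeds as a standard
# token-classification Trainer run (omitted here).
```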
Related papers
- Investigating the potential of Sparse Mixtures-of-Experts for multi-domain neural machine translation [59.41178047749177]
We focus on multi-domain Neural Machine Translation, with the goal of developing efficient models which can handle data from various domains seen during training and are robust to domains unseen during training.
We hypothesize that Sparse Mixture-of-Experts (SMoE) models are a good fit for this task, as they enable efficient model scaling.
We conduct a series of experiments aimed at validating the utility of SMoE for the multi-domain scenario, and find that straightforward width scaling of the Transformer is a simpler and surprisingly more efficient approach in practice, reaching the same performance level as SMoE.
arXiv Detail & Related papers (2024-07-01T09:45:22Z)
- Scalarization for Multi-Task and Multi-Domain Learning at Scale [15.545810422759295]
Training a single model on multiple input domains and/or output tasks allows for compressing information from multiple sources into a unified backbone.
However, optimizing such networks is a challenge due to discrepancies between the different tasks or domains.
arXiv Detail & Related papers (2023-10-13T07:31:04Z)
- ZhichunRoad at Amazon KDD Cup 2022: MultiTask Pre-Training for E-Commerce Product Search [4.220439000486713]
We propose a robust multilingual model to improve the quality of search results.
In the pre-training stage, we adopt a masked language modeling (MLM) task, a classification task and a contrastive learning task.
In the fine-tuning stage, we use confident learning, an exponential moving average (EMA), adversarial training (FGM) and the regularized dropout strategy R-Drop (a minimal sketch of R-Drop follows this list).
arXiv Detail & Related papers (2023-01-31T07:31:34Z)
- MultiMatch: Multi-task Learning for Semi-supervised Domain Generalization [55.06956781674986]
We address the semi-supervised domain generalization (SSDG) task, where only a small amount of labeled data is available in each source domain.
We propose MultiMatch, which extends FixMatch to a multi-task learning framework and produces high-quality pseudo-labels for SSDG.
A series of experiments validates the effectiveness of the proposed method, which outperforms existing semi-supervised methods and the SSDG method on several benchmark DG datasets.
arXiv Detail & Related papers (2022-08-11T14:44:33Z)
- Incremental Learning Meets Transfer Learning: Application to Multi-site Prostate MRI Segmentation [16.50535949349874]
We propose a novel multi-site segmentation framework called incremental-transfer learning (ITL).
ITL learns a model from multi-site datasets in an end-to-end sequential fashion.
We show for the first time that our ITL training scheme can alleviate the challenging catastrophic forgetting problem in incremental learning.
arXiv Detail & Related papers (2022-06-03T02:32:01Z)
- HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression [53.90578309960526]
Large pre-trained language models (PLMs) have shown overwhelming performance compared with traditional neural network methods.
We propose a hierarchical relational knowledge distillation (HRKD) method to capture both hierarchical and domain relational information.
arXiv Detail & Related papers (2021-10-16T11:23:02Z)
- Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos [69.61522804742427]
This paper proposes a self-supervised training framework that learns a common multimodal embedding space.
We extend the concept of instance-level contrastive learning with a multimodal clustering step to capture semantic similarities across modalities.
The resulting embedding space enables retrieval of samples across all modalities, even from unseen datasets and different domains.
arXiv Detail & Related papers (2021-04-26T15:55:01Z)
- Universal Representation Learning from Multiple Domains for Few-shot Classification [41.821234589075445]
We propose to learn a single set of universal deep representations by distilling knowledge of multiple separately trained networks.
We show that the universal representations can be further refined for previously unseen domains by an efficient adaptation step.
arXiv Detail & Related papers (2021-03-25T13:49:12Z)
- Don't Stop Pretraining: Adapt Language Models to Domains and Tasks [81.99843216550306]
We present a study across four domains (biomedical and computer science publications, news, and reviews) and eight classification tasks.
A second phase of pretraining in-domain (domain-adaptive pretraining) leads to performance gains.
Adapting to the task's unlabeled data (task-adaptive pretraining) improves performance even after domain-adaptive pretraining.
arXiv Detail & Related papers (2020-04-23T04:21:19Z)
- Zero-Resource Cross-Domain Named Entity Recognition [68.83177074227598]
Existing models for cross-domain named entity recognition rely on large unlabeled corpora or labeled NER training data in target domains.
We propose a cross-domain NER model that does not use any external resources.
arXiv Detail & Related papers (2020-02-14T09:04:18Z)
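Of the fine-tuning tricks named in the ZhichunRoad entry above, R-Drop is the easiest to show compactly: the same batch is passed through the model twice with dropout active, and a symmetric KL term between the two predictive distributions is added to the usual cross-entropy loss. The sketch below assumes a generic PyTorch classifier and an illustrative weight alpha; it is not the competition system's actual code.

```python
# Hedged sketch of R-Drop regularization (two stochastic forward passes plus a
# symmetric KL penalty); `model`, `inputs` and `alpha` are illustrative.
import torch
import torch.nn.functional as F

def r_drop_loss(model: torch.nn.Module,
                inputs: torch.Tensor,
                labels: torch.Tensor,
                alpha: float = 4.0) -> torch.Tensor:
    # Two forward passes over the same batch; active dropout makes them differ.
    logits1 = model(inputs)
    logits2 = model(inputs)
    ce = F.cross_entropy(logits1, labels) + F.cross_entropy(logits2, labels)
    # Symmetric KL divergence between the two predictive distributions.
    logp1 = F.log_softmax(logits1, dim=-1)
    logp2 = F.log_softmax(logits2, dim=-1)
    kl = 0.5 * (
        F.kl_div(logp1, logp2, log_target=True, reduction="batchmean")
        + F.kl_div(logp2, logp1, log_target=True, reduction="batchmean")
    )
    return ce + alpha * kl
```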