Pretrained Domain-Specific Language Model for General Information
Retrieval Tasks in the AEC Domain
- URL: http://arxiv.org/abs/2203.04729v1
- Date: Wed, 9 Mar 2022 14:10:55 GMT
- Title: Pretrained Domain-Specific Language Model for General Information
Retrieval Tasks in the AEC Domain
- Authors: Zhe Zheng, Xin-Zheng Lu, Ke-Yin Chen, Yu-Cheng Zhou, Jia-Rui Lin
- Abstract summary: It is unclear how domain corpora and domain-specific pretrained DL models can improve performance in various information retrieval tasks.
This work explores the impacts of domain corpora and various transfer learning techniques on the performance of DL models for IR tasks.
BERT-based models dramatically outperform traditional methods in all IR tasks, with maximum improvements of 5.4% and 10.1% in the F1 score.
- Score: 5.949779668853556
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: As an essential task for the architecture, engineering, and construction
(AEC) industry, information retrieval (IR) from unstructured textual data based
on natural language processing (NLP) is gaining increasing attention. Although
various deep learning (DL) models for IR tasks have been investigated in the
AEC domain, it is still unclear how domain corpora and domain-specific
pretrained DL models can improve performance in various IR tasks. To this end,
this work systematically explores the impacts of domain corpora and various
transfer learning techniques on the performance of DL models for IR tasks and
proposes a pretrained domain-specific language model for the AEC domain. First,
both in-domain and close-domain corpora are developed. Then, two types of
pretrained models, including traditional word embedding models and
BERT-based models, are pretrained based on various domain corpora and transfer
learning strategies. Finally, several widely used DL models for IR tasks are
further trained and tested based on various configurations and pretrained
models. The result shows that domain corpora have opposite effects on
traditional word embedding models for text classification and named entity
recognition tasks but can further improve the performance of BERT-based models
in all tasks. Meanwhile, BERT-based models dramatically outperform traditional
methods in all IR tasks, with maximum improvements of 5.4% and 10.1% in the F1
score, respectively. This research contributes to the body of knowledge in two
ways: 1) demonstrating the advantages of domain corpora and pretrained DL
models and 2) opening the first domain-specific dataset and pretrained language
model for the AEC domain, to the best of our knowledge. Thus, this work sheds
light on the adoption and application of pretrained models in the AEC domain.
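To make the pipeline described in the abstract concrete, here is a minimal sketch of the continued-pretraining step: a general BERT checkpoint is further trained with masked language modeling on an in-domain corpus, using the Hugging Face Transformers and Datasets libraries. The base checkpoint, the corpus file aec_corpus.txt, the output name bert-aec, and all hyperparameters are illustrative assumptions, not values taken from the paper.

```python
# Sketch: domain-adaptive (continued) masked-language-model pretraining of a
# general BERT checkpoint on an in-domain AEC text corpus. All file names,
# checkpoints, and hyperparameters below are assumptions for illustration only.
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base = "bert-base-uncased"                     # assumed general-domain checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForMaskedLM.from_pretrained(base)

# Hypothetical plain-text corpus of AEC documents, one passage per line.
corpus = load_dataset("text", data_files={"train": "aec_corpus.txt"})["train"]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = corpus.map(tokenize, batched=True, remove_columns=["text"])

# Randomly mask 15% of tokens so the model keeps learning in-domain usage.
collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15
)

args = TrainingArguments(
    output_dir="bert-aec",                     # assumed output name
    per_device_train_batch_size=32,
    num_train_epochs=3,
    learning_rate=5e-5,
)

trainer = Trainer(
    model=model, args=args, train_dataset=tokenized, data_collator=collator
)
trainer.train()
trainer.save_model("bert-aec")
```

The saved checkpoint could then be loaded with, for example, AutoModelForTokenClassification for the named entity recognition task or AutoModelForSequenceClassification for text classification and fine-tuned on labeled AEC data, which is the kind of downstream comparison against word-embedding baselines that the abstract reports.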
Related papers
- Improving Few-Shot Cross-Domain Named Entity Recognition by Instruction Tuning a Word-Embedding based Retrieval Augmented Large Language Model [0.0]
Few-Shot Cross-Domain NER is a process of leveraging knowledge from data-rich source domains to perform entity recognition on data-scarce target domains.
We propose IF-WRANER, a retrieval augmented large language model for Named Entity Recognition.
arXiv Detail & Related papers (2024-11-01T08:57:29Z)
- Learning to Generalize Unseen Domains via Multi-Source Meta Learning for Text Classification [71.08024880298613]
We study the multi-source Domain Generalization of text classification.
We propose a framework to use multiple seen domains to train a model that can achieve high accuracy in an unseen domain.
arXiv Detail & Related papers (2024-09-20T07:46:21Z)
- Boosting Large Language Models with Continual Learning for Aspect-based Sentiment Analysis [33.86086075084374]
Aspect-based sentiment analysis (ABSA) is an important subtask of sentiment analysis.
We propose a Large Language Model-based Continual Learning (LLM-CL) model for ABSA.
arXiv Detail & Related papers (2024-05-09T02:00:07Z)
- Improving Domain Generalization with Domain Relations [77.63345406973097]
This paper focuses on domain shifts, which occur when the model is applied to new domains that are different from the ones it was trained on.
We propose a new approach called D^3G to learn domain-specific models.
Our results show that D^3G consistently outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-02-06T08:11:16Z)
- QAGAN: Adversarial Approach To Learning Domain Invariant Language Features [0.76146285961466]
We explore an adversarial training approach to learning domain-invariant features.
We are able to achieve 15.2% improvement in EM score and 5.6% boost in F1 score on an out-of-domain validation dataset.
arXiv Detail & Related papers (2022-06-24T17:42:18Z)
- TAL: Two-stream Adaptive Learning for Generalizable Person Re-identification [115.31432027711202]
We argue that both domain-specific and domain-invariant features are crucial for improving the generalization ability of re-id models.
We propose two-stream adaptive learning (TAL) to simultaneously model these two kinds of information.
Our framework can be applied to both single-source and multi-source domain generalization tasks.
arXiv Detail & Related papers (2021-11-29T01:27:42Z)
- Efficient Domain Adaptation of Language Models via Adaptive Tokenization [5.058301279065432]
We show that domain-specific subword sequences can be efficiently determined directly from divergences in the conditional token distributions of the base and domain-specific corpora (a toy illustration of this idea follows the related-papers list below).
Our approach produces smaller models and requires less training and inference time than other approaches that use tokenizer augmentation.
arXiv Detail & Related papers (2021-09-15T17:51:27Z)
- Adapt-and-Distill: Developing Small, Fast and Effective Pretrained Language Models for Domains [45.07506437436464]
We present a general approach to developing small, fast and effective pre-trained models for specific domains.
This is achieved by adapting the off-the-shelf general pre-trained models and performing task-agnostic knowledge distillation in target domains.
arXiv Detail & Related papers (2021-06-25T07:37:05Z)
- Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)
- RADDLE: An Evaluation Benchmark and Analysis Platform for Robust Task-oriented Dialog Systems [75.87418236410296]
We introduce the RADDLE benchmark, a collection of corpora and tools for evaluating the performance of models across a diverse set of domains.
RADDLE is designed to favor and encourage models with a strong generalization ability.
We evaluate recent state-of-the-art systems based on pre-training and fine-tuning, and find that grounded pre-training on heterogeneous dialog corpora performs better than training a separate model per domain.
arXiv Detail & Related papers (2020-12-29T08:58:49Z)
- DomBERT: Domain-oriented Language Model for Aspect-based Sentiment Analysis [71.40586258509394]
We propose DomBERT, an extension of BERT to learn from both in-domain corpus and relevant domain corpora.
Experiments are conducted on an assortment of tasks in aspect-based sentiment analysis, demonstrating promising results.
arXiv Detail & Related papers (2020-04-28T21:07:32Z)
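As a toy illustration of the divergence idea mentioned in the adaptive-tokenization entry above, the sketch below ranks subword tokens by how much more probable they are in a domain corpus than in a base corpus. This is only a simplified frequency-ratio version of the idea, not the cited paper's exact conditional-distribution procedure; the tokenizer choice and corpora are assumptions.

```python
# Toy sketch: find subwords that are far more frequent in a domain corpus than
# in a base corpus. Simplified illustration only; not the cited paper's method.
import math
from collections import Counter

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # assumed tokenizer

def token_distribution(texts):
    """Relative frequency of each subword token over a list of raw strings."""
    counts = Counter()
    for text in texts:
        counts.update(tokenizer.tokenize(text))
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()}

def domain_salient_tokens(domain_texts, base_texts, top_k=20, eps=1e-9):
    """Tokens with the largest log-ratio of domain vs. base probability."""
    p_domain = token_distribution(domain_texts)
    p_base = token_distribution(base_texts)
    scores = {
        tok: math.log((p + eps) / (p_base.get(tok, 0.0) + eps))
        for tok, p in p_domain.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

# Hypothetical usage with small in-memory corpora:
# domain_salient_tokens(aec_sentences, general_sentences)
```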