Adapting Sentence Transformers for the Aviation Domain
- URL: http://arxiv.org/abs/2305.09556v2
- Date: Wed, 29 Nov 2023 14:45:46 GMT
- Title: Adapting Sentence Transformers for the Aviation Domain
- Authors: Liya Wang, Jason Chou, Dave Rouck, Alex Tien, Diane M Baumgartner
- Abstract summary: We propose a novel approach for adapting sentence transformers for the aviation domain.
Our method is a two-stage process consisting of pre-training followed by fine-tuning.
Our work highlights the importance of domain-specific adaptation in developing high-quality NLP solutions for specialized industries like aviation.
- Score: 0.8437187555622164
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning effective sentence representations is crucial for many Natural
Language Processing (NLP) tasks, including semantic search, semantic textual
similarity (STS), and clustering. While multiple transformer models have been
developed for sentence embedding learning, these models may not perform
optimally when dealing with specialized domains like aviation, which has unique
characteristics such as technical jargon, abbreviations, and unconventional
grammar. Furthermore, the absence of labeled datasets makes it difficult to
train models specifically for the aviation domain. To address these challenges,
we propose a novel approach for adapting sentence transformers for the aviation
domain. Our method is a two-stage process consisting of pre-training followed
by fine-tuning. During pre-training, we use the Transformer-based Sequential
Denoising AutoEncoder (TSDAE) with aviation text data as input to improve the
initial model performance. Subsequently, we fine-tune our models using a
Natural Language Inference (NLI) dataset in the Sentence Bidirectional Encoder
Representations from Transformers (SBERT) architecture to mitigate overfitting
issues. Experimental results on several downstream tasks show that our adapted
sentence transformers significantly outperform general-purpose transformers,
demonstrating the effectiveness of our approach in capturing the nuances of the
aviation domain. Overall, our work highlights the importance of domain-specific
adaptation in developing high-quality NLP solutions for specialized industries
like aviation.
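The two-stage recipe described in the abstract maps naturally onto the open-source sentence-transformers library. The sketch below is a minimal illustration of that pipeline, not the authors' exact setup: the bert-base-uncased starting checkpoint, the tiny placeholder corpora, and all hyperparameters are assumptions made here for demonstration only.

```python
# Minimal sketch of the two-stage adaptation (TSDAE pre-training, then SBERT-style
# NLI fine-tuning) using the sentence-transformers library. All data, the base
# checkpoint, and the hyperparameters are placeholders, not the paper's configuration.
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, models, losses, datasets, util

# Placeholder corpora -- substitute real aviation text and NLI triples.
aviation_sentences = [
    "Aircraft cleared for the ILS approach, runway two seven.",
    "ATIS reports wind two one zero at one five, gusting two five.",
]
nli_triples = [
    ("The pilot declared an emergency.", "An emergency was reported.", "The flight proceeded normally."),
    ("The runway was closed for snow removal.", "The runway was unavailable.", "The runway was open all day."),
]

# ---- Stage 1: TSDAE pre-training on unlabeled aviation text ----
word_embedding = models.Transformer("bert-base-uncased")  # illustrative base checkpoint
pooling = models.Pooling(word_embedding.get_word_embedding_dimension(), "cls")
model = SentenceTransformer(modules=[word_embedding, pooling])

tsdae_data = datasets.DenoisingAutoEncoderDataset(aviation_sentences)  # applies token-deletion noise
tsdae_loader = DataLoader(tsdae_data, batch_size=8, shuffle=True)
tsdae_loss = losses.DenoisingAutoEncoderLoss(model, tie_encoder_decoder=True)
model.fit(
    train_objectives=[(tsdae_loader, tsdae_loss)],
    epochs=1,
    scheduler="constantlr",
    optimizer_params={"lr": 3e-5},
    weight_decay=0,
)
model.save("tsdae-aviation")

# ---- Stage 2: SBERT-style fine-tuning on NLI triples ----
model = SentenceTransformer("tsdae-aviation")
nli_examples = [InputExample(texts=[anchor, entail, contra]) for anchor, entail, contra in nli_triples]
nli_loader = datasets.NoDuplicatesDataLoader(nli_examples, batch_size=2)
nli_loss = losses.MultipleNegativesRankingLoss(model)  # entailment = positive, contradiction = hard negative
model.fit(train_objectives=[(nli_loader, nli_loss)], epochs=1, warmup_steps=10)
model.save("sbert-aviation")

# Quick STS-style sanity check with the adapted model.
emb = model.encode(["go around, traffic on the runway", "abort the landing due to a runway incursion"])
print(util.cos_sim(emb[0], emb[1]))
```

Training MultipleNegativesRankingLoss on (anchor, entailment, contradiction) triples is the standard SBERT-style NLI objective and is consistent with the fine-tuning stage the abstract describes, with the contradiction sentence serving as a hard negative.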
Related papers
- Nonparametric Variational Regularisation of Pretrained Transformers [15.313475675235843]
We propose Nonparametric Variational Information Bottleneck (NVIB) as a regulariser for training cross-attention in Transformers.
We show that changing the initialisation introduces a novel, information-theoretic post-training regularisation in the attention mechanism.
arXiv Detail & Related papers (2023-12-01T15:40:30Z) - Emergent Agentic Transformer from Chain of Hindsight Experience [96.56164427726203]
We show, for the first time, that a simple transformer-based model performs competitively with both temporal-difference and imitation-learning-based approaches.
arXiv Detail & Related papers (2023-05-26T00:43:02Z) - TADA: Efficient Task-Agnostic Domain Adaptation for Transformers [3.9379577980832843]
In this work, we introduce TADA, a novel task-agnostic domain adaptation method.
Within TADA, we retrain embeddings to learn domain-aware input representations and tokenizers for the transformer encoder.
We conduct experiments with meta-embeddings and newly introduced meta-tokenizers, resulting in one model per task in multi-domain use cases.
arXiv Detail & Related papers (2023-05-22T04:53:59Z) - Transformer-based approaches to Sentiment Detection [55.41644538483948]
We examined the performance of four different types of state-of-the-art transformer models for text classification.
The RoBERTa transformer model performs best on the test dataset with a score of 82.6% and is highly recommended for quality predictions.
arXiv Detail & Related papers (2023-03-13T17:12:03Z) - QuadFormer: Quadruple Transformer for Unsupervised Domain Adaptation in
Power Line Segmentation of Aerial Images [12.840195641761323]
We propose a novel framework designed for domain adaptive semantic segmentation.
The hierarchical quadruple transformer combines cross-attention and self-attention mechanisms to adapt transferable context.
We present two datasets - ARPLSyn and ARPLReal - to further advance research in unsupervised domain adaptive powerline segmentation.
arXiv Detail & Related papers (2022-11-29T03:15:27Z) - Boosting Transformers for Job Expression Extraction and Classification
in a Low-Resource Setting [12.489741131691737]
We present our approaches to tackle the extraction and classification of job expressions in Spanish texts.
Being neither language nor domain experts, we experiment with the multilingual XLM-R transformer model.
Our results show strong improvements using these methods by up to 5.3 F1 points compared to a fine-tuned XLM-R model.
arXiv Detail & Related papers (2021-09-17T15:21:02Z) - On the validity of pre-trained transformers for natural language
processing in the software engineering domain [78.32146765053318]
We compare BERT transformer models trained with software engineering data with transformers based on general domain data.
Our results show that for tasks that require understanding of the software engineering context, pre-training with software engineering data is valuable.
arXiv Detail & Related papers (2021-09-10T08:46:31Z) - Applying the Transformer to Character-level Transduction [68.91664610425114]
The transformer has been shown to outperform recurrent neural network-based sequence-to-sequence models in various word-level NLP tasks.
We show that with a large enough batch size, the transformer does indeed outperform recurrent models for character-level tasks.
arXiv Detail & Related papers (2020-05-20T17:25:43Z) - Variational Transformers for Diverse Response Generation [71.53159402053392]
Variational Transformer (VT) is a variational self-attentive feed-forward sequence model.
VT combines the parallelizability and global receptive field computation of the Transformer with the variational nature of the CVAE.
We explore two types of VT: 1) modeling the discourse-level diversity with a global latent variable; and 2) augmenting the Transformer decoder with a sequence of fine-grained latent variables.
arXiv Detail & Related papers (2020-03-28T07:48:02Z) - Hierarchical Transformer Network for Utterance-level Emotion Recognition [0.0]
We address some challenges in utterance-level emotion recognition (ULER).
Unlike the traditional text classification problem, this task is supported by a limited number of datasets.
We use a pretrained language model, bidirectional encoder representations from transformers (BERT), as the lower-level transformer.
In addition, we add speaker embeddings to the model for the first time, which enables our model to capture the interaction between speakers.
arXiv Detail & Related papers (2020-02-18T13:44:49Z) - A Simple Baseline to Semi-Supervised Domain Adaptation for Machine
Translation [73.3550140511458]
State-of-the-art neural machine translation (NMT) systems are data-hungry and perform poorly on new domains with no supervised data.
We propose a simple but effective approach to the semi-supervised domain adaptation scenario of NMT.
This approach iteratively trains a Transformer-based NMT model via three training objectives: language modeling, back-translation, and supervised translation.
arXiv Detail & Related papers (2020-01-22T16:42:06Z)