Adapting Sentence Transformers for the Aviation Domain
- URL: http://arxiv.org/abs/2305.09556v2
- Date: Wed, 29 Nov 2023 14:45:46 GMT
- Title: Adapting Sentence Transformers for the Aviation Domain
- Authors: Liya Wang, Jason Chou, Dave Rouck, Alex Tien, Diane M Baumgartner
- Abstract summary: We propose a novel approach for adapting sentence transformers for the aviation domain.
Our method is a two-stage process consisting of pre-training followed by fine-tuning.
Our work highlights the importance of domain-specific adaptation in developing high-quality NLP solutions for specialized industries like aviation.
- Score: 0.8437187555622164
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning effective sentence representations is crucial for many Natural
Language Processing (NLP) tasks, including semantic search, semantic textual
similarity (STS), and clustering. While multiple transformer models have been
developed for sentence embedding learning, these models may not perform
optimally when dealing with specialized domains like aviation, which has unique
characteristics such as technical jargon, abbreviations, and unconventional
grammar. Furthermore, the absence of labeled datasets makes it difficult to
train models specifically for the aviation domain. To address these challenges,
we propose a novel approach for adapting sentence transformers for the aviation
domain. Our method is a two-stage process consisting of pre-training followed
by fine-tuning. During pre-training, we use Transformers and Sequential
Denoising AutoEncoder (TSDAE) with aviation text data as input to improve the
initial model performance. Subsequently, we fine-tune our models using a
Natural Language Inference (NLI) dataset in the Sentence Bidirectional Encoder
Representations from Transformers (SBERT) architecture to mitigate overfitting
issues. Experimental results on several downstream tasks show that our adapted
sentence transformers significantly outperform general-purpose transformers,
demonstrating the effectiveness of our approach in capturing the nuances of the
aviation domain. Overall, our work highlights the importance of domain-specific
adaptation in developing high-quality NLP solutions for specialized industries
like aviation.
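As a concrete illustration of the two-stage recipe described above, the sketch below uses the open-source sentence-transformers library, which provides both a TSDAE training objective and the SBERT training loop. The base checkpoint, hyperparameters, and toy data are assumptions for illustration only, not the authors' exact configuration.

```python
# Minimal sketch of the two-stage adaptation: TSDAE pre-training on unlabeled
# aviation text, then SBERT-style fine-tuning on NLI triplets.
# Model names, hyperparameters, and data below are illustrative assumptions.
from sentence_transformers import (
    SentenceTransformer, InputExample, models, losses, datasets, util,
)
from torch.utils.data import DataLoader

# --- Stage 1: TSDAE pre-training on unlabeled aviation text ---
word_embedding_model = models.Transformer("bert-base-uncased")  # assumed base checkpoint
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension(), "cls")
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

aviation_sentences = [
    "RWY 27L closed due to disabled acft, expect vectors for RWY 27R.",
    "Maintain FL350, contact center on 132.45.",
]  # placeholder: in practice, a large unlabeled aviation corpus

tsdae_dataset = datasets.DenoisingAutoEncoderDataset(aviation_sentences)
tsdae_loader = DataLoader(tsdae_dataset, batch_size=8, shuffle=True)
tsdae_loss = losses.DenoisingAutoEncoderLoss(
    model, decoder_name_or_path="bert-base-uncased", tie_encoder_decoder=True
)
model.fit(
    train_objectives=[(tsdae_loader, tsdae_loss)],
    epochs=1,
    scheduler="constantlr",
    optimizer_params={"lr": 3e-5},
    weight_decay=0,
)

# --- Stage 2: SBERT-style fine-tuning on an NLI dataset ---
# (anchor, entailment, contradiction) triplets; MultipleNegativesRankingLoss is
# the usual SBERT objective for such triplets and is assumed here.
nli_examples = [
    InputExample(texts=[
        "The aircraft is taxiing to the runway.",
        "The aircraft is moving on the ground.",
        "The aircraft is at cruise altitude.",
    ]),
]  # placeholder: in practice, e.g. SNLI/MultiNLI triplets
nli_loader = datasets.NoDuplicatesDataLoader(nli_examples, batch_size=1)
nli_loss = losses.MultipleNegativesRankingLoss(model)
model.fit(train_objectives=[(nli_loader, nli_loss)], epochs=1)

# Downstream use: embed aviation sentences and compare them.
embeddings = model.encode(aviation_sentences, convert_to_tensor=True)
print(util.cos_sim(embeddings[0], embeddings[1]))
```

Here the TSDAE stage adapts the encoder to aviation vocabulary from unlabeled text, while the NLI stage shapes the embedding space for similarity tasks, mirroring the pre-training and fine-tuning stages described in the abstract.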
Related papers
- OT-Transformer: A Continuous-time Transformer Architecture with Optimal Transport Regularization [1.7180235064112577]
We consider a dynamical system whose governing equation is parametrized by transformer blocks.
We leverage optimal transport theory to regularize the training problem, which enhances stability in training and improves generalization of the resulting model.
arXiv Detail & Related papers (2025-01-30T22:52:40Z)
- Coordinate In and Value Out: Training Flow Transformers in Ambient Space [6.911507447184487]
Ambient Space Flow Transformers (ASFT) is a domain-agnostic approach to learn flow matching transformers in ambient space.
We introduce a conditionally independent point-wise training objective that enables ASFT to make predictions continuously in coordinate space.
arXiv Detail & Related papers (2024-12-05T01:00:07Z)
- Emergent Agentic Transformer from Chain of Hindsight Experience [96.56164427726203]
We show that a simple transformer-based model performs competitively with both temporal-difference and imitation-learning-based approaches.
To our knowledge, this is the first time a simple transformer-based model has matched both families of approaches.
arXiv Detail & Related papers (2023-05-26T00:43:02Z)
- TADA: Efficient Task-Agnostic Domain Adaptation for Transformers [3.9379577980832843]
In this work, we introduce TADA, a novel task-agnostic domain adaptation method.
Within TADA, we retrain embeddings to learn domain-aware input representations and tokenizers for the transformer encoder.
We conduct experiments with meta-embeddings and newly introduced meta-tokenizers, resulting in one model per task in multi-domain use cases.
arXiv Detail & Related papers (2023-05-22T04:53:59Z)
- Transformer-based approaches to Sentiment Detection [55.41644538483948]
We examined the performance of four different types of state-of-the-art transformer models for text classification.
The RoBERTa transformer model performs best on the test dataset with a score of 82.6% and is highly recommended for quality predictions.
arXiv Detail & Related papers (2023-03-13T17:12:03Z)
- Boosting Transformers for Job Expression Extraction and Classification in a Low-Resource Setting [12.489741131691737]
We present our approaches to tackle the extraction and classification of job expressions in Spanish texts.
As neither language nor domain experts, we experiment with the multilingual XLM-R transformer model.
Our results show strong improvements of up to 5.3 F1 points from these methods compared to a fine-tuned XLM-R model.
arXiv Detail & Related papers (2021-09-17T15:21:02Z)
- On the validity of pre-trained transformers for natural language processing in the software engineering domain [78.32146765053318]
We compare BERT transformer models trained with software engineering data with transformers based on general domain data.
Our results show that for tasks that require understanding of the software engineering context, pre-training with software engineering data is valuable.
arXiv Detail & Related papers (2021-09-10T08:46:31Z)
- Applying the Transformer to Character-level Transduction [68.91664610425114]
The transformer has been shown to outperform recurrent neural network-based sequence-to-sequence models in various word-level NLP tasks.
We show that with a large enough batch size, the transformer does indeed outperform recurrent models for character-level tasks.
arXiv Detail & Related papers (2020-05-20T17:25:43Z)
- Variational Transformers for Diverse Response Generation [71.53159402053392]
Variational Transformer (VT) is a variational self-attentive feed-forward sequence model.
VT combines the parallelizability and global receptive field computation of the Transformer with the variational nature of the CVAE.
We explore two types of VT: 1) modeling the discourse-level diversity with a global latent variable; and 2) augmenting the Transformer decoder with a sequence of fine-grained latent variables.
arXiv Detail & Related papers (2020-03-28T07:48:02Z)
- Hierarchical Transformer Network for Utterance-level Emotion Recognition [0.0]
We address some challenges in utterance-level emotion recognition (ULER).
Unlike the traditional text classification problem, this task is supported by a limited number of datasets.
We use a pretrained language model, Bidirectional Encoder Representations from Transformers (BERT), as the lower-level transformer.
In addition, we add speaker embeddings to the model for the first time, which enables our model to capture the interaction between speakers.
arXiv Detail & Related papers (2020-02-18T13:44:49Z)
- A Simple Baseline to Semi-Supervised Domain Adaptation for Machine Translation [73.3550140511458]
State-of-the-art neural machine translation (NMT) systems are data-hungry and perform poorly on new domains with no supervised data.
We propose a simple but effective approach to the semi-supervised domain adaptation scenario of NMT.
This approach iteratively trains a Transformer-based NMT model via three training objectives: language modeling, back-translation, and supervised translation.
arXiv Detail & Related papers (2020-01-22T16:42:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.