Adapting Sentence Transformers for the Aviation Domain
- URL: http://arxiv.org/abs/2305.09556v2
- Date: Wed, 29 Nov 2023 14:45:46 GMT
- Title: Adapting Sentence Transformers for the Aviation Domain
- Authors: Liya Wang, Jason Chou, Dave Rouck, Alex Tien, Diane M Baumgartner
- Abstract summary: We propose a novel approach for adapting sentence transformers for the aviation domain.
Our method is a two-stage process consisting of pre-training followed by fine-tuning.
Our work highlights the importance of domain-specific adaptation in developing high-quality NLP solutions for specialized industries like aviation.
- Score: 0.8437187555622164
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning effective sentence representations is crucial for many Natural
Language Processing (NLP) tasks, including semantic search, semantic textual
similarity (STS), and clustering. While multiple transformer models have been
developed for sentence embedding learning, these models may not perform
optimally when dealing with specialized domains like aviation, which has unique
characteristics such as technical jargon, abbreviations, and unconventional
grammar. Furthermore, the absence of labeled datasets makes it difficult to
train models specifically for the aviation domain. To address these challenges,
we propose a novel approach for adapting sentence transformers for the aviation
domain. Our method is a two-stage process consisting of pre-training followed
by fine-tuning. During pre-training, we use Transformers and Sequential
Denoising AutoEncoder (TSDAE) with aviation text data as input to improve the
initial model performance. Subsequently, we fine-tune our models using a
Natural Language Inference (NLI) dataset in the Sentence Bidirectional Encoder
Representations from Transformers (SBERT) architecture to mitigate overfitting
issues. Experimental results on several downstream tasks show that our adapted
sentence transformers significantly outperform general-purpose transformers,
demonstrating the effectiveness of our approach in capturing the nuances of the
aviation domain. Overall, our work highlights the importance of domain-specific
adaptation in developing high-quality NLP solutions for specialized industries
like aviation.
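As a concrete illustration of the two-stage recipe described above, the sketch below uses the open-source sentence-transformers library, which provides both a TSDAE training objective and the SBERT training loop. The base checkpoint, hyperparameters, and toy data are assumptions for illustration only, not the authors' exact configuration.

```python
# Minimal sketch of the two-stage adaptation: TSDAE pre-training on unlabeled
# aviation text, then SBERT-style fine-tuning on NLI triplets.
# Model names, hyperparameters, and data below are illustrative assumptions.
from sentence_transformers import (
    SentenceTransformer, InputExample, models, losses, datasets, util,
)
from torch.utils.data import DataLoader

# --- Stage 1: TSDAE pre-training on unlabeled aviation text ---
word_embedding_model = models.Transformer("bert-base-uncased")  # assumed base checkpoint
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension(), "cls")
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

aviation_sentences = [
    "RWY 27L closed due to disabled acft, expect vectors for RWY 27R.",
    "Maintain FL350, contact center on 132.45.",
]  # placeholder: in practice, a large unlabeled aviation corpus

tsdae_dataset = datasets.DenoisingAutoEncoderDataset(aviation_sentences)
tsdae_loader = DataLoader(tsdae_dataset, batch_size=8, shuffle=True)
tsdae_loss = losses.DenoisingAutoEncoderLoss(
    model, decoder_name_or_path="bert-base-uncased", tie_encoder_decoder=True
)
model.fit(
    train_objectives=[(tsdae_loader, tsdae_loss)],
    epochs=1,
    scheduler="constantlr",
    optimizer_params={"lr": 3e-5},
    weight_decay=0,
)

# --- Stage 2: SBERT-style fine-tuning on an NLI dataset ---
# (anchor, entailment, contradiction) triplets; MultipleNegativesRankingLoss is
# the usual SBERT objective for such triplets and is assumed here.
nli_examples = [
    InputExample(texts=[
        "The aircraft is taxiing to the runway.",
        "The aircraft is moving on the ground.",
        "The aircraft is at cruise altitude.",
    ]),
]  # placeholder: in practice, e.g. SNLI/MultiNLI triplets
nli_loader = datasets.NoDuplicatesDataLoader(nli_examples, batch_size=1)
nli_loss = losses.MultipleNegativesRankingLoss(model)
model.fit(train_objectives=[(nli_loader, nli_loss)], epochs=1)

# Downstream use: embed aviation sentences and compare them.
embeddings = model.encode(aviation_sentences, convert_to_tensor=True)
print(util.cos_sim(embeddings[0], embeddings[1]))
```

Here the TSDAE stage adapts the encoder to aviation vocabulary from unlabeled text, while the NLI stage shapes the embedding space for similarity tasks, mirroring the pre-training and fine-tuning stages described in the abstract.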
Related papers
- OT-Transformer: A Continuous-time Transformer Architecture with Optimal Transport Regularization [1.7180235064112577]
We consider a dynamical system whose governing equation is parametrized by transformer blocks.
We leverage optimal transport theory to regularize the training problem, which enhances stability in training and improves generalization of the resulting model.
arXiv Detail & Related papers (2025-01-30T22:52:40Z)
- Coordinate In and Value Out: Training Flow Transformers in Ambient Space [6.911507447184487]
Ambient Space Flow Transformers (ASFT) is a domain-agnostic approach to learn flow matching transformers in ambient space.
We introduce a conditionally independent point-wise training objective that enables ASFT to make predictions continuously in coordinate space.
arXiv Detail & Related papers (2024-12-05T01:00:07Z)
- Emergent Agentic Transformer from Chain of Hindsight Experience [96.56164427726203]
We show that a simple transformer-based model performs competitively with both temporal-difference and imitation-learning-based approaches.
To our knowledge, this is the first time a simple transformer-based model has matched both families of approaches.
arXiv Detail & Related papers (2023-05-26T00:43:02Z)
- TADA: Efficient Task-Agnostic Domain Adaptation for Transformers [3.9379577980832843]
In this work, we introduce TADA, a novel task-agnostic domain adaptation method.
Within TADA, we retrain embeddings to learn domain-aware input representations and tokenizers for the transformer encoder.
We conduct experiments with meta-embeddings and newly introduced meta-tokenizers, resulting in one model per task in multi-domain use cases.
arXiv Detail & Related papers (2023-05-22T04:53:59Z)
- Transformer-based approaches to Sentiment Detection [55.41644538483948]
We examined the performance of four different types of state-of-the-art transformer models for text classification.
The RoBERTa transformer model performs best on the test dataset with a score of 82.6% and is highly recommended for quality predictions.
arXiv Detail & Related papers (2023-03-13T17:12:03Z)
- Boosting Transformers for Job Expression Extraction and Classification in a Low-Resource Setting [12.489741131691737]
We present our approaches to tackle the extraction and classification of job expressions in Spanish texts.
As neither language nor domain experts, we experiment with the multilingual XLM-R transformer model.
Our results show strong improvements of up to 5.3 F1 points from these methods compared to a fine-tuned XLM-R model.
arXiv Detail & Related papers (2021-09-17T15:21:02Z)
- On the validity of pre-trained transformers for natural language processing in the software engineering domain [78.32146765053318]
We compare BERT transformer models trained with software engineering data with transformers based on general domain data.
Our results show that for tasks that require understanding of the software engineering context, pre-training with software engineering data is valuable.
arXiv Detail & Related papers (2021-09-10T08:46:31Z)
- Applying the Transformer to Character-level Transduction [68.91664610425114]
The transformer has been shown to outperform recurrent neural network-based sequence-to-sequence models in various word-level NLP tasks.
We show that with a large enough batch size, the transformer does indeed outperform recurrent models for character-level tasks.
arXiv Detail & Related papers (2020-05-20T17:25:43Z)
- Variational Transformers for Diverse Response Generation [71.53159402053392]
Variational Transformer (VT) is a variational self-attentive feed-forward sequence model.
VT combines the parallelizability and global receptive field computation of the Transformer with the variational nature of the CVAE.
We explore two types of VT: 1) modeling the discourse-level diversity with a global latent variable; and 2) augmenting the Transformer decoder with a sequence of fine-grained latent variables.
arXiv Detail & Related papers (2020-03-28T07:48:02Z)
- Hierarchical Transformer Network for Utterance-level Emotion Recognition [0.0]
We address some challenges in utterance-level emotion recognition (ULER).
Unlike the traditional text classification problem, this task is supported by a limited number of datasets.
We use a pretrained language model, Bidirectional Encoder Representations from Transformers (BERT), as the lower-level transformer.
In addition, we add speaker embeddings to the model for the first time, which enables our model to capture the interaction between speakers.
arXiv Detail & Related papers (2020-02-18T13:44:49Z)
- A Simple Baseline to Semi-Supervised Domain Adaptation for Machine Translation [73.3550140511458]
State-of-the-art neural machine translation (NMT) systems are data-hungry and perform poorly on new domains with no supervised data.
We propose a simple but effective approach to the semi-supervised domain adaptation scenario of NMT.
This approach iteratively trains a Transformer-based NMT model via three training objectives: language modeling, back-translation, and supervised translation.
arXiv Detail & Related papers (2020-01-22T16:42:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.