GERNERMED++: Transfer Learning in German Medical NLP
- URL: http://arxiv.org/abs/2206.14504v1
- Date: Wed, 29 Jun 2022 09:53:10 GMT
- Title: GERNERMED++: Transfer Learning in German Medical NLP
- Authors: Johann Frei, Ludwig Frei-Stuber, Frank Kramer
- Abstract summary: We present a statistical model for German medical natural language processing, trained for named entity recognition (NER).
The work is a refined successor to our first GERNERMED model, which it substantially outperforms.
- Score: 0.46408356903366527
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We present a statistical model for German medical natural language processing, trained for named entity recognition (NER), as an open, publicly available model. The work serves as a refined successor to our first GERNERMED model, which it substantially outperforms. We demonstrate that strong entity recognition performance can be achieved by combining several techniques: transfer learning on pretrained deep language models (LMs), word alignment, and neural machine translation. Given the scarcity of open, public medical entity recognition models for German text, this work provides the German medical NLP research community with a baseline model. Since our model is based on public English data, its weights are provided without legal restrictions on usage and distribution. The sample code and the statistical model are available at: https://github.com/frankkramer-lab/GERNERMED-pp
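The core transfer technique described above, carrying NER annotations from public English data over to machine-translated German sentences via a word alignment, can be illustrated with a small, self-contained sketch. This is a simplified illustration only; the token indices, alignment pairs, and entity labels below are invented placeholders, not data or code from the paper or its repository.

```python
# Minimal sketch of annotation projection via word alignment (the
# translate-and-project idea): token-level entity spans annotated on an
# English sentence are mapped onto its German machine translation by
# following source-to-target alignment links. All example data is invented.
from typing import List, Tuple


def project_entities(
    src_entities: List[Tuple[int, int, str]],  # (start, end, label) token spans on the English side
    alignment: List[Tuple[int, int]],          # (src_token_idx, tgt_token_idx) word-alignment pairs
) -> List[Tuple[int, int, str]]:
    """Project token-level entity spans from source to target via word alignment."""
    projected = []
    for start, end, label in src_entities:
        # Collect every target token aligned to a source token inside the span.
        tgt_positions = [t for s, t in alignment if start <= s < end]
        if not tgt_positions:
            continue  # span has no aligned target tokens; drop it
        projected.append((min(tgt_positions), max(tgt_positions) + 1, label))
    return projected


if __name__ == "__main__":
    # EN: "Take 400 mg ibuprofen daily"  ->  DE: "Nehmen Sie täglich 400 mg Ibuprofen"
    alignment = [(0, 0), (1, 3), (2, 4), (3, 5), (4, 2)]
    entities = [(1, 3, "DOSAGE"), (3, 4, "DRUG"), (4, 5, "FREQUENCY")]
    print(project_entities(entities, alignment))
    # [(3, 5, 'DOSAGE'), (5, 6, 'DRUG'), (2, 3, 'FREQUENCY')]
```

In practice the German sentences would come from a neural machine translation system and the alignment from a word aligner; the projected spans then serve as training data for the German NER model.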
Related papers
- Comprehensive Study on German Language Models for Clinical and Biomedical Text Understanding [16.220303664681172]
We pre-trained several German medical language models on 2.4B tokens derived from translated public English medical data and 3B tokens of German clinical data.
The resulting models were evaluated on various German downstream tasks, including named entity recognition (NER), multi-label classification, and extractive question answering.
We conclude that continuous pre-training can match or even exceed the performance of clinical models trained from scratch.
arXiv Detail & Related papers (2024-04-08T17:24:04Z)
- On Significance of Subword tokenization for Low Resource and Efficient Named Entity Recognition: A case study in Marathi [1.6383036433216434]
We focus on NER for a low-resource language and present our case study in the context of the Indian language Marathi.
We propose a hybrid approach for efficient NER by integrating a BERT-based subword tokenizer into vanilla CNN/LSTM models.
We show that this simple approach of replacing a traditional word-based tokenizer with a BERT-tokenizer brings the accuracy of vanilla single-layer models closer to that of deep pre-trained models like BERT.
arXiv Detail & Related papers (2023-12-03T06:53:53Z)
- Cross-Lingual NER for Financial Transaction Data in Low-Resource Languages [70.25418443146435]
We propose an efficient modeling framework for cross-lingual named entity recognition in semi-structured text data.
We employ two independent datasets of SMSs in English and Arabic, each carrying semi-structured banking transaction information.
With access to only 30 labeled samples, our model can generalize the recognition of merchants, amounts, and other fields from English to Arabic.
arXiv Detail & Related papers (2023-07-16T00:45:42Z)
- German BERT Model for Legal Named Entity Recognition [0.43461794560295636]
We fine-tune a popular BERT language model trained on German data (German BERT) on a Legal Entity Recognition (LER) dataset.
The fine-tuned German BERT outperforms the BiLSTM-CRF+ model used by the authors of the LER dataset (a minimal fine-tuning sketch follows the related-papers list below).
arXiv Detail & Related papers (2023-03-07T11:54:39Z)
- Better Datastore, Better Translation: Generating Datastores from Pre-Trained Models for Nearest Neural Machine Translation [48.58899349349702]
Nearest Neighbor Machine Translation (kNN-MT) is a simple and effective method of augmenting neural machine translation (NMT) with a token-level nearest neighbor retrieval mechanism.
In this paper, we propose PRED, a framework that leverages Pre-trained models for Datastores in kNN-MT.
arXiv Detail & Related papers (2022-12-17T08:34:20Z)
- GERNERMED -- An Open German Medical NER Model [0.7310043452300736]
Data mining in the field of medical data analysis often has to rely solely on the processing of unstructured data to retrieve relevant information.
In this work, we present GERNERMED, the first open, neural NLP model for NER tasks dedicated to detecting medical entity types in German text data.
arXiv Detail & Related papers (2021-09-24T17:53:47Z)
- Improving Multilingual Translation by Representation and Gradient Regularization [82.42760103045083]
We propose a joint approach to regularize NMT models at both representation-level and gradient-level.
Our results demonstrate that our approach is highly effective in both reducing off-target translation occurrences and improving zero-shot translation performance.
arXiv Detail & Related papers (2021-09-10T10:52:21Z)
- Paraphrastic Representations at Scale [134.41025103489224]
We release trained models for English, Arabic, German, French, Spanish, Russian, Turkish, and Chinese.
We train these models on large amounts of data, achieving significantly improved performance over the original papers.
arXiv Detail & Related papers (2021-04-30T16:55:28Z)
- German's Next Language Model [0.8520624117635327]
We present the experiments which led to the creation of our BERT and ELECTRA based German language models, GBERT and GELECTRA.
By varying the input training data, model size, and the presence of Whole Word Masking (WWM), we were able to attain SoTA performance across a set of document classification and named entity recognition tasks.
arXiv Detail & Related papers (2020-10-21T11:28:23Z)
- Pre-training Multilingual Neural Machine Translation by Leveraging Alignment Information [72.2412707779571]
mRASP is an approach to pre-train a universal multilingual neural machine translation model.
We carry out experiments on 42 translation directions across a diverse setting, including low-, medium-, and rich-resource languages, as well as transfer to exotic language pairs.
arXiv Detail & Related papers (2020-10-07T03:57:54Z)
- Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing [73.37262264915739]
We show that for domains with abundant unlabeled text, such as biomedicine, pretraining language models from scratch results in substantial gains.
Our experiments show that domain-specific pretraining serves as a solid foundation for a wide range of biomedical NLP tasks.
arXiv Detail & Related papers (2020-07-31T00:04:15Z)
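As a companion to the German BERT for Legal NER entry above (and to the transfer-learning recipe of the main paper), the following is a minimal, hypothetical sketch of fine-tuning a public German BERT checkpoint for token classification with the Hugging Face transformers library. The two toy sentences, the DRUG/DOSAGE label set, and all hyperparameters are illustrative placeholders; none of the listed papers is claimed to use this exact code.

```python
# Hypothetical fine-tuning sketch: token classification (NER) on top of a
# public German BERT checkpoint. Toy data only; replace with a real corpus.
import torch
from torch.utils.data import Dataset
from transformers import (AutoModelForTokenClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

LABELS = ["O", "B-DRUG", "I-DRUG", "B-DOSAGE", "I-DOSAGE"]
LABEL2ID = {label: i for i, label in enumerate(LABELS)}

# Tiny placeholder corpus: pre-tokenized words with word-level BIO tags.
SENTENCES = [
    (["Der", "Patient", "erhält", "400", "mg", "Ibuprofen"],
     ["O", "O", "O", "B-DOSAGE", "I-DOSAGE", "B-DRUG"]),
    (["Therapie", "mit", "Metformin", "fortsetzen"],
     ["O", "O", "B-DRUG", "O"]),
]

tokenizer = AutoTokenizer.from_pretrained("bert-base-german-cased")


class ToyNerDataset(Dataset):
    def __len__(self):
        return len(SENTENCES)

    def __getitem__(self, idx):
        words, tags = SENTENCES[idx]
        enc = tokenizer(words, is_split_into_words=True, truncation=True,
                        padding="max_length", max_length=32)
        # Align word-level tags to subword tokens: ignore (-100) special tokens
        # and every subword after the first one of each word.
        labels, prev_word = [], None
        for word_id in enc.word_ids():
            if word_id is None or word_id == prev_word:
                labels.append(-100)
            else:
                labels.append(LABEL2ID[tags[word_id]])
            prev_word = word_id
        item = {key: torch.tensor(val) for key, val in enc.items()}
        item["labels"] = torch.tensor(labels)
        return item


model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-german-cased", num_labels=len(LABELS))

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="german-ner-out", num_train_epochs=1,
                           per_device_train_batch_size=2, logging_steps=1),
    train_dataset=ToyNerDataset(),
)
trainer.train()
```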