MT-BioNER: Multi-task Learning for Biomedical Named Entity Recognition
using Deep Bidirectional Transformers
- URL: http://arxiv.org/abs/2001.08904v1
- Date: Fri, 24 Jan 2020 07:16:32 GMT
- Title: MT-BioNER: Multi-task Learning for Biomedical Named Entity Recognition
using Deep Bidirectional Transformers
- Authors: Muhammad Raza Khan, Morteza Ziyadi and Mohamed AbdelHady
- Abstract summary: We consider the training of a slot tagger using multiple data sets covering different slot types as a multi-task learning problem.
The experimental results on the biomedical domain have shown that the proposed approach outperforms the previous state-of-the-art systems for slot tagging.
- Score: 1.7403133838762446
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conversational agents such as Cortana, Alexa and Siri continuously
increase their capabilities by adding new domains. Supporting a new domain
involves the design and development of a number of NLU components for domain
classification, intent classification and slot tagging (including named entity
recognition). This raises three problems. First, each component performs well
only when trained on a large amount of labeled data. Second, these components
are deployed on limited-memory devices, which requires some model compression.
Third, for some domains, such as the health domain, it is hard to find a single
training data set that covers all the required slot types. To overcome these
problems, we present a multi-task transformer-based neural architecture for
slot tagging: we treat the training of a slot tagger on multiple data sets
covering different slot types as a multi-task learning problem. Experimental
results on the biomedical domain show that the proposed approach outperforms
previous state-of-the-art slot-tagging systems on several benchmark biomedical
datasets in terms of both efficiency (time and memory) and effectiveness. The
output slot tagger can be used by the conversational agent to better identify
entities in the input utterances.
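As a rough illustration of the multi-task setup described in the abstract, the
sketch below shares a single transformer encoder across several data-set-specific
token-classification heads. This is a minimal sketch under assumptions of ours,
not the paper's actual architecture: the class name MultiTaskSlotTagger, the
bert-base-cased checkpoint, and the three hypothetical label inventories are all
illustrative.

```python
# Minimal sketch of a multi-task slot tagger: one shared transformer
# encoder, one token-classification head per training data set.
# Assumptions (not from the paper): head layout, label counts, model name.
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class MultiTaskSlotTagger(nn.Module):
    def __init__(self, encoder_name, labels_per_task):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        # One linear token-classification head per data set / slot family.
        self.heads = nn.ModuleDict(
            {task: nn.Linear(hidden, n) for task, n in labels_per_task.items()}
        )

    def forward(self, task, input_ids, attention_mask):
        # Shared contextual token representations from the encoder.
        states = self.encoder(
            input_ids=input_ids, attention_mask=attention_mask
        ).last_hidden_state
        # Per-token slot logits from the head of the active task only.
        return self.heads[task](states)

# Hypothetical label inventories (e.g. 3 BIO tags per entity type).
tagger = MultiTaskSlotTagger(
    "bert-base-cased", {"genes": 3, "chemicals": 3, "diseases": 3}
)
tok = AutoTokenizer.from_pretrained("bert-base-cased")
batch = tok(["aspirin inhibits cox-2"], return_tensors="pt")
logits = tagger("chemicals", batch["input_ids"], batch["attention_mask"])
print(logits.shape)  # torch.Size([1, seq_len, 3])
```

In training, mini-batches from the different data sets would be interleaved, and
each batch would update only its own head together with the shared encoder.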
Related papers
- Prompting Segment Anything Model with Domain-Adaptive Prototype for Generalizable Medical Image Segmentation [49.5901368256326]
We propose a novel Domain-Adaptive Prompt framework, termed DAPSAM, for fine-tuning the Segment Anything Model to segment medical images.
Our DAPSAM achieves state-of-the-art performance on two medical image segmentation tasks with different modalities.
arXiv Detail & Related papers (2024-09-19T07:28:33Z)
- MultiADE: A Multi-domain Benchmark for Adverse Drug Event Extraction [11.458594744457521]
Active adverse event surveillance monitors Adverse Drug Events (ADE) from different data sources.
Most datasets or shared tasks focus on extracting ADEs from a particular type of text.
Domain generalisation - the ability of a machine learning model to perform well on new, unseen domains (text types) - is under-explored.
We build a benchmark for adverse drug event extraction, which we named MultiADE.
arXiv Detail & Related papers (2024-05-28T09:57:28Z)
- Contrastive Learning and Mixture of Experts Enables Precise Vector Embeddings [0.0]
This paper improves upon the vector embeddings of scientific literature by assembling niche datasets using co-citations as a similarity metric.
We apply a novel Mixture of Experts (MoE) extension pipeline to pretrained BERT models, where every multi-layer perceptron section is enlarged and copied into multiple distinct experts (a rough sketch of this idea appears after this list).
Our MoE variants perform well over $N$ scientific domains with $N$ dedicated experts, whereas standard BERT models excel in only one domain.
arXiv Detail & Related papers (2024-01-28T17:34:42Z)
- MDViT: Multi-domain Vision Transformer for Small Medical Image Segmentation Datasets [19.44142290594537]
Vision transformers (ViTs) have emerged as a promising solution to improve medical image segmentation (MIS).
ViTs are typically trained on a single source of data, which overlooks the valuable knowledge that could be leveraged from other available datasets.
In this paper, we propose MDViT, the first multi-domain ViT, which includes domain adapters to mitigate data hunger and combat negative knowledge transfer (NKT).
arXiv Detail & Related papers (2023-07-05T08:19:29Z)
- Set-based Meta-Interpolation for Few-Task Meta-Learning [79.4236527774689]
We propose a novel domain-agnostic task augmentation method, Meta-Interpolation, to densify the meta-training task distribution.
We empirically validate the efficacy of Meta-Interpolation on eight datasets spanning various domains.
arXiv Detail & Related papers (2022-05-20T06:53:03Z)
- Unsupervised Domain Adaptive Learning via Synthetic Data for Person Re-identification [101.1886788396803]
Person re-identification (re-ID) has gained increasing attention due to its widespread applications in video surveillance.
Unfortunately, the mainstream deep learning methods still need a large quantity of labeled data to train models.
In this paper, we develop a data collector to automatically generate synthetic re-ID samples in a computer game, and construct a data labeler to simultaneously annotate them.
arXiv Detail & Related papers (2021-09-12T15:51:41Z)
- Streaming Self-Training via Domain-Agnostic Unlabeled Images [62.57647373581592]
We present streaming self-training (SST) that aims to democratize the process of learning visual recognition models.
Key to SST are two crucial observations: (1) domain-agnostic unlabeled images enable us to learn better models with a few labeled examples without any additional knowledge or supervision; and (2) learning is a continuous process and can be done by constructing a schedule of learning updates.
arXiv Detail & Related papers (2021-04-07T17:58:39Z)
- Sequential Sentence Classification in Research Papers using Cross-Domain Multi-Task Learning [4.2443814047515716]
We propose a uniform deep learning architecture and multi-task learning to improve sequential sentence classification in scientific texts across domains.
Our approach outperforms the state of the art on three benchmark datasets.
arXiv Detail & Related papers (2021-02-11T13:54:10Z)
- Linguistically-Enriched and Context-Aware Zero-shot Slot Filling [6.06746295810681]
Slot filling is one of the most important challenges in modern task-oriented dialog systems.
New domains (i.e., unseen in training) may emerge after deployment.
It is imperative that models seamlessly adapt and fill slots from both seen and unseen domains.
arXiv Detail & Related papers (2021-01-16T20:18:16Z)
- Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)
- mDALU: Multi-Source Domain Adaptation and Label Unification with Partial Datasets [102.62639692656458]
This paper treats this task as a multi-source domain adaptation and label unification problem.
Our method consists of a partially-supervised adaptation stage and a fully-supervised adaptation stage.
We verify the method on three different tasks: image classification, 2D semantic image segmentation, and joint 2D-3D semantic segmentation.
arXiv Detail & Related papers (2020-12-15T15:58:03Z)
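The Contrastive Learning and Mixture of Experts entry above mentions enlarging
and copying each multi-layer perceptron block of a pretrained BERT into multiple
distinct experts. The sketch below is a loose, hypothetical reading of that idea,
not the paper's code: the class name CopiedExpertFFN, the softmax router, and the
per-token expert weighting are all assumptions.

```python
# Illustrative sketch (not the paper's implementation): replace a
# transformer's feed-forward block with N experts initialized as
# copies of the pretrained block, routed per token.
import copy
import torch
import torch.nn as nn

class CopiedExpertFFN(nn.Module):  # hypothetical name
    def __init__(self, pretrained_ffn, hidden, num_experts):
        super().__init__()
        # Each expert starts as an exact copy of the pretrained MLP.
        self.experts = nn.ModuleList(
            copy.deepcopy(pretrained_ffn) for _ in range(num_experts)
        )
        self.router = nn.Linear(hidden, num_experts)  # assumed routing

    def forward(self, x):  # x: (batch, seq, hidden)
        weights = torch.softmax(self.router(x), dim=-1)   # (batch, seq, E)
        outputs = torch.stack([e(x) for e in self.experts], dim=-1)
        # Weighted per-token combination of the expert outputs.
        return (outputs * weights.unsqueeze(-2)).sum(dim=-1)

# Usage with a stand-in pretrained MLP block:
hidden = 768
ffn = nn.Sequential(nn.Linear(hidden, 4 * hidden), nn.GELU(),
                    nn.Linear(4 * hidden, hidden))
moe = CopiedExpertFFN(ffn, hidden, num_experts=4)
y = moe(torch.randn(2, 16, hidden))
print(y.shape)  # torch.Size([2, 16, 768])
```

Because every expert starts from the same pretrained weights, the block initially
reproduces the original model's behavior; the experts can then specialize as
domain-specific gradients diverge during fine-tuning.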