Adversarial Adaptation for French Named Entity Recognition
- URL: http://arxiv.org/abs/2301.05220v1
- Date: Thu, 12 Jan 2023 18:58:36 GMT
- Title: Adversarial Adaptation for French Named Entity Recognition
- Authors: Arjun Choudhry, Inder Khatri, Pankaj Gupta, Aaryan Gupta, Maxime
Nicol, Marie-Jean Meurs, Dinesh Kumar Vishwakarma
- Abstract summary: We propose a Transformer-based NER approach for French, using adversarial adaptation to similar domain or general corpora.
Our approach allows learning better features using large-scale unlabeled corpora from the same domain or mixed domains.
We also show that adversarial adaptation to large-scale unlabeled corpora can help mitigate the performance dip incurred when using Transformer models pre-trained on smaller corpora.
- Score: 21.036698406367115
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Named Entity Recognition (NER) is the task of identifying and classifying
named entities in large-scale texts into predefined classes. NER in French and
other relatively limited-resource languages cannot always benefit from
approaches proposed for languages like English due to a dearth of large, robust
datasets. In this paper, we present our work that aims to mitigate the effects
of this dearth of large, labeled datasets. We propose a Transformer-based NER
approach for French, using adversarial adaptation to similar domain or general
corpora to improve feature extraction and enable better generalization. Our
approach allows learning better features using large-scale unlabeled corpora
from the same domain or mixed domains to introduce more variations during
training and reduce overfitting. Experimental results on three labeled datasets
show that our adaptation framework outperforms the corresponding non-adaptive
models for various combinations of Transformer models, source datasets, and
target corpora. We also show that adversarial adaptation to large-scale
unlabeled corpora can help mitigate the performance dip incurred when using
Transformer models pre-trained on smaller corpora.
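The abstract names adversarial adaptation but not its exact mechanics. A DANN-style gradient reversal layer is the standard way to realize adversarial adaptation to an unlabeled corpus, so the following is a minimal sketch under that assumption; the Hugging Face-style encoder interface, the class names, and the first-token pooling choice are ours, not necessarily the paper's.

```python
import torch
import torch.nn as nn

class GradientReversal(torch.autograd.Function):
    """Identity in the forward pass; scales gradients by -lambda in the
    backward pass, so the encoder learns to *confuse* the domain head."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

class AdversarialNER(nn.Module):
    """Transformer encoder with a token-level NER head plus a sentence-level
    domain discriminator fed through gradient reversal (DANN-style sketch)."""
    def __init__(self, encoder, hidden_size, num_labels):
        super().__init__()
        self.encoder = encoder  # e.g. a pre-trained French Transformer
        self.ner_head = nn.Linear(hidden_size, num_labels)
        self.domain_head = nn.Sequential(  # labeled source vs. unlabeled target
            nn.Linear(hidden_size, hidden_size), nn.ReLU(),
            nn.Linear(hidden_size, 2))

    def forward(self, input_ids, attention_mask, lambd=1.0):
        h = self.encoder(input_ids=input_ids,
                         attention_mask=attention_mask).last_hidden_state
        ner_logits = self.ner_head(h)   # per-token entity logits
        pooled = h[:, 0]                # first-token sentence representation
        domain_logits = self.domain_head(GradientReversal.apply(pooled, lambd))
        return ner_logits, domain_logits
```

Training would mix batches: token-level cross-entropy on labeled source data through ner_logits, and domain cross-entropy on combined source and target batches through domain_logits. The reversed gradient pushes the encoder toward features the discriminator cannot separate, which is one concrete reading of "adaptation to similar domain or general corpora".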
Related papers
- Efficient Language Model Architectures for Differentially Private Federated Learning [21.280600854272716]
Cross-device federated learning (FL) is a technique that trains a model on data distributed, typically, across millions of edge devices, without data leaving the devices.
In centralized training of language models, adaptive optimizers are preferred, as they offer improved stability and performance.
We propose a scale-invariant Coupled Input Forget Gate (SI CIFG) recurrent network, modifying the sigmoid and tanh activations in the recurrent cell; a plain CIFG cell is sketched below for reference.
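The scale-invariant activations themselves are not described in this summary, so the sketch below shows only the base design being modified: a standard Coupled Input Forget Gate cell, in which the input gate is tied to one minus the forget gate, saving one gate's parameters relative to an LSTM.

```python
import torch
import torch.nn as nn

class CIFGCell(nn.Module):
    """Coupled Input Forget Gate cell: the input gate is (1 - forget gate),
    so only three gates are parameterized instead of an LSTM's four. The
    paper further swaps in scale-invariant activations (not shown here)."""
    def __init__(self, input_size, hidden_size):
        super().__init__()
        # One joint projection for forget gate, cell candidate, output gate.
        self.linear = nn.Linear(input_size + hidden_size, 3 * hidden_size)

    def forward(self, x, state):
        h, c = state
        f, g, o = self.linear(torch.cat([x, h], dim=-1)).chunk(3, dim=-1)
        f = torch.sigmoid(f)                     # forget gate
        c = f * c + (1.0 - f) * torch.tanh(g)    # coupled input gate = 1 - f
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, (h, c)
```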
arXiv Detail & Related papers (2024-03-12T22:21:48Z)
- Consistency Regularization for Generalizable Source-free Domain Adaptation [62.654883736925456]
Source-free domain adaptation (SFDA) aims to adapt a well-trained source model to an unlabelled target domain without accessing the source dataset.
Existing SFDA methods only assess their adapted models on the target training set, neglecting data from unseen but identically distributed testing sets.
We propose a consistency regularization framework to develop a more generalizable SFDA method; a generic consistency loss is sketched below.
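The summary does not spell out the regularizer, so this is only a generic consistency-regularization sketch, assuming a classifier `model` that returns logits and two augmented views of the same unlabeled target batch; the paper's actual framework may differ.

```python
import torch
import torch.nn.functional as F

def consistency_loss(model, x_weak, x_strong):
    """Generic consistency regularization: predictions on a strongly
    augmented view are pushed toward the (detached) predictions on a
    weakly augmented view of the same unlabeled target batch."""
    with torch.no_grad():
        p_weak = F.softmax(model(x_weak), dim=-1)        # pseudo-target
    log_p_strong = F.log_softmax(model(x_strong), dim=-1)
    return F.kl_div(log_p_strong, p_weak, reduction="batchmean")
```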
arXiv Detail & Related papers (2023-08-03T07:45:53Z)
- Transformer-Based Named Entity Recognition for French Using Adversarial Adaptation to Similar Domain Corpora [21.036698406367115]
We propose a transformer-based NER approach for French using adversarial adaptation to similar domain or general corpora.
We evaluate our approach on three labelled datasets and show that our adaptation framework outperforms the corresponding non-adaptive models.
arXiv Detail & Related papers (2022-12-05T23:33:36Z)
- Domain Adaptation Principal Component Analysis: base linear method for learning with out-of-distribution data [55.41644538483948]
Domain adaptation is a popular paradigm in modern machine learning.
We present a method called Domain Adaptation Principal Component Analysis (DAPCA).
DAPCA finds a linear reduced data representation useful for solving the domain adaptation task; a simplified shared-projection sketch follows.
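DAPCA's actual objective reweights sample pairs to align the domains, which is not reproduced here. As a loose illustration of the underlying idea, one linear low-dimensional representation shared by both domains, the sketch below simply fits a single PCA on pooled source and target features; the function name and sizes are ours, not the paper's.

```python
import numpy as np
from sklearn.decomposition import PCA

def shared_linear_projection(X_source, X_target, n_components=50):
    """Illustration only: one PCA fit on pooled source and target features,
    so both domains are projected into the same linear subspace. DAPCA
    itself adds pair reweighting to align the domains (not shown)."""
    pca = PCA(n_components=n_components)
    pca.fit(np.vstack([X_source, X_target]))
    return pca.transform(X_source), pca.transform(X_target)
```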
arXiv Detail & Related papers (2022-08-28T21:10:56Z)
- MemSAC: Memory Augmented Sample Consistency for Large Scale Unsupervised Domain Adaptation [71.4942277262067]
We propose MemSAC, which exploits sample-level similarity across source and target domains to achieve discriminative transfer.
We provide in-depth analysis and insights into the effectiveness of MemSAC; a generic memory-bank sketch follows.
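MemSAC's exact consistency objective is not given in this summary. Below is a hedged, generic sketch of the underlying ingredient: a memory bank of recent source features whose labels are transferred to target samples by similarity. The class name and all sizes are illustrative, not from the paper.

```python
import torch
import torch.nn.functional as F

class SourceMemory:
    """FIFO memory of L2-normalized source features with one-hot labels;
    target samples borrow soft labels from their most similar entries."""
    def __init__(self, feat_dim, num_classes, size=4096):
        self.feats = torch.zeros(size, feat_dim)
        self.labels = torch.zeros(size, num_classes)
        self.ptr, self.size, self.num_classes = 0, size, num_classes

    @torch.no_grad()
    def push(self, feats, labels):
        # Enqueue a batch of source features and their integer labels.
        for f, y in zip(F.normalize(feats, dim=-1), labels):
            self.feats[self.ptr] = f
            self.labels[self.ptr] = F.one_hot(y, self.num_classes).float()
            self.ptr = (self.ptr + 1) % self.size

    def soft_labels(self, target_feats, temperature=0.07):
        # Similarity-weighted average of stored source labels per target sample.
        sim = F.normalize(target_feats, dim=-1) @ self.feats.t()
        return F.softmax(sim / temperature, dim=-1) @ self.labels
```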
arXiv Detail & Related papers (2022-07-25T17:55:28Z)
- Exploiting Local and Global Features in Transformer-based Extreme Multi-label Text Classification [28.28186933768281]
We propose an approach that combines the local and global features produced by Transformer models to improve the prediction power of the classifier.
Our experiments show that the proposed model either outperforms or is comparable to the state-of-the-art methods on benchmark datasets; one simple fusion of the two feature types is sketched below.
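As a hedged sketch of the general idea, the head below concatenates a global [CLS] feature with max-pooled token-level (local) features before a multi-label head; the paper's actual fusion and head design may differ, and `LocalGlobalHead` is our name.

```python
import torch
import torch.nn as nn

class LocalGlobalHead(nn.Module):
    """Concatenate a global [CLS] feature with max-pooled token (local)
    features, then classify; a generic instance of local+global fusion."""
    def __init__(self, hidden_size, num_labels):
        super().__init__()
        self.classifier = nn.Linear(2 * hidden_size, num_labels)

    def forward(self, hidden_states, attention_mask):
        # hidden_states: (batch, seq, hidden); attention_mask: (batch, seq)
        global_feat = hidden_states[:, 0]  # [CLS] summary of the document
        masked = hidden_states.masked_fill(
            attention_mask.unsqueeze(-1) == 0, float("-inf"))
        local_feat = masked.max(dim=1).values  # strongest token per dimension
        return self.classifier(torch.cat([global_feat, local_feat], dim=-1))
```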
arXiv Detail & Related papers (2022-04-02T19:55:23Z)
- Improving Classifier Training Efficiency for Automatic Cyberbullying Detection with Feature Density [58.64907136562178]
We study the effectiveness of Feature Density (FD) using different linguistically-backed feature preprocessing methods.
We hypothesise that estimating dataset complexity allows for the reduction of the number of required experiments.
The difference in linguistic complexity of datasets allows us to additionally discuss the efficacy of linguistically-backed word preprocessing.
arXiv Detail & Related papers (2021-11-02T15:48:28Z)
- Dynamic Data Selection and Weighting for Iterative Back-Translation [116.14378571769045]
We propose a curriculum learning strategy for iterative back-translation models.
We evaluate our models on domain adaptation, low-resource, and high-resource MT settings.
Experimental results demonstrate that our methods achieve improvements of up to 1.8 BLEU points over competitive baselines; a generic curriculum-style selection loop is sketched below.
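The summary names data selection and weighting but not the criteria, so the following is only a minimal, generic curriculum loop over back-translated pairs, assuming a caller-supplied scoring function `score_fn`; everything here is illustrative, not the paper's method.

```python
import random

def curriculum_batches(pairs, score_fn, num_epochs, batch_size):
    """Generic curriculum sketch: rank synthetic (back-translated) pairs by
    a quality/difficulty score and widen the sampling pool each epoch."""
    ranked = sorted(pairs, key=score_fn)  # easiest/highest-quality first
    for epoch in range(1, num_epochs + 1):
        pool = ranked[: max(batch_size, len(ranked) * epoch // num_epochs)]
        random.shuffle(pool)
        for i in range(0, len(pool), batch_size):
            yield pool[i : i + batch_size]
```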
arXiv Detail & Related papers (2020-04-07T19:49:58Z)
- Adaptive Name Entity Recognition under Highly Unbalanced Data [5.575448433529451]
We present our experiments on a neural architecture composed of a Conditional Random Field (CRF) layer stacked on top of a Bi-directional LSTM (Bi-LSTM) layer for solving NER tasks.
We introduce an add-on classification model that splits sentences into two sets, Weak and Strong classes, and then design two Bi-LSTM-CRF models to optimize performance on each set; the base Bi-LSTM-CRF architecture is sketched below.
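A minimal sketch of the Bi-LSTM-CRF building block the summary describes, assuming the third-party pytorch-crf package for the CRF layer; embedding and hidden sizes are illustrative.

```python
import torch
import torch.nn as nn
from torchcrf import CRF  # third-party pytorch-crf package (assumption)

class BiLSTMCRF(nn.Module):
    """Bi-directional LSTM encoder with a CRF layer over tag sequences."""
    def __init__(self, vocab_size, num_tags, embed_dim=100, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim // 2,
                            bidirectional=True, batch_first=True)
        self.emissions = nn.Linear(hidden_dim, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def forward(self, tokens, tags=None, mask=None):
        # mask marks real (non-padding) tokens.
        h, _ = self.lstm(self.embed(tokens))
        e = self.emissions(h)
        if tags is not None:
            return -self.crf(e, tags, mask=mask)  # negative log-likelihood
        return self.crf.decode(e, mask=mask)      # Viterbi-best tag sequences
```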
arXiv Detail & Related papers (2020-03-10T06:56:52Z)
- Supervised Domain Adaptation using Graph Embedding [86.3361797111839]
Domain adaptation methods assume that the distributions of the two domains are shifted and attempt to realign them.
We propose a generic framework based on graph embedding.
We show that the proposed approach leads to a powerful Domain Adaptation framework; a toy graph-embedding loss is sketched below.
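The summary does not define the graphs, so as a hedged illustration of the within-class/between-class graph-embedding idea in supervised domain adaptation, the toy loss below pulls same-label source-target pairs together and pushes different-label pairs apart; the margin form is ours, not the paper's.

```python
import torch

def graph_embedding_loss(z_s, y_s, z_t, y_t, margin=1.0):
    """Toy graph-embedding loss: binary intrinsic graph (same label) pulls
    cross-domain pairs together; penalty graph (different label) pushes
    them apart up to a margin."""
    d = torch.cdist(z_s, z_t)  # pairwise source-target embedding distances
    same = (y_s.unsqueeze(1) == y_t.unsqueeze(0)).float()
    pull = (same * d.pow(2)).sum() / same.sum().clamp(min=1)
    push = ((1 - same) * torch.relu(margin - d).pow(2)).sum() \
           / (1 - same).sum().clamp(min=1)
    return pull + push
```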
arXiv Detail & Related papers (2020-03-09T12:25:13Z)