Transformer-based Methods for Recognizing Ultra Fine-grained Entities (RUFES)
- URL: http://arxiv.org/abs/2104.06048v1
- Date: Tue, 13 Apr 2021 09:23:16 GMT
- Title: Transformer-based Methods for Recognizing Ultra Fine-grained Entities (RUFES)
- Authors: Emanuela Boros and Antoine Doucet
- Abstract summary: This paper summarizes the participation of the Laboratoire Informatique, Image et Interaction (L3i laboratory) of the University of La Rochelle in the Recognizing Ultra Fine-grained Entities (RUFES) track within the Text Analysis Conference (TAC) series of evaluation workshops.
Our participation relies on two neural models: one based on a pre-trained and fine-tuned language model topped with a stack of Transformer layers for fine-grained entity extraction, and one out-of-the-box model for within-document entity coreference.
- Score: 1.456207068672607
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper summarizes the participation of the Laboratoire Informatique,
Image et Interaction (L3i laboratory) of the University of La Rochelle in the
Recognizing Ultra Fine-grained Entities (RUFES) track within the Text Analysis
Conference (TAC) series of evaluation workshops. Our participation relies on
two neural-based models, one based on a pre-trained and fine-tuned language
model with a stack of Transformer layers for fine-grained entity extraction and
one out-of-the-box model for within-document entity coreference. We observe
that our approach has great potential for improving fine-grained entity
recognition performance. Future work will therefore focus on strengthening the
models through additional experiments and a deeper analysis of the results.
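To make the described architecture concrete, below is a minimal sketch of a token-level fine-grained entity tagger that places a stack of additional Transformer layers on top of a pre-trained language model, as the abstract describes. The encoder name, number of extra layers, attention heads, and type-inventory size are illustrative assumptions, not the authors' configuration, and the out-of-the-box coreference component is omitted.

```python
# Minimal sketch (not the authors' released code): a token-level fine-grained
# entity tagger that stacks extra Transformer layers on a pre-trained encoder.
# Encoder name, 2 extra layers, 8 heads, and 200 types are illustrative guesses.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer


class FineGrainedEntityTagger(nn.Module):
    def __init__(self, encoder_name="bert-base-cased", num_types=200, extra_layers=2):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=8, batch_first=True)
        # The "stack of Transformer layers" added on top of the fine-tuned encoder.
        self.extra = nn.TransformerEncoder(layer, num_layers=extra_layers)
        self.classifier = nn.Linear(hidden, num_types)  # one logit per fine-grained type

    def forward(self, input_ids, attention_mask):
        states = self.encoder(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        # Padding positions are masked out of the additional self-attention layers.
        states = self.extra(states, src_key_padding_mask=(attention_mask == 0))
        return self.classifier(states)  # (batch, seq_len, num_types) token logits


tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = FineGrainedEntityTagger()
batch = tokenizer(["The L3i laboratory is part of the University of La Rochelle."],
                  return_tensors="pt")
with torch.no_grad():
    logits = model(batch["input_ids"], batch["attention_mask"])
print(logits.shape)  # (1, sequence_length, 200)
```

In the pipeline described above, within-document coreference is handled separately by an off-the-shelf model; its output would then be combined with these token-level type predictions at the document level.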
Related papers
- An Active Learning Framework for Inclusive Generation by Large Language Models [32.16984263644299]
Large Language Models (LLMs) generate text representative of diverse sub-populations.
We propose a novel clustering-based active learning framework, enhanced with knowledge distillation.
We construct two new datasets in tandem with model training, showing a performance improvement of 2%-10% over baseline models.
arXiv Detail & Related papers (2024-10-17T15:09:35Z) - Integrating Contrastive Learning into a Multitask Transformer Model for
Effective Domain Adaptation [4.157415305926585]
We propose a novel domain adaptation technique that embodies a multitask framework with SER as the primary task.
We show that our proposed model achieves state-of-the-art performance in SER within cross-corpus scenarios.
arXiv Detail & Related papers (2023-10-07T06:41:29Z) - The Languini Kitchen: Enabling Language Modelling Research at Different
Scales of Compute [66.84421705029624]
We introduce an experimental protocol that enables model comparisons based on equivalent compute, measured in accelerator hours.
We pre-process an existing large, diverse, and high-quality dataset of books that surpasses existing academic benchmarks in quality, diversity, and document length.
This work also provides two baseline models: a feed-forward model derived from the GPT-2 architecture and a recurrent model in the form of a novel LSTM with ten-fold throughput.
arXiv Detail & Related papers (2023-09-20T10:31:17Z) - Extensive Evaluation of Transformer-based Architectures for Adverse Drug
Events Extraction [6.78974856327994]
Adverse Event (ADE) extraction is one of the core tasks in digital pharmacovigilance.
We evaluate 19 Transformer-based models for ADE extraction on informal texts.
At the end of our analyses, we identify a list of take-home messages that can be derived from the experimental data.
arXiv Detail & Related papers (2023-06-08T15:25:24Z) - UniDiff: Advancing Vision-Language Models with Generative and
Discriminative Learning [86.91893533388628]
This paper presents UniDiff, a unified multi-modal model that integrates image-text contrastive learning (ITC), text-conditioned image synthesis learning (IS), and reciprocal semantic consistency modeling (RSC).
UniDiff demonstrates versatility in both multi-modal understanding and generative tasks.
arXiv Detail & Related papers (2023-06-01T15:39:38Z) - Scaling Vision-Language Models with Sparse Mixture of Experts [128.0882767889029]
We show that mixture-of-experts (MoE) techniques can achieve state-of-the-art performance on a range of benchmarks over dense models of equivalent computational cost.
Our research offers valuable insights into stabilizing the training of MoE models, understanding the impact of MoE on model interpretability, and balancing the trade-off between compute and performance when scaling vision-language models.
arXiv Detail & Related papers (2023-03-13T16:00:31Z) - UViM: A Unified Modeling Approach for Vision with Learned Guiding Codes [91.24112204588353]
We introduce UViM, a unified approach capable of modeling a wide range of computer vision tasks.
In contrast to previous models, UViM has the same functional form for all tasks.
We demonstrate the effectiveness of UViM on three diverse and challenging vision tasks.
arXiv Detail & Related papers (2022-05-20T17:47:59Z) - Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z) - CNRL at SemEval-2020 Task 5: Modelling Causal Reasoning in Language with
Multi-Head Self-Attention Weights based Counterfactual Detection [0.15229257192293202]
We use pre-trained transformer models to extract contextual embeddings and self-attention weights from the text.
We show the use of convolutional layers to extract task-specific features from these self-attention weights.
arXiv Detail & Related papers (2020-05-31T21:02:25Z) - Rethinking Generalization of Neural Models: A Named Entity Recognition
Case Study [81.11161697133095]
We take the NER task as a testbed to analyze the generalization behavior of existing models from different perspectives.
Experiments with in-depth analyses diagnose the bottleneck of existing neural NER models.
As a by-product of this paper, we have open-sourced a project that involves a comprehensive summary of recent NER papers.
arXiv Detail & Related papers (2020-01-12T04:33:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.