Few shot clinical entity recognition in three languages: Masked language
models outperform LLM prompting
- URL: http://arxiv.org/abs/2402.12801v1
- Date: Tue, 20 Feb 2024 08:20:49 GMT
- Title: Few shot clinical entity recognition in three languages: Masked language
models outperform LLM prompting
- Authors: Marco Naguib, Xavier Tannier, Aurélie Névéol
- Abstract summary: We evaluate named entity recognition in English, French and Spanish using 8 in-domain (clinical) and 6 out-domain gold standard corpora.
We create a few-shot set-up by limiting the amount of annotated data available to 100 sentences.
Our experiments show that although larger prompt-based models tend to achieve competitive F-measure for named entity recognition outside the clinical domain, this level of performance does not carry over to the clinical domain.
- Score: 2.3357645240384874
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Large Language Models are becoming the go-to solution for many natural
language processing tasks, including in specialized domains where their
few-shot capacities are expected to yield high performance in low-resource
settings. Herein, we aim to assess the performance of Large Language Models for
few-shot clinical entity recognition in multiple languages. We evaluate named
entity recognition in English, French and Spanish using 8 in-domain (clinical)
and 6 out-domain gold standard corpora. We assess the performance of 10
auto-regressive language models using prompting and 16 masked language models
used for text encoding in a biLSTM-CRF supervised tagger. We create a few-shot
set-up by limiting the amount of annotated data available to 100 sentences. Our
experiments show that although larger prompt-based models tend to achieve
competitive F-measure for named entity recognition outside the clinical domain,
this level of performance does not carry over to the clinical domain where
lighter supervised taggers relying on masked language models perform better,
even with the performance drop incurred by the few-shot set-up. In all
experiments, the CO2 impact of masked language models is lower than that of
auto-regressive models. Results are consistent across the three languages and
suggest that few-shot learning using Large Language Models is not
production-ready for named entity recognition in the clinical domain. Instead,
such models could be used to speed up the production of gold standard annotated data.
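
As a concrete illustration of the supervised baseline described above, the sketch below shows a tagger in the spirit of the paper's set-up: a frozen masked language model is used purely for text encoding, and a small BiLSTM head predicts per-token entity tags. This is not the authors' code; the model name, tag set, and hyper-parameters are illustrative assumptions, and a plain softmax output layer stands in for the CRF layer used in the paper.

```python
# Minimal sketch (not the authors' code) of a masked-LM + BiLSTM tagging baseline.
# Model name, tag set and hyper-parameters are illustrative assumptions; a plain
# softmax layer replaces the CRF used in the paper.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-multilingual-cased"   # assumption; the paper compares 16 masked LMs
TAGS = ["O", "B-PROBLEM", "I-PROBLEM"]        # hypothetical clinical tag set


class BiLSTMTagger(nn.Module):
    def __init__(self, encoder_name: str, num_tags: int, hidden: int = 256):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        for p in self.encoder.parameters():    # freeze the masked LM; only the head is trained
            p.requires_grad = False
        self.lstm = nn.LSTM(self.encoder.config.hidden_size, hidden,
                            batch_first=True, bidirectional=True)
        self.proj = nn.Linear(2 * hidden, num_tags)

    def forward(self, input_ids, attention_mask):
        with torch.no_grad():                  # encoder is used only for text encoding
            reps = self.encoder(input_ids=input_ids,
                                attention_mask=attention_mask).last_hidden_state
        out, _ = self.lstm(reps)
        return self.proj(out)                  # per-token tag logits


tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = BiLSTMTagger(MODEL_NAME, num_tags=len(TAGS))

# Few-shot regime as in the paper: only 100 annotated sentences would be used for training.
sentences = ["Patient reports chest pain since yesterday."]   # placeholder example
batch = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True)
logits = model(batch["input_ids"], batch["attention_mask"])
print(logits.shape)                            # (batch, sequence length, number of tags)
```

In the paper's biLSTM-CRF set-up, a CRF layer would sit on top of these logits (e.g. via a library such as pytorch-crf), and training would use the token-level tags of the 100-sentence few-shot set.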
Related papers
- Benchmarking Pre-trained Large Language Models' Potential Across Urdu NLP tasks [0.9786690381850356]
Large Language Models (LLMs) pre-trained on multilingual data have revolutionized natural language processing research.
This study presents an in-depth examination of prominent LLMs, across 14 tasks using 15 Urdu datasets.
Experiments show that SOTA models surpass all the encoder-decoder pre-trained language models in all Urdu NLP tasks with zero-shot learning.
arXiv Detail & Related papers (2024-05-24T11:30:37Z)
- KBioXLM: A Knowledge-anchored Biomedical Multilingual Pretrained Language Model [37.69464822182714]
Most biomedical pretrained language models are monolingual and cannot handle the growing cross-lingual requirements.
We propose a model called KBioXLM, which transforms the multilingual pretrained model XLM-R into the biomedical domain using a knowledge-anchored approach.
arXiv Detail & Related papers (2023-11-20T07:02:35Z)
- Improving Massively Multilingual ASR With Auxiliary CTC Objectives [40.10307386370194]
We introduce our work on improving performance on FLEURS, a 102-language open ASR benchmark.
We investigate techniques inspired by recent Connectionist Temporal Classification (CTC) studies to help the model handle the large number of languages.
Our state-of-the-art systems using self-supervised models with the Conformer architecture improve over the results of prior work on FLEURS by a relative 28.4% CER.
arXiv Detail & Related papers (2023-02-24T18:59:51Z)
- Multi-lingual Evaluation of Code Generation Models [82.7357812992118]
We present new benchmarks for evaluating code generation models: MBXP, Multilingual HumanEval, and MathQA-X.
These datasets cover over 10 programming languages.
We are able to assess the performance of code generation models in a multi-lingual fashion.
arXiv Detail & Related papers (2022-10-26T17:17:06Z)
- Few-Shot Cross-lingual Transfer for Coarse-grained De-identification of Code-Mixed Clinical Texts [56.72488923420374]
Pre-trained language models (LMs) have shown great potential for cross-lingual transfer in low-resource settings.
We show the few-shot cross-lingual transfer property of LMs for named entity recognition (NER) and apply it to a low-resource, real-world challenge: de-identification of code-mixed (Spanish-Catalan) clinical notes in the stroke domain.
arXiv Detail & Related papers (2022-04-10T21:46:52Z)
- Improving the Lexical Ability of Pretrained Language Models for Unsupervised Neural Machine Translation [127.81351683335143]
Cross-lingual pretraining requires models to align the lexical- and high-level representations of the two languages.
Previous research has shown that these models remain weak at aligning lexical representations across languages.
In this paper, we enhance the bilingual masked language model pretraining with lexical-level information by using type-level cross-lingual subword embeddings.
arXiv Detail & Related papers (2021-03-18T21:17:58Z)
- UNKs Everywhere: Adapting Multilingual Language Models to New Scripts [103.79021395138423]
Massively multilingual language models such as multilingual BERT (mBERT) and XLM-R offer state-of-the-art cross-lingual transfer performance on a range of NLP tasks.
Due to their limited capacity and large differences in pretraining data, there is a profound performance gap between resource-rich and resource-poor target languages.
We propose novel data-efficient methods that enable quick and effective adaptation of pretrained multilingual models to such low-resource languages and unseen scripts.
arXiv Detail & Related papers (2020-12-31T11:37:28Z)
- Unsupervised Domain Adaptation of a Pretrained Cross-Lingual Language Model [58.27176041092891]
Recent research indicates that pretraining cross-lingual language models on large-scale unlabeled texts yields significant performance improvements.
We propose a novel unsupervised feature decomposition method that can automatically extract domain-specific features from the entangled pretrained cross-lingual representations.
Our proposed model leverages mutual information estimation to decompose the representations computed by a cross-lingual model into domain-invariant and domain-specific parts.
arXiv Detail & Related papers (2020-11-23T16:00:42Z)
- Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation [81.7786241489002]
Massively multilingual models for neural machine translation (NMT) are theoretically attractive, but often underperform bilingual models and deliver poor zero-shot translations.
We argue that multilingual NMT requires stronger modeling capacity to support language pairs with varying typological characteristics.
We propose random online backtranslation to enforce the translation of unseen training language pairs.
arXiv Detail & Related papers (2020-04-24T17:21:32Z)