Related papers: Bilingual BSARD: Extending Statutory Article Retrieval to Dutch

URL: http://arxiv.org/abs/2412.07462v1
Date: Tue, 10 Dec 2024 12:31:33 GMT
Title: Bilingual BSARD: Extending Statutory Article Retrieval to Dutch
Authors: Ehsan Lotfi, Nikolay Banar, Nerses Yuzbashyan, Walter Daelemans
Abstract summary: This dataset contains parallel Belgian statutory articles in both French and Dutch, along with legal questions from BSARD and their Dutch translation. We conduct extensive benchmarking of retrieval models available for Dutch and French. Our experiments show that BM25 remains a competitive baseline compared to many zero-shot dense models in both languages.
Score: 3.11149191866066
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Statutory article retrieval plays a crucial role in making legal information more accessible to both laypeople and legal professionals. Multilingual countries like Belgium present unique challenges for retrieval models due to the need for handling legal issues in multiple languages. Building on the Belgian Statutory Article Retrieval Dataset (BSARD) in French, we introduce the bilingual version of this dataset, bBSARD. The dataset contains parallel Belgian statutory articles in both French and Dutch, along with legal questions from BSARD and their Dutch translation. Using bBSARD, we conduct extensive benchmarking of retrieval models available for Dutch and French. Our benchmarking setup includes lexical models, zero-shot dense models, and fine-tuned small foundation models. Our experiments show that BM25 remains a competitive baseline compared to many zero-shot dense models in both languages. We also observe that while proprietary models outperform open alternatives in the zero-shot setting, they can be matched or surpassed by fine-tuning small language-specific models. Our dataset and evaluation code are publicly available.
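The abstract names BM25 as the lexical baseline. The following is a minimal sketch of BM25 ranking, not the authors' evaluation code: the whitespace tokenizer, the parameter values k1=1.5 and b=0.75, the smoothed IDF variant, and the toy English "articles" and question are all assumptions made for illustration and are not taken from bBSARD.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; a real setup would
    # likely add language-specific stemming or stop-word handling.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one BM25 score per document for the given query."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in tokenized for term in set(d))

    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Smoothed IDF (the "+ 1" keeps it non-negative).
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical toy corpus and question, invented for this sketch.
articles = [
    "the tenant must pay the rent on the agreed date",
    "the employer must pay wages to the employee",
    "a marriage is dissolved by divorce or death",
]
question = "when must the tenant pay rent"

scores = bm25_scores(question, articles)
best = max(range(len(articles)), key=scores.__getitem__)
print(best, scores)
```

Here the first toy article shares the most query terms with the question and is ranked first, while the third shares none and scores zero. Dense retrievers replace this term-overlap score with a similarity between learned embeddings, which is the comparison the abstract refers to.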

Related papers

M-Prometheus: A Suite of Open Multilingual LLM Judges [64.22940792713713]
We introduce M-Prometheus, a suite of open-weight LLM judges that can provide both direct assessment and pairwise comparison feedback on multilingual outputs. M-Prometheus models outperform state-of-the-art open LLM judges on multilingual reward benchmarks spanning more than 20 languages, as well as on literary machine translation (MT) evaluation covering 4 language pairs.
arXiv Detail & Related papers (2025-04-07T11:37:26Z)
BEIR-NL: Zero-shot Information Retrieval Benchmark for the Dutch Language [3.3990813930813997]
We introduce BEIR-NL by automatically translating the publicly accessible BEIR datasets into Dutch. We evaluate a wide range of multilingual dense ranking and reranking models, as well as the lexical BM25 method.
arXiv Detail & Related papers (2024-12-11T12:15:57Z)
Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models [62.91524967852552]
Large language models (LLMs) are typically multilingual due to pretraining on diverse multilingual corpora. But can these models relate corresponding concepts across languages, i.e., be crosslingual? This study evaluates state-of-the-art LLMs on inherently crosslingual tasks.
arXiv Detail & Related papers (2024-06-23T15:15:17Z)
CroissantLLM: A Truly Bilingual French-English Language Model [42.03897426049679]
We introduce CroissantLLM, a 1.3B language model pretrained on a set of 3T English and French tokens. We pioneer the approach of training an intrinsically bilingual model with a 1:1 English-to-French pretraining data ratio. To assess performance outside of English, we craft a novel benchmark, FrenchBench.
arXiv Detail & Related papers (2024-02-01T17:17:55Z)
Language Resources for Dutch Large Language Modelling [0.0]
We introduce two fine-tuned variants of the Llama 2 13B model. We provide a leaderboard to keep track of the performance of (Dutch) models on a number of generation tasks.
arXiv Detail & Related papers (2023-12-20T09:06:06Z)
The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants [80.4837840962273]
We present Belebele, a dataset spanning 122 language variants. This dataset enables the evaluation of text models in high-, medium-, and low-resource languages.
arXiv Detail & Related papers (2023-08-31T17:43:08Z)
Cross-Lingual NER for Financial Transaction Data in Low-Resource Languages [70.25418443146435]
We propose an efficient modeling framework for cross-lingual named entity recognition in semi-structured text data. We employ two independent datasets of SMSs in English and Arabic, each carrying semi-structured banking transaction information. With access to only 30 labeled samples, our model can generalize the recognition of merchants, amounts, and other fields from English to Arabic.
arXiv Detail & Related papers (2023-07-16T00:45:42Z)
DUMB: A Benchmark for Smart Evaluation of Dutch Models [23.811515104842826]
We introduce the Dutch Model Benchmark: DUMB. The benchmark includes a diverse set of datasets for low-, medium- and high-resource tasks. Relative Error Reduction (RER) compares the DUMB performance of language models to a strong baseline. Highest performance is achieved by DeBERTaV3 (large), XLM-R (large) and mDeBERTaV3 (base).
arXiv Detail & Related papers (2023-05-22T13:27:37Z)
MultiLegalSBD: A Multilingual Legal Sentence Boundary Detection Dataset [0.0]
Sentence Boundary Detection (SBD) is one of the foundational building blocks of Natural Language Processing (NLP). We curated a diverse multilingual legal dataset consisting of over 130'000 annotated sentences in 6 languages. We trained and tested monolingual and multilingual models based on CRF, BiLSTM-CRF, and transformers, demonstrating state-of-the-art performance.
arXiv Detail & Related papers (2023-05-02T05:52:03Z)
Multi-lingual Evaluation of Code Generation Models [82.7357812992118]
We present new benchmarks for evaluating code generation models: MBXP, Multilingual HumanEval, and MathQA-X. These datasets cover over 10 programming languages. We are able to assess the performance of code generation models in a multi-lingual fashion.
arXiv Detail & Related papers (2022-10-26T17:17:06Z)
PAGnol: An Extra-Large French Generative Model [53.40189314359048]
We introduce PAGnol, a collection of French GPT models. Using scaling laws, we efficiently train PAGnol-XL with the same computational budget as CamemBERT.
arXiv Detail & Related papers (2021-10-16T11:44:23Z)
