Bilingual BSARD: Extending Statutory Article Retrieval to Dutch
- URL: http://arxiv.org/abs/2412.07462v1
- Date: Tue, 10 Dec 2024 12:31:33 GMT
- Title: Bilingual BSARD: Extending Statutory Article Retrieval to Dutch
- Authors: Ehsan Lotfi, Nikolay Banar, Nerses Yuzbashyan, Walter Daelemans,
- Abstract summary: This dataset contains parallel Belgian statutory articles in both French and Dutch, along with legal questions from BSARD and their Dutch translation.
We conduct extensive benchmarking of retrieval models available for Dutch and French.
Our experiments show that BM25 remains a competitive baseline compared to many zero-shot dense models in both languages.
- Score: 3.11149191866066
- License:
- Abstract: Statutory article retrieval plays a crucial role in making legal information more accessible to both laypeople and legal professionals. Multilingual countries like Belgium present unique challenges for retrieval models due to the need for handling legal issues in multiple languages. Building on the Belgian Statutory Article Retrieval Dataset (BSARD) in French, we introduce the bilingual version of this dataset, bBSARD. The dataset contains parallel Belgian statutory articles in both French and Dutch, along with legal questions from BSARD and their Dutch translation. Using bBSARD, we conduct extensive benchmarking of retrieval models available for Dutch and French. Our benchmarking setup includes lexical models, zero-shot dense models, and fine-tuned small foundation models. Our experiments show that BM25 remains a competitive baseline compared to many zero-shot dense models in both languages. We also observe that while proprietary models outperform open alternatives in the zero-shot setting, they can be matched or surpassed by fine-tuning small language-specific models. Our dataset and evaluation code are publicly available.
Related papers
- BEIR-NL: Zero-shot Information Retrieval Benchmark for the Dutch Language [3.3990813930813997]
We introduce BEIR-NL by automatically translating the publicly accessible BEIR datasets into Dutch.
We evaluate a wide range of multilingual dense ranking and reranking models, as well as the lexical BM25 method.
arXiv Detail & Related papers (2024-12-11T12:15:57Z) - CroissantLLM: A Truly Bilingual French-English Language Model [42.03897426049679]
We introduce CroissantLLM, a 1.3B language model pretrained on a set of 3T English and French tokens.
We pioneer the approach of training an intrinsically bilingual model with a 1:1 English-to-French pretraining data ratio.
To assess performance outside of English, we craft a novel benchmark, FrenchBench.
arXiv Detail & Related papers (2024-02-01T17:17:55Z) - Language Resources for Dutch Large Language Modelling [0.0]
We introduce two fine-tuned variants of the Llama 2 13B model.
We provide a leaderboard to keep track of the performance of (Dutch) models on a number of generation tasks.
arXiv Detail & Related papers (2023-12-20T09:06:06Z) - The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants [80.4837840962273]
We present Belebele, a dataset spanning 122 language variants.
This dataset enables the evaluation of text models in high-, medium-, and low-resource languages.
arXiv Detail & Related papers (2023-08-31T17:43:08Z) - Cross-Lingual NER for Financial Transaction Data in Low-Resource
Languages [70.25418443146435]
We propose an efficient modeling framework for cross-lingual named entity recognition in semi-structured text data.
We employ two independent datasets of SMSs in English and Arabic, each carrying semi-structured banking transaction information.
With access to only 30 labeled samples, our model can generalize the recognition of merchants, amounts, and other fields from English to Arabic.
arXiv Detail & Related papers (2023-07-16T00:45:42Z) - BUFFET: Benchmarking Large Language Models for Few-shot Cross-lingual
Transfer [81.5984433881309]
We introduce BUFFET, which unifies 15 diverse tasks across 54 languages in a sequence-to-sequence format.
BUFFET is designed to establish a rigorous and equitable evaluation framework for few-shot cross-lingual transfer.
Our findings reveal significant room for improvement in few-shot in-context cross-lingual transfer.
arXiv Detail & Related papers (2023-05-24T08:06:33Z) - DUMB: A Benchmark for Smart Evaluation of Dutch Models [23.811515104842826]
We introduce the Dutch Model Benchmark: DUMB. The benchmark includes a diverse set of datasets for low-, medium- and high-resource tasks.
Relative Error Reduction (RER) compares the DUMB performance of language models to a strong baseline.
Highest performance is achieved by DeBERTaV3 (large), XLM-R (large) and mDeBERTaV3 (base)
arXiv Detail & Related papers (2023-05-22T13:27:37Z) - MultiLegalSBD: A Multilingual Legal Sentence Boundary Detection Dataset [0.0]
Sentence Boundary Detection (SBD) is one of the foundational building blocks of Natural Language Processing (NLP)
We curated a diverse multilingual legal dataset consisting of over 130'000 annotated sentences in 6 languages.
We trained and tested monolingual and multilingual models based on CRF, BiLSTM-CRF, and transformers, demonstrating state-of-the-art performance.
arXiv Detail & Related papers (2023-05-02T05:52:03Z) - Multi-lingual Evaluation of Code Generation Models [82.7357812992118]
We present new benchmarks on evaluation code generation models: MBXP and Multilingual HumanEval, and MathQA-X.
These datasets cover over 10 programming languages.
We are able to assess the performance of code generation models in a multi-lingual fashion.
arXiv Detail & Related papers (2022-10-26T17:17:06Z) - PAGnol: An Extra-Large French Generative Model [53.40189314359048]
We introduce PAGnol, a collection of French GPT models.
Using scaling laws, we efficiently train PAGnol-XL with the same computational budget as CamemBERT.
arXiv Detail & Related papers (2021-10-16T11:44:23Z) - Towards Making the Most of Context in Neural Machine Translation [112.9845226123306]
We argue that previous research did not make a clear use of the global context.
We propose a new document-level NMT framework that deliberately models the local context of each sentence.
arXiv Detail & Related papers (2020-02-19T03:30:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.