SandboxAQ's submission to MRL 2024 Shared Task on Multi-lingual Multi-task Information Retrieval
- URL: http://arxiv.org/abs/2410.21501v1
- Date: Mon, 28 Oct 2024 20:15:45 GMT
- Title: SandboxAQ's submission to MRL 2024 Shared Task on Multi-lingual Multi-task Information Retrieval
- Authors: Isidora Chara Tourni, Sayontan Ghosh, Brenda Miao, Constantijn van der Poel,
- Abstract summary: This paper explores the problems of Question Answering (QA) and Named Entity Recognition (NER) in five diverse languages.
We tested five Large Language Models with various prompting methods, including zero-shot, chain-of-thought reasoning, and translation techniques.
Our results show that while some models consistently outperform others, their effectiveness varies significantly across tasks and languages.
- Score: 1.2629889435114405
- Abstract: This paper explores the problems of Question Answering (QA) and Named Entity Recognition (NER) in five diverse languages. We tested five Large Language Models with various prompting methods, including zero-shot, chain-of-thought reasoning, and translation techniques. Our results show that while some models consistently outperform others, their effectiveness varies significantly across tasks and languages. We saw that advanced prompting techniques generally improved QA performance but had mixed results for NER; and we observed that language difficulty patterns differed between tasks. Our findings highlight the need for task-specific approaches in multilingual NLP and suggest that current models may develop different linguistic competencies for different tasks.
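The three prompting families named in the abstract (zero-shot, chain-of-thought, and translation-based) can be illustrated with a minimal sketch. The template wording, the function names, and the `translate` callable are all hypothetical and chosen for illustration only; they are not the prompts or code used in the paper.

```python
from typing import Callable


def zero_shot_prompt(question: str, context: str) -> str:
    """Ask for the answer directly, with no examples or reasoning steps."""
    return f"Context: {context}\nQuestion: {question}\nAnswer:"


def chain_of_thought_prompt(question: str, context: str) -> str:
    """Ask the model to reason before answering (hypothetical wording)."""
    return (
        f"Context: {context}\nQuestion: {question}\n"
        "Reason step by step, then give the final answer "
        "on a line starting with 'Answer:'."
    )


def translate_then_answer_prompt(
    question: str, context: str, translate: Callable[[str], str]
) -> str:
    """Translate the inputs first, then build a zero-shot prompt.

    `translate` is an assumed stand-in for any translation step,
    such as a machine translation system or another model call.
    """
    return zero_shot_prompt(translate(question), translate(context))
```

Any function that maps a string to a string can be passed as `translate`, so the translation step can be swapped without changing the prompt-building code.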
Related papers
- What Is Missing in Multilingual Visual Reasoning and How to Fix It [64.47951359580556]
We evaluate NLP models' multilingual, multimodal capabilities by testing on a visual reasoning task.
Proprietary systems like GPT-4V currently obtain the best performance on this task, while open models lag behind.
Our interventions achieve the best open performance on this task in a zero-shot setting, boosting open model LLaVA by 13.4%.
arXiv Detail & Related papers (2024-03-03T05:45:27Z)
- Improving Factuality and Reasoning in Language Models through Multiagent Debate [95.10641301155232]
We present a complementary approach to improve language responses where multiple language model instances propose and debate their individual responses and reasoning processes over multiple rounds to arrive at a common final answer.
Our findings indicate that this approach significantly enhances mathematical and strategic reasoning across a number of tasks.
Our approach may be directly applied to existing black-box models and uses identical procedure and prompts for all tasks we investigate.
arXiv Detail & Related papers (2023-05-23T17:55:11Z)
- Multilingual Large Language Models Are Not (Yet) Code-Switchers [41.47534626749588]
Large Language Models (LLMs) have recently shown great capabilities in a wide range of tasks.
The practice of alternating languages within an utterance remains relatively uncharted.
We argue that current "multilingualism" in LLMs does not inherently imply proficiency with code-switching texts.
arXiv Detail & Related papers (2023-05-23T16:50:48Z)
- Not All Languages Are Created Equal in LLMs: Improving Multilingual Capability by Cross-Lingual-Thought Prompting [123.16452714740106]
Large language models (LLMs) demonstrate impressive multilingual capability, but their performance varies substantially across different languages.
We introduce a simple yet effective method called cross-lingual-thought prompting (XLT).
XLT is a generic template prompt that stimulates cross-lingual and logical reasoning skills to enhance task performance across languages.
arXiv Detail & Related papers (2023-05-11T17:44:17Z)
- Bridging the Language Gap: Knowledge Injected Multilingual Question Answering [19.768708263635176]
We propose a generalized cross-lingual transfer framework to enhance the model's ability to understand different languages.
Experimental results on the real-world MLQA dataset demonstrate that the proposed method improves performance by a large margin.
arXiv Detail & Related papers (2023-04-06T15:41:25Z)
- Delving Deeper into Cross-lingual Visual Question Answering [115.16614806717341]
We show that simple modifications to the standard training setup can substantially reduce the transfer gap to monolingual English performance.
We analyze cross-lingual VQA across different question types of varying complexity for different multilingual multimodal Transformers.
arXiv Detail & Related papers (2022-02-15T18:22:18Z)
- Meta-Learning for Effective Multi-task and Multilingual Modelling [23.53779501937046]
We propose a meta-learning approach to learn the interactions between both tasks and languages.
We present experiments on five different tasks and six different languages from the XTREME multilingual benchmark dataset.
arXiv Detail & Related papers (2021-01-25T19:30:26Z)
- CoSDA-ML: Multi-Lingual Code-Switching Data Augmentation for Zero-Shot Cross-Lingual NLP [68.2650714613869]
We propose a data augmentation framework to generate multi-lingual code-switching data to fine-tune mBERT.
Compared with the existing work, our method does not rely on bilingual sentences for training, and requires only one training process for multiple target languages.
arXiv Detail & Related papers (2020-06-11T13:15:59Z)
- XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization [128.37244072182506]
The Cross-lingual TRansfer Evaluation of Multilingual Encoders (XTREME) benchmark evaluates the cross-lingual generalization capabilities of multilingual representations across 40 languages and 9 tasks.
We demonstrate that while models tested on English reach human performance on many tasks, there is still a sizable gap in the performance of cross-lingually transferred models.
arXiv Detail & Related papers (2020-03-24T19:09:37Z)