MMM: Multilingual Mutual Reinforcement Effect Mix Datasets & Test with Open-domain Information Extraction Large Language Models
- URL: http://arxiv.org/abs/2407.10953v2
- Date: Sat, 17 Aug 2024 06:59:05 GMT
- Title: MMM: Multilingual Mutual Reinforcement Effect Mix Datasets & Test with Open-domain Information Extraction Large Language Models
- Authors: Chengguang Gan, Qingyu Yin, Xinyang He, Hanjun Wei, Yunhao Liang, Younghun Lim, Shijian Wang, Hexiang Huang, Qinghao Zhang, Shiwen Ni, Tatsunori Mori,
- Abstract summary: We introduce a Multilingual MRE mix dataset (MMM) that encompasses 21 sub-datasets in English, Japanese, and Chinese.
We also propose a method for dataset translation assisted by Large Language Models (LLMs)
We develop a unified input-output framework to train an Open-domain Information Extraction Large Language Model (OIELLM)
- Score: 10.242002062961083
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Mutual Reinforcement Effect (MRE) represents a promising avenue in information extraction and multitasking research. Nevertheless, its applicability has been constrained due to the exclusive availability of MRE mix datasets in Japanese, thereby limiting comprehensive exploration by the global research community. To address this limitation, we introduce a Multilingual MRE mix dataset (MMM) that encompasses 21 sub-datasets in English, Japanese, and Chinese. In this paper, we also propose a method for dataset translation assisted by Large Language Models (LLMs), which significantly reduces the manual annotation time required for dataset construction by leveraging LLMs to translate the original Japanese datasets. Additionally, we have enriched the dataset by incorporating open-domain Named Entity Recognition (NER) and sentence classification tasks. Utilizing this expanded dataset, we developed a unified input-output framework to train an Open-domain Information Extraction Large Language Model (OIELLM). The OIELLM model demonstrates the capability to effectively process novel MMM datasets, exhibiting significant improvements in performance.
Related papers
- Web-Scale Visual Entity Recognition: An LLM-Driven Data Approach [56.55633052479446]
Web-scale visual entity recognition presents significant challenges due to the lack of clean, large-scale training data.
We propose a novel methodology to curate such a dataset, leveraging a multimodal large language model (LLM) for label verification, metadata generation, and rationale explanation.
Experiments demonstrate that models trained on this automatically curated data achieve state-of-the-art performance on web-scale visual entity recognition tasks.
arXiv Detail & Related papers (2024-10-31T06:55:24Z) - Few-Shot Joint Multimodal Entity-Relation Extraction via Knowledge-Enhanced Cross-modal Prompt Model [16.03304915788997]
Joint Multimodal Entity-Relation Extraction (JMERE) is a challenging task that aims to extract entities and their relations from text-image pairs in social media posts.
Existing methods for JMERE require large amounts of labeled data.
We introduce the textbfKnowledge-textbfEnhanced textbfCross-modal textbfPrompt textbfModel.
arXiv Detail & Related papers (2024-10-18T07:14:54Z) - Towards Zero-Shot Multimodal Machine Translation [64.9141931372384]
We propose a method to bypass the need for fully supervised data to train multimodal machine translation systems.
Our method, called ZeroMMT, consists in adapting a strong text-only machine translation (MT) model by training it on a mixture of two objectives.
To prove that our method generalizes to languages with no fully supervised training data available, we extend the CoMMuTE evaluation dataset to three new languages: Arabic, Russian and Chinese.
arXiv Detail & Related papers (2024-07-18T15:20:31Z) - 3AM: An Ambiguity-Aware Multi-Modal Machine Translation Dataset [90.95948101052073]
We introduce 3AM, an ambiguity-aware MMT dataset comprising 26,000 parallel sentence pairs in English and Chinese.
Our dataset is specifically designed to include more ambiguity and a greater variety of both captions and images than other MMT datasets.
Experimental results show that MMT models trained on our dataset exhibit a greater ability to exploit visual information than those trained on other MMT datasets.
arXiv Detail & Related papers (2024-04-29T04:01:30Z) - GeMQuAD : Generating Multilingual Question Answering Datasets from Large Language Models using Few Shot Learning [4.8838210812204235]
In this paper, we propose GeMQuAD - a semi-supervised learning approach, applied to a dataset generated through ICL with just one example in the target language.
We iteratively identify high-quality data to enhance model performance, especially for low-resource multilingual setting.
Our framework outperforms the machine translation-augmented model by 0.22/1.68 F1/EM points for Hindi and 0.82/1.37 F1/EM points for Spanish on the MLQA dataset.
arXiv Detail & Related papers (2024-04-14T06:55:42Z) - ExaRanker-Open: Synthetic Explanation for IR using Open-Source LLMs [60.81649785463651]
We introduce ExaRanker-Open, where we adapt and explore the use of open-source language models to generate explanations.
Our findings reveal that incorporating explanations consistently enhances neural rankers, with benefits escalating as the LLM size increases.
arXiv Detail & Related papers (2024-02-09T11:23:14Z) - Curated LLM: Synergy of LLMs and Data Curation for tabular augmentation in low-data regimes [57.62036621319563]
We introduce CLLM, which leverages the prior knowledge of Large Language Models (LLMs) for data augmentation in the low-data regime.
We demonstrate the superior performance of CLLM in the low-data regime compared to conventional generators.
arXiv Detail & Related papers (2023-12-19T12:34:46Z) - GIELLM: Japanese General Information Extraction Large Language Model
Utilizing Mutual Reinforcement Effect [0.0]
We introduce the General Information Extraction Large Language Model (GIELLM)
It integrates text Classification, Sentiment Analysis, Named Entity Recognition, Relation Extraction, and Event Extraction using a uniform input-output schema.
This innovation marks the first instance of a model simultaneously handling such a diverse array of IE subtasks.
arXiv Detail & Related papers (2023-11-12T13:30:38Z) - Unified Model Learning for Various Neural Machine Translation [63.320005222549646]
Existing machine translation (NMT) studies mainly focus on developing dataset-specific models.
We propose a versatile'' model, i.e., the Unified Model Learning for NMT (UMLNMT) that works with data from different tasks.
OurNMT results in substantial improvements over dataset-specific models with significantly reduced model deployment costs.
arXiv Detail & Related papers (2023-05-04T12:21:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.