Related papers: No Language is an Island: Unifying Chinese and English in Financial Large Language Models, Instruction Data, and Benchmarks

No Language is an Island: Unifying Chinese and English in Financial Large Language Models, Instruction Data, and Benchmarks

URL: http://arxiv.org/abs/2403.06249v2
Date: Tue, 16 Apr 2024 12:05:33 GMT
Title: No Language is an Island: Unifying Chinese and English in Financial Large Language Models, Instruction Data, and Benchmarks
Authors: Gang Hu, Ke Qin, Chenhan Yuan, Min Peng, Alejandro Lopez-Lira, Benyou Wang, Sophia Ananiadou, Wanlong Yu, Jimin Huang, Qianqian Xie,
Abstract summary: ICE-PIXIU uniquely integrates a spectrum of Chinese tasks, alongside translated and original English datasets. It provides unrestricted access to diverse model variants, a compilation of diverse cross-lingual and multi-modal instruction data, and an evaluation benchmark with expert annotations.
Score: 73.11935193630823
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: While the progression of Large Language Models (LLMs) has notably propelled financial analysis, their application has largely been confined to singular language realms, leaving untapped the potential of bilingual Chinese-English capacity. To bridge this chasm, we introduce ICE-PIXIU, seamlessly amalgamating the ICE-INTENT model and ICE-FLARE benchmark for bilingual financial analysis. ICE-PIXIU uniquely integrates a spectrum of Chinese tasks, alongside translated and original English datasets, enriching the breadth and depth of bilingual financial modeling. It provides unrestricted access to diverse model variants, a substantial compilation of diverse cross-lingual and multi-modal instruction data, and an evaluation benchmark with expert annotations, comprising 10 NLP tasks, 20 bilingual specific tasks, totaling 95k datasets. Our thorough evaluation emphasizes the advantages of incorporating these bilingual datasets, especially in translation tasks and utilizing original English data, enhancing both linguistic flexibility and analytical acuity in financial contexts. Notably, ICE-INTENT distinguishes itself by showcasing significant enhancements over conventional LLMs and existing financial LLMs in bilingual milieus, underscoring the profound impact of robust bilingual data on the accuracy and efficacy of financial NLP.

Related papers

MultiFinBen: A Multilingual, Multimodal, and Difficulty-Aware Benchmark for Financial LLM Evaluation [89.73542209537148]
MultiFinBen is the first multilingual and multimodal benchmark tailored to the global financial domain.<n>We introduce two novel tasks, including EnglishOCR and SpanishOCR, the first OCR-embedded financial QA tasks.<n>We propose a dynamic, difficulty-aware selection mechanism and curate a compact, balanced benchmark.
arXiv Detail & Related papers (2025-06-16T22:01:49Z)
Cross-Lingual Pitfalls: Automatic Probing Cross-Lingual Weakness of Multilingual Large Language Models [55.14276067678253]
This paper introduces a novel methodology for efficiently identifying inherent cross-lingual weaknesses in Large Language Models (LLMs)<n>We construct a new dataset of over 6,000 bilingual pairs across 16 languages using this methodology, demonstrating its effectiveness in revealing weaknesses even in state-of-the-art models.<n>Further experiments investigate the relationship between linguistic similarity and cross-lingual weaknesses, revealing that linguistically related languages share similar performance patterns.
arXiv Detail & Related papers (2025-05-24T12:31:27Z)
P-MMEval: A Parallel Multilingual Multitask Benchmark for Consistent Evaluation of LLMs [84.24644520272835]
We introduce P-MMEval, a large-scale benchmark covering effective fundamental and capability-specialized datasets.<n>P-MMEval delivers consistent language coverage across various datasets and provides parallel samples.<n>We conduct extensive experiments on representative multilingual model series to compare performances across models and tasks.
arXiv Detail & Related papers (2024-11-14T01:29:36Z)
Golden Touchstone: A Comprehensive Bilingual Benchmark for Evaluating Financial Large Language Models [22.594428755214356]
"Golden Touchstone" is the first comprehensive bilingual benchmark for financial LLMs. benchmarks include a variety of financial tasks aimed at thoroughly assessing models' language understanding and generation capabilities. We open-sourced Touchstone-GPT, a financial LLM trained through continual pre-training and financial instruction tuning.
arXiv Detail & Related papers (2024-11-09T20:09:11Z)
Improving Bilingual Capabilities of Language Models to Support Diverse Linguistic Practices in Education [3.799331337558008]
Large language models (LLMs) offer promise in generating educational content, providing instructor feedback, and reducing teacher workload on assessments. We study the effectiveness of multilingual large language models (MLLMs) across monolingual (English-only, Spanish-only) and bilingual (Spanglish) student writing.
arXiv Detail & Related papers (2024-11-06T23:16:25Z)
Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages [55.36534539177367]
This paper introduces Pangea, a multilingual multimodal large language model (MLLM) trained on a diverse 6M instruction dataset spanning 39 languages. P Pangea significantly outperforms existing open-source models in multilingual settings and diverse cultural contexts. We fully open-source our data, code, and trained checkpoints, to facilitate the development of inclusive and robust multilingual MLLMs.
arXiv Detail & Related papers (2024-10-21T16:19:41Z)
Evaluating Knowledge-based Cross-lingual Inconsistency in Large Language Models [16.942897938964638]
Large Language Models (LLMs) have shown exceptional performance in various Natural Language Processing (NLP) tasks. Despite their successes, these models often exhibit significant inconsistencies when processing the same concepts across different languages. This study focuses on three primary questions: the existence of cross-lingual inconsistencies in LLMs, the specific aspects in which these inconsistencies manifest, and the correlation between cross-lingual consistency and multilingual capabilities of LLMs.
arXiv Detail & Related papers (2024-07-01T15:11:37Z)
A Survey on Multilingual Large Language Models: Corpora, Alignment, and Bias [5.104497013562654]
We present an overview of MLLMs, covering their evolution, key techniques, and multilingual capacities. We explore widely utilized multilingual corpora for MLLMs' training and multilingual datasets oriented for downstream tasks. We discuss bias on MLLMs including its category and evaluation metrics, and summarize the existing debiasing techniques.
arXiv Detail & Related papers (2024-04-01T05:13:56Z)
X-LLaVA: Optimizing Bilingual Large Vision-Language Alignment [4.571088742209442]
We create a 91K English-Korean-Chinese multilingual, multimodal training dataset. We develop a bilingual multimodal model that exhibits excellent performance in both Korean and English.
arXiv Detail & Related papers (2024-03-18T01:14:47Z)
D\'olares or Dollars? Unraveling the Bilingual Prowess of Financial LLMs Between Spanish and English [67.48541936784501]
Tois'on de Oro is the first framework that establishes instruction datasets, finetuned LLMs, and evaluation benchmark for financial LLMs in Spanish joint with English. We construct a rigorously curated bilingual instruction dataset including over 144K Spanish and English samples from 15 datasets covering 7 tasks. We evaluate our model and existing LLMs using FLARE-ES, the first comprehensive bilingual evaluation benchmark with 21 datasets covering 9 tasks.
arXiv Detail & Related papers (2024-02-12T04:50:31Z)
Multi-EuP: The Multilingual European Parliament Dataset for Analysis of Bias in Information Retrieval [62.82448161570428]
This dataset is designed to investigate fairness in a multilingual information retrieval context. It boasts an authentic multilingual corpus, featuring topics translated into all 24 languages. It offers rich demographic information associated with its documents, facilitating the study of demographic bias.
arXiv Detail & Related papers (2023-11-03T12:29:11Z)
Extrapolating Large Language Models to Non-English by Aligning Languages [109.09051737966178]
Existing large language models show disparate capability across different languages. In this paper, we empower pre-trained LLMs on non-English languages by building semantic alignment across languages.
arXiv Detail & Related papers (2023-08-09T13:32:06Z)
Adapters for Enhanced Modeling of Multilingual Knowledge and Text [54.02078328453149]
Language models have been extended to multilingual language models (MLLMs) Knowledge graphs contain facts in an explicit triple format, which require careful curation and are only available in a few high-resource languages. We propose to enhance MLLMs with knowledge from multilingual knowledge graphs (MLKGs) so as to tackle language and knowledge graph tasks across many languages.
arXiv Detail & Related papers (2022-10-24T21:33:42Z)

This list is automatically generated from the titles and abstracts of the papers in this site.