TigerBot: An Open Multilingual Multitask LLM
- URL: http://arxiv.org/abs/2312.08688v2
- Date: Fri, 15 Dec 2023 01:42:20 GMT
- Title: TigerBot: An Open Multilingual Multitask LLM
- Authors: Ye Chen, Wei Cai, Liangmin Wu, Xiaowei Li, Zhanxuan Xin, and Cong Fu
- Abstract summary: We release and introduce the TigerBot family of large language models (LLMs).
We develop our models starting from Llama-2 and BLOOM, and push the boundary further in data, training algorithms, infrastructure, and application tools.
The TigerBot model family also achieves leading performance on major academic and industrial benchmarks and leaderboards.
- Score: 7.413477227090228
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We release and introduce the TigerBot family of large language models (LLMs),
consisting of base and chat models with 7, 13, 70, and 180 billion parameters.
We develop our models starting from Llama-2 and BLOOM, and push the boundary
further in data, training algorithms, infrastructure, and application tools. Our
models yield meaningful performance gains over SOTA open-source models such as
Llama-2: specifically, a 6% gain in English and a 20% gain in Chinese. The
TigerBot model family also achieves leading performance on major academic and
industrial benchmarks and leaderboards. We believe that TigerBot represents just
a snapshot of the lightning-fast progress in the open-source LLM community.
Therefore, we are thrilled to give back by publicly releasing our models and
reporting the approach behind them, with additional emphasis on building SOTA
LLMs in a democratized way and making LLMs useful in real-world applications.
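Since the abstract emphasizes public release and real-world use, here is a minimal sketch of loading one of the released chat checkpoints with Hugging Face transformers; the repository name and generation settings are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch (not from the paper): loading an assumed TigerBot chat
# checkpoint from the Hugging Face Hub and running a single prompt.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TigerResearch/tigerbot-13b-chat"  # assumed repository name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Summarize the difference between a base model and a chat model."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```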
Related papers
- MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series [86.31735321970481]
We open-source MAP-Neo, a bilingual language model with 7B parameters trained from scratch on 4.5T high-quality tokens.
Our MAP-Neo is the first fully open-sourced bilingual LLM with performance comparable to existing state-of-the-art LLMs.
arXiv Detail & Related papers (2024-05-29T17:57:16Z)
- Empirical Studies of Parameter Efficient Methods for Large Language Models of Code and Knowledge Transfer to R [1.9799527196428242]
Large Language Models (LLMs) have gained a lot of attention in the Software Engineering (SE) community.
In this work, we empirically study PEFT methods, LoRA and Compacter, on CodeT5 and CodeLlama.
We assess their performance against fully fine-tuned models, whether they can be used for knowledge transfer from natural-language models to code, and their ability to adapt the learned knowledge to an unseen language.
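As a rough illustration of the kind of parameter-efficient setup this entry studies, here is a minimal LoRA sketch using the Hugging Face peft library on a CodeT5 checkpoint; the base model, target modules, and hyperparameters are assumptions for illustration, not the paper's configuration.

```python
# Minimal LoRA sketch (illustrative assumptions, not the paper's exact setup):
# wrap a CodeT5 checkpoint with a LoRA adapter so that only a small fraction of
# parameters is trained.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_id = "Salesforce/codet5-base"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForSeq2SeqLM.from_pretrained(base_id)

lora_config = LoraConfig(
    r=8,                        # rank of the low-rank update matrices
    lora_alpha=16,              # scaling factor
    lora_dropout=0.05,
    target_modules=["q", "v"],  # T5 attention projections
    task_type="SEQ_2_SEQ_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of parameters
```

Training then proceeds with a standard Trainer or custom loop; only the adapter weights are updated, which is what makes the method parameter-efficient.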
arXiv Detail & Related papers (2024-03-16T03:12:45Z)
- YAYI 2: Multilingual Open-Source Large Language Models [53.92832054643197]
We propose YAYI 2, including both base and chat models, with 30 billion parameters.
YAYI 2 is pre-trained from scratch on a multilingual corpus which contains 2.65 trillion tokens filtered by our pre-training data processing pipeline.
The base model is aligned with human values through supervised fine-tuning with millions of instructions and reinforcement learning from human feedback.
arXiv Detail & Related papers (2023-12-22T17:34:47Z)
- FinGPT: Large Generative Models for a Small Language [48.46240937758779]
We create large language models (LLMs) for Finnish, a language spoken by less than 0.1% of the world population.
We train seven monolingual models from scratch (186M to 13B parameters) dubbed FinGPT.
We continue the pretraining of the multilingual BLOOM model on a mix of its original training data and Finnish, resulting in a 176 billion parameter model we call BLUUMI.
arXiv Detail & Related papers (2023-11-03T08:05:04Z)
- Baichuan 2: Open Large-scale Language Models [51.56361715162972]
We present Baichuan 2, a series of large-scale multilingual language models with 7 billion and 13 billion parameters, trained from scratch on 2.6 trillion tokens.
Baichuan 2 matches or outperforms other open-source models of similar size on public benchmarks like MMLU, CMMLU, GSM8K, and HumanEval.
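For context on how such public-benchmark numbers are commonly reproduced, here is a minimal sketch using EleutherAI's lm-evaluation-harness; the checkpoint name and evaluation settings are illustrative assumptions, and the paper's own evaluation pipeline may differ.

```python
# Minimal sketch (illustrative, not the paper's pipeline): scoring an assumed
# Baichuan 2 checkpoint on MMLU with EleutherAI's lm-evaluation-harness.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=baichuan-inc/Baichuan2-7B-Base,trust_remote_code=True",
    tasks=["mmlu"],
    num_fewshot=5,   # MMLU is conventionally reported 5-shot
    batch_size=8,
)
print(results["results"])  # per-task accuracy numbers
```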
arXiv Detail & Related papers (2023-09-19T04:13:22Z)
- Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback [61.83548032416181]
We present Okapi, the first system with instruction-tuned LLMs based on RLHF for multiple languages.
Okapi introduces instruction and response-ranked data in 26 diverse languages to facilitate the experiments and development of future multilingual LLM research.
arXiv Detail & Related papers (2023-07-29T18:01:46Z)
- Llama 2: Open Foundation and Fine-Tuned Chat Models [65.43397761706336]
We develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs).
Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases.
arXiv Detail & Related papers (2023-07-18T14:31:57Z)