LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs
- URL: http://arxiv.org/abs/2407.03963v2
- Date: Mon, 30 Dec 2024 07:46:43 GMT
- Title: LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs
- Authors: LLM-jp: Akiko Aizawa, Eiji Aramaki, Bowen Chen, Fei Cheng, Hiroyuki Deguchi, Rintaro Enomoto, Kazuki Fujii, Kensuke Fukumoto, Takuya Fukushima, Namgi Han, Yuto Harada, Chikara Hashimoto, Tatsuya Hiraoka, Shohei Hisada, Sosuke Hosokawa, Lu Jie, Keisuke Kamata, Teruhito Kanazawa, Hiroki Kanezashi, Hiroshi Kataoka, Satoru Katsumata, Daisuke Kawahara, Seiya Kawano, Atsushi Keyaki, Keisuke Kiryu, Hirokazu Kiyomaru, Takashi Kodama, Takahiro Kubo, Yohei Kuga, Ryoma Kumon, Shuhei Kurita, Sadao Kurohashi, Conglong Li, Taiki Maekawa, Hiroshi Matsuda, Yusuke Miyao, Kentaro Mizuki, Sakae Mizuki, Yugo Murawaki, Akim Mousterou, Ryo Nakamura, Taishi Nakamura, Kouta Nakayama, Tomoka Nakazato, Takuro Niitsuma, Jiro Nishitoba, Yusuke Oda, Hayato Ogawa, Takumi Okamoto, Naoaki Okazaki, Yohei Oseki, Shintaro Ozaki, Koki Ryu, Rafal Rzepka, Keisuke Sakaguchi, Shota Sasaki, Satoshi Sekine, Kohei Suda, Saku Sugawara, Issa Sugiura, Hiroaki Sugiyama, Hisami Suzuki, Jun Suzuki, Toyotaro Suzumura, Kensuke Tachibana, Yu Takagi, Kyosuke Takami, Koichi Takeda, Masashi Takeshita, Masahiro Tanaka, Kenjiro Taura, Arseny Tolmachev, Nobuhiro Ueda, Zhen Wan, Shuntaro Yada, Sakiko Yahata, Yuya Yamamoto, Yusuke Yamauchi, Hitomi Yanaka, Rio Yokota, Koichiro Yoshino
- Abstract summary: This paper introduces LLM-jp, a cross-organizational project for the research and development of Japanese large language models (LLMs). As of this writing, more than 1,500 participants from academia and industry are working together for this purpose. For the latest activities, visit https://llm-jp.nii.ac.jp/en/.
- Score: 90.04787972295222
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces LLM-jp, a cross-organizational project for the research and development of Japanese large language models (LLMs). LLM-jp aims to develop open-source and strong Japanese LLMs, and as of this writing, more than 1,500 participants from academia and industry are working together for this purpose. This paper presents the background of the establishment of LLM-jp, summaries of its activities, and technical reports on the LLMs developed by LLM-jp. For the latest activities, visit https://llm-jp.nii.ac.jp/en/.
Related papers
- LLM-JEPA: Large Language Models Meet Joint Embedding Predictive Architectures [50.494504099850325]
Large Language Model (LLM) pretraining, finetuning, and evaluation rely on input-space reconstruction and generative capabilities.
Yet, it has been observed in vision that embedding-space training objectives, e.g., with Joint Embedding Predictive Architectures (JEPAs), are far superior to their input-space counterparts.
arXiv Detail & Related papers (2025-09-11T03:03:57Z)
- Renaissance of Literate Programming in the Era of LLMs: Enhancing LLM-Based Code Generation in Large-Scale Projects [7.927743991760644]
Large Language Models (LLMs) have helped programmers increase efficiency through code generation, comprehension, and repair.
Their application to large-scale projects remains challenging due to complex interdependencies and the extensive size of modern codebases.
In this study, we introduce the idea of Interoperable LP (ILP), which leverages literate programming principles to enhance the development of both small-scale documents and large-scale projects with LLMs.
arXiv Detail & Related papers (2024-12-25T12:02:46Z)
- Open Llama2 Model for the Lithuanian Language [0.0]
We propose and describe the first open Llama2 large language models (LLMs) for the Lithuanian language.
We provide a brief review of open regional LLMs and detailed information on the proposed LLMs and their training process.
arXiv Detail & Related papers (2024-08-23T10:18:39Z)
- YuLan: An Open-source Large Language Model [179.59272970659677]
This paper presents the development of YuLan, a series of open-source large language models (LLMs) with 12 billion parameters.
The base model of YuLan is pre-trained on approximately 1.7T tokens derived from a diverse corpus, including massive English, Chinese, and multilingual texts.
We devise a curriculum-learning framework across the training stages, which helps LLMs learn knowledge in an easy-to-hard manner.
arXiv Detail & Related papers (2024-06-28T11:52:53Z)
- A Survey Study on the State of the Art of Programming Exercise Generation using Large Language Models [0.0]
This paper analyzes Large Language Models (LLMs) with regard to their programming exercise generation capabilities.
Through a survey study, we defined the state of the art, extracted the models' strengths and weaknesses, and proposed an evaluation matrix.
arXiv Detail & Related papers (2024-05-30T15:49:34Z)
- MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series [86.31735321970481]
We open-source MAP-Neo, a bilingual language model with 7B parameters trained from scratch on 4.5T high-quality tokens.
MAP-Neo is the first fully open-sourced bilingual LLM with performance comparable to existing state-of-the-art LLMs.
arXiv Detail & Related papers (2024-05-29T17:57:16Z)
- Large Language Models: A Survey [69.72787936480394]
Large Language Models (LLMs) have drawn a lot of attention due to their strong performance on a wide range of natural language tasks.
LLMs acquire their general-purpose language understanding and generation abilities by training billions of model parameters on massive amounts of text data.
arXiv Detail & Related papers (2024-02-09T05:37:09Z)
- Integration of Large Language Models and Federated Learning [58.9876604258949]
We propose a research framework that divides the fusion of LLMs and federated learning (FL) into three parts.
We first provide a review of the current state of research in the domain of LLMs combined with FL, including their typical applications.
We then discuss the practical applications of the combination of LLMs and FL in critical scenarios such as healthcare, finance, and education.
arXiv Detail & Related papers (2023-07-18T02:09:14Z)
- A Comprehensive Overview of Large Language Models [68.22178313875618]
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks.
This article provides an overview of the existing literature on a broad range of LLM-related concepts.
arXiv Detail & Related papers (2023-07-12T20:01:52Z)
- llm-japanese-dataset v0: Construction of Japanese Chat Dataset for Large Language Models and its Methodology [4.396516562723691]
This study constructed a Japanese chat dataset for tuning large language models (LLMs), which consists of about 8.4 million records.
The results suggest that our dataset is possibly beneficial for LLMs.
However, we also revealed some difficulties in constructing LLMs in languages other than English.
arXiv Detail & Related papers (2023-05-22T04:59:33Z)