ChatGPT's One-year Anniversary: Are Open-Source Large Language Models
Catching up?
- URL: http://arxiv.org/abs/2311.16989v4
- Date: Mon, 15 Jan 2024 09:55:05 GMT
- Title: ChatGPT's One-year Anniversary: Are Open-Source Large Language Models
Catching up?
- Authors: Hailin Chen, Fangkai Jiao, Xingxuan Li, Chengwei Qin, Mathieu Ravaut,
Ruochen Zhao, Caiming Xiong, Shafiq Joty
- Abstract summary: ChatGPT has brought a seismic shift in the entire landscape of AI.
It showed that a model could answer human questions and follow instructions on a broad panel of tasks.
While closed-source LLMs generally outperform their open-source counterparts, the progress on the latter has been rapid.
This has crucial implications not only on research but also on business.
- Score: 71.12709925152784
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Upon its release in late 2022, ChatGPT has brought a seismic shift in the
entire landscape of AI, both in research and commerce. Through
instruction-tuning a large language model (LLM) with supervised fine-tuning and
reinforcement learning from human feedback, it showed that a model could answer
human questions and follow instructions on a broad panel of tasks. Following
this success, interests in LLMs have intensified, with new LLMs flourishing at
frequent interval across academia and industry, including many start-ups
focused on LLMs. While closed-source LLMs (e.g., OpenAI's GPT, Anthropic's
Claude) generally outperform their open-source counterparts, the progress on
the latter has been rapid with claims of achieving parity or even better on
certain tasks. This has crucial implications not only on research but also on
business. In this work, on the first anniversary of ChatGPT, we provide an
exhaustive overview of this success, surveying all tasks where an open-source
LLM has claimed to be on par or better than ChatGPT.
Related papers
- LLMs are Imperfect, Then What? An Empirical Study on LLM Failures in Software Engineering [38.20696656193963]
We conducted an observational study with 22 participants using ChatGPT as a coding assistant in a non-trivial software engineering task.
We identified the cases where ChatGPT failed, their root causes, and the corresponding mitigation solutions used by users.
arXiv Detail & Related papers (2024-11-15T03:29:41Z) - MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series [86.31735321970481]
We open-source MAP-Neo, a bilingual language model with 7B parameters trained from scratch on 4.5T high-quality tokens.
Our MAP-Neo is the first fully open-sourced bilingual LLM with comparable performance compared to existing state-of-the-art LLMs.
arXiv Detail & Related papers (2024-05-29T17:57:16Z) - Large Language Models: A Survey [69.72787936480394]
Large Language Models (LLMs) have drawn a lot of attention due to their strong performance on a wide range of natural language tasks.
LLMs' ability of general-purpose language understanding and generation is acquired by training billions of model's parameters on massive amounts of text data.
arXiv Detail & Related papers (2024-02-09T05:37:09Z) - Supervised Knowledge Makes Large Language Models Better In-context Learners [94.89301696512776]
Large Language Models (LLMs) exhibit emerging in-context learning abilities through prompt engineering.
The challenge of improving the generalizability and factuality of LLMs in natural language understanding and question answering remains under-explored.
We propose a framework that enhances the reliability of LLMs as it: 1) generalizes out-of-distribution data, 2) elucidates how LLMs benefit from discriminative models, and 3) minimizes hallucinations in generative tasks.
arXiv Detail & Related papers (2023-12-26T07:24:46Z) - LLM360: Towards Fully Transparent Open-Source LLMs [89.05970416013403]
The goal of LLM360 is to support open and collaborative AI research by making the end-to-end training process transparent and reproducible by everyone.
As a first step of LLM360, we release two 7B parameter LLMs pre-trained from scratch, Amber and CrystalCoder, including their training code, data, intermediate checkpoints, and analyses.
arXiv Detail & Related papers (2023-12-11T17:39:00Z) - An Empirical Study of Instruction-tuning Large Language Models in
Chinese [32.5288378307064]
This paper makes an in-depth empirical study of instruction-tuning LLMs in Chinese, which can serve as a cookbook.
Specifically, we systematically explore the impact of LLM bases, parameter-efficient methods, instruction data types.
We also conduct experiment to study the impact of other factors, e.g., chain-of-thought data and human-value alignment.
arXiv Detail & Related papers (2023-10-11T09:18:09Z) - A Survey of GPT-3 Family Large Language Models Including ChatGPT and
GPT-4 [4.206175795966694]
Large language models (LLMs) are a special class of pretrained language models obtained by scaling model size, pretraining corpus and computation.
We refer to GPT-3 and its successor OpenAI models, including ChatGPT and GPT4, as GPT-3 family large language models (GLLMs)
arXiv Detail & Related papers (2023-10-04T16:37:05Z) - Investigating Answerability of LLMs for Long-Form Question Answering [35.41413072729483]
We focus on long-form question answering (LFQA) because it has several practical and impactful applications.
We propose a question-generation method from abstractive summaries and show that generating follow-up questions from summaries of long documents can create a challenging setting.
arXiv Detail & Related papers (2023-09-15T07:22:56Z) - Check Your Facts and Try Again: Improving Large Language Models with
External Knowledge and Automated Feedback [127.75419038610455]
Large language models (LLMs) are able to generate human-like, fluent responses for many downstream tasks.
This paper proposes a LLM-Augmenter system, which augments a black-box LLM with a set of plug-and-play modules.
arXiv Detail & Related papers (2023-02-24T18:48:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.