Llama 2: Open Foundation and Fine-Tuned Chat Models
- URL: http://arxiv.org/abs/2307.09288v2
- Date: Wed, 19 Jul 2023 17:08:59 GMT
- Title: Llama 2: Open Foundation and Fine-Tuned Chat Models
- Authors: Hugo Touvron and Louis Martin and Kevin Stone and Peter Albert and
Amjad Almahairi and Yasmine Babaei and Nikolay Bashlykov and Soumya Batra and
Prajjwal Bhargava and Shruti Bhosale and Dan Bikel and Lukas Blecher and
Cristian Canton Ferrer and Moya Chen and Guillem Cucurull and David Esiobu
and Jude Fernandes and Jeremy Fu and Wenyin Fu and Brian Fuller and Cynthia
Gao and Vedanuj Goswami and Naman Goyal and Anthony Hartshorn and Saghar
Hosseini and Rui Hou and Hakan Inan and Marcin Kardas and Viktor Kerkez and
Madian Khabsa and Isabel Kloumann and Artem Korenev and Punit Singh Koura and
Marie-Anne Lachaux and Thibaut Lavril and Jenya Lee and Diana Liskovich and
Yinghai Lu and Yuning Mao and Xavier Martinet and Todor Mihaylov and Pushkar
Mishra and Igor Molybog and Yixin Nie and Andrew Poulton and Jeremy
Reizenstein and Rashi Rungta and Kalyan Saladi and Alan Schelten and Ruan
Silva and Eric Michael Smith and Ranjan Subramanian and Xiaoqing Ellen Tan
and Binh Tang and Ross Taylor and Adina Williams and Jian Xiang Kuan and
Puxin Xu and Zheng Yan and Iliyan Zarov and Yuchen Zhang and Angela Fan and
Melanie Kambadur and Sharan Narang and Aurelien Rodriguez and Robert Stojnic
and Sergey Edunov and Thomas Scialom
- Abstract summary: We develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs).
Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases.
- Score: 65.43397761706336
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we develop and release Llama 2, a collection of pretrained and
fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70
billion parameters. Our fine-tuned LLMs, called Llama 2-Chat, are optimized for
dialogue use cases. Our models outperform open-source chat models on most
benchmarks we tested, and based on our human evaluations for helpfulness and
safety, may be a suitable substitute for closed-source models. We provide a
detailed description of our approach to fine-tuning and safety improvements of
Llama 2-Chat in order to enable the community to build on our work and
contribute to the responsible development of LLMs.
Related papers
- Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization [65.64108848398696]
We introduce a preference optimization process to enhance the multimodal reasoning capabilities of MLLMs.
We develop a simple yet effective method, termed Mixed Preference Optimization (MPO), which boosts multimodal CoT performance.
Our model, InternVL2-8B-MPO, achieves an accuracy of 67.0 on MathVista, outperforming InternVL2-8B by 8.7 points and achieving performance comparable to the 10x larger InternVL2-76B.
arXiv Detail & Related papers (2024-11-15T18:59:27Z) - FuseChat: Knowledge Fusion of Chat Models [35.90957231731829]
We propose a new framework for the knowledge fusion of chat LLMs through two main stages, resulting in FuseChat.
We implement and validate FuseChat using six prominent chat LLMs with diverse architectures and scales, including OpenChat-3.5-7B, Starling-LM-7B-alpha, NH2-SOLAR-10.7B, InternLM2-Chat-20B, Mixtral-8x7B-Instruct, and Qwen-1.5-Chat-72B.
arXiv Detail & Related papers (2024-08-15T07:37:24Z) - Retrieval-augmented code completion for local projects using large language models [0.0]
We focus on using large language models (LLMs) with around 160 million parameters that are suitable for local execution and augmentation with retrieval from local projects.
We train two models based on the transformer architecture, the generative model GPT-2 and the retrieval-adapted RETRO model, on open-source Python files.
We improve our models' performance with In-context retrieval-augmented generation, which retrieves code snippets based on the Jaccard similarity of tokens.
arXiv Detail & Related papers (2024-08-09T12:26:57Z) - Gemma: Open Models Based on Gemini Research and Technology [128.57714343844074]
This work introduces Gemma, a family of lightweight, state-of-the-art open models built from the research and technology used to create Gemini models.
Gemma models demonstrate strong performance across academic benchmarks for language understanding, reasoning, and safety.
arXiv Detail & Related papers (2024-03-13T06:59:16Z) - Supervised Knowledge Makes Large Language Models Better In-context Learners [94.89301696512776]
Large Language Models (LLMs) exhibit emerging in-context learning abilities through prompt engineering.
The challenge of improving the generalizability and factuality of LLMs in natural language understanding and question answering remains under-explored.
We propose a framework that enhances the reliability of LLMs as it: 1) generalizes to out-of-distribution data, 2) elucidates how LLMs benefit from discriminative models, and 3) minimizes hallucinations in generative tasks.
arXiv Detail & Related papers (2023-12-26T07:24:46Z) - YAYI 2: Multilingual Open-Source Large Language Models [53.92832054643197]
We propose YAYI 2, including both base and chat models, with 30 billion parameters.
YAYI 2 is pre-trained from scratch on a multilingual corpus which contains 2.65 trillion tokens filtered by our pre-training data processing pipeline.
The base model is aligned with human values through supervised fine-tuning with millions of instructions and reinforcement learning from human feedback.
arXiv Detail & Related papers (2023-12-22T17:34:47Z) - TigerBot: An Open Multilingual Multitask LLM [7.413477227090228]
We release and introduce the TigerBot family of large language models (LLMs).
We develop our models starting from Llama-2 and BLOOM, and push the boundary further in data, training algorithm, infrastructure, and application tools.
The TigerBot model family also achieves leading performance in major academic and industrial benchmarks and leaderboards.
arXiv Detail & Related papers (2023-12-14T07:05:42Z) - Unlocking the Potential of User Feedback: Leveraging Large Language Model as User Simulator to Enhance Dialogue System [65.93577256431125]
We propose an alternative approach called User-Guided Response Optimization (UGRO) that combines an LLM with a smaller task-oriented dialogue (TOD) model.
This approach uses an LLM as an annotation-free user simulator to assess dialogue responses, combining it with smaller fine-tuned end-to-end TOD models.
Our approach outperforms previous state-of-the-art (SOTA) results.
arXiv Detail & Related papers (2023-06-16T13:04:56Z) - Chain-of-Thought Hub: A Continuous Effort to Measure Large Language Models' Reasoning Performance [35.38549845444575]
This work proposes Chain-of-Thought Hub, an open-source evaluation suite on the multi-step reasoning capabilities of large language models.
arXiv Detail & Related papers (2023-05-26T23:46:42Z)