XGen-7B Technical Report
- URL: http://arxiv.org/abs/2309.03450v1
- Date: Thu, 7 Sep 2023 02:20:03 GMT
- Title: XGen-7B Technical Report
- Authors: Erik Nijkamp, Tian Xie, Hiroaki Hayashi, Bo Pang, Congying Xia, Chen
Xing, Jesse Vig, Semih Yavuz, Philippe Laban, Ben Krause, Senthil
Purushwalkam, Tong Niu, Wojciech Kryściński, Lidiya Murakhovs'ka,
Prafulla Kumar Choubey, Alex Fabbri, Ye Liu, Rui Meng, Lifu Tu, Meghana Bhat,
Chien-Sheng Wu, Silvio Savarese, Yingbo Zhou, Shafiq Joty, Caiming Xiong
- Abstract summary: XGen is a series of 7B-parameter models trained with sequence lengths of up to 8K on up to 1.5T tokens.
We open-source our models for both research advancements and commercial applications.
- Score: 138.71625147048377
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) have become ubiquitous across various domains,
transforming the way we interact with information and conduct research.
However, most high-performing LLMs remain confined behind proprietary walls,
hindering scientific progress. Most open-source LLMs, on the other hand, are
limited in their ability to support longer sequence lengths, which is a key
requirement for many tasks that require inference over an input context. To
address this, we have trained XGen, a series of 7B parameter models on up to 8K
sequence length for up to 1.5T tokens. We have also finetuned the XGen models
on public-domain instructional data, creating their instruction-tuned
counterparts (XGen-Inst). We open-source our models for both research
advancements and commercial applications. Our evaluation on standard benchmarks
shows that XGen models achieve comparable or better results when compared with
state-of-the-art open-source LLMs. Our targeted evaluation on long sequence
modeling tasks shows the benefits of our 8K-sequence models over 2K-sequence
open-source LLMs.
Related papers
- InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model [80.93387166769679]
We present IXC-2.5-Reward, a simple yet effective multi-modal reward model that aligns Large Vision Language Models with human preferences.
IXC-2.5-Reward achieves excellent results on the latest multi-modal reward model benchmark and shows competitive performance on text-only reward model benchmarks.
arXiv Detail & Related papers (2025-01-21T18:47:32Z)
- xLAM: A Family of Large Action Models to Empower AI Agent Systems [111.5719694445345]
We release xLAM, a series of large action models designed for AI agent tasks.
xLAM consistently delivers exceptional performance across multiple agent ability benchmarks.
arXiv Detail & Related papers (2024-09-05T03:22:22Z)
- xGen-MM (BLIP-3): A Family of Open Large Multimodal Models [157.44696790158784]
This report introduces xGen-MM, a framework for developing Large Multimodal Models (LMMs).
The framework comprises meticulously curated datasets, a training recipe, model architectures, and a resulting suite of LMMs.
Our models undergo rigorous evaluation across a range of tasks, including both single and multi-image benchmarks.
arXiv Detail & Related papers (2024-08-16T17:57:01Z)
- Prompt2Model: Generating Deployable Models from Natural Language Instructions [74.19816829003729]
Large language models (LLMs) enable system builders to create competent NLP systems through prompting.
In other ways, LLMs are a step backward from traditional special-purpose NLP models.
We propose Prompt2Model, a general-purpose method that takes a natural language task description like the prompts provided to LLMs.
arXiv Detail & Related papers (2023-08-23T17:28:21Z)
- Augmenting Interpretable Models with LLMs during Training [73.40079895413861]
We propose Augmented Interpretable Models (Aug-imodels) to build efficient and interpretable models.
Aug-imodels use LLMs during fitting but not during inference, allowing complete transparency.
We explore two instantiations of Aug-imodels in natural-language processing: (i) Aug-GAM, which augments a generalized additive model with decoupled embeddings from an LLM and (ii) Aug-Tree, which augments a decision tree with LLM feature expansions.
arXiv Detail & Related papers (2022-09-23T18:36:01Z)