H2O Open Ecosystem for State-of-the-art Large Language Models
- URL: http://arxiv.org/abs/2310.13012v2
- Date: Mon, 23 Oct 2023 10:24:55 GMT
- Title: H2O Open Ecosystem for State-of-the-art Large Language Models
- Authors: Arno Candel, Jon McKinney, Philipp Singer, Pascal Pfeiffer, Maximilian
Jeblick, Chun Ming Lee, Marcos V. Conde
- Abstract summary: Large Language Models (LLMs) represent a revolution in AI.
They also pose many significant risks, such as the presence of biased, private, copyrighted or harmful text.
We introduce a complete open-source ecosystem for developing and testing LLMs.
- Score: 10.04351591653126
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Large Language Models (LLMs) represent a revolution in AI. However, they also
pose many significant risks, such as the presence of biased, private,
copyrighted or harmful text. For this reason we need open, transparent and safe
solutions. We introduce a complete open-source ecosystem for developing and
testing LLMs. The goal of this project is to boost open alternatives to
closed-source approaches. We release h2oGPT, a family of fine-tuned LLMs of
diverse sizes. We also introduce H2O LLM Studio, a framework and no-code GUI
designed for efficient fine-tuning, evaluation, and deployment of LLMs using
the most recent state-of-the-art techniques. Our code and models are fully
open-source. We believe this work helps to boost AI development and make it
more accessible, efficient and trustworthy. The demo is available at:
https://gpt.h2o.ai/
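Since the h2oGPT checkpoints are released openly on the Hugging Face Hub, they can be loaded with standard tooling. Below is a minimal sketch using the transformers library; the model ID is illustrative and the exact checkpoint names may differ from those released by the authors:

```python
# Minimal sketch: load an open h2oGPT checkpoint with Hugging Face transformers
# and generate a response. The model ID below is illustrative; substitute any
# of the released h2oGPT checkpoints from the Hugging Face Hub.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "h2oai/h2ogpt-4096-llama2-7b-chat"  # illustrative checkpoint name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision to fit on a single GPU
    device_map="auto",
)

prompt = "Summarize the goals of open-source LLM ecosystems in two sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The same checkpoints can also be fine-tuned and evaluated through H2O LLM Studio or served via the h2oGPT repositories, as described in the abstract.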
Related papers
- MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series [86.31735321970481]
We open-source MAP-Neo, a bilingual language model with 7B parameters trained from scratch on 4.5T high-quality tokens.
Our MAP-Neo is the first fully open-sourced bilingual LLM with performance comparable to existing state-of-the-art LLMs.
arXiv Detail & Related papers (2024-05-29T17:57:16Z)
- YAYI 2: Multilingual Open-Source Large Language Models [53.92832054643197]
We propose YAYI 2, including both base and chat models, with 30 billion parameters.
YAYI 2 is pre-trained from scratch on a multilingual corpus which contains 2.65 trillion tokens filtered by our pre-training data processing pipeline.
The base model is aligned with human values through supervised fine-tuning with millions of instructions and reinforcement learning from human feedback.
arXiv Detail & Related papers (2023-12-22T17:34:47Z)
- TigerBot: An Open Multilingual Multitask LLM [7.413477227090228]
We release and introduce the TigerBot family of large language models (LLMs).
We develop our models starting from Llama-2 and BLOOM, and push the boundary further in data, training algorithms, infrastructure, and application tools.
The TigerBot model family also achieves leading performance on major academic and industrial benchmarks and leaderboards.
arXiv Detail & Related papers (2023-12-14T07:05:42Z)
- LLM360: Towards Fully Transparent Open-Source LLMs [89.05970416013403]
The goal of LLM360 is to support open and collaborative AI research by making the end-to-end training process transparent and reproducible by everyone.
As a first step of LLM360, we release two 7B parameter LLMs pre-trained from scratch, Amber and CrystalCoder, including their training code, data, intermediate checkpoints, and analyses.
arXiv Detail & Related papers (2023-12-11T17:39:00Z)
- A Wolf in Sheep's Clothing: Generalized Nested Jailbreak Prompts can Fool Large Language Models Easily [51.63085197162279]
Large Language Models (LLMs) are designed to provide useful and safe responses.
However, adversarial prompts known as 'jailbreaks' can circumvent safeguards.
We propose ReNeLLM, an automatic framework that leverages LLMs themselves to generate effective jailbreak prompts.
arXiv Detail & Related papers (2023-11-14T16:02:16Z)
- On the Safety of Open-Sourced Large Language Models: Does Alignment Really Prevent Them From Being Misused? [49.99955642001019]
We show that open-sourced, aligned large language models can be easily misguided into generating undesired content.
Our key idea is to directly manipulate the generation process of open-sourced LLMs to steer them toward such content.
arXiv Detail & Related papers (2023-10-02T19:22:01Z)
- h2oGPT: Democratizing Large Language Models [1.8043055303852882]
We introduce h2oGPT, a suite of open-source code repositories for the creation and use of Large Language Models.
The goal of this project is to create the world's best truly open-source alternative to closed-source approaches.
arXiv Detail & Related papers (2023-06-13T22:19:53Z)
- A Systematic Evaluation of Large Language Models of Code [88.34057460577957]
Large language models (LMs) of code have recently shown tremendous promise in completing code and synthesizing code from natural language descriptions.
The current state-of-the-art code LMs are not publicly available, leaving many questions about their model and data design decisions.
Although Codex is not open-source, we find that existing open-source models do achieve close results in some programming languages.
We release a new model, PolyCoder, with 2.7B parameters based on the GPT-2 architecture, which was trained on 249GB of code across 12 programming languages on a single machine.
arXiv Detail & Related papers (2022-02-26T15:53:55Z)
- Can OpenAI Codex and Other Large Language Models Help Us Fix Security Bugs? [8.285068188878578]
We examine the use of large language models (LLMs) for code repair.
We investigate challenges in the design of prompts that coax LLMs into generating repaired versions of insecure code.
Experiments show that LLMs could collectively repair 100% of our synthetically generated and hand-crafted scenarios.
arXiv Detail & Related papers (2021-12-03T19:15:02Z)