Opening up ChatGPT: Tracking openness, transparency, and accountability
in instruction-tuned text generators
- URL: http://arxiv.org/abs/2307.05532v1
- Date: Sat, 8 Jul 2023 07:08:20 GMT
- Title: Opening up ChatGPT: Tracking openness, transparency, and accountability
in instruction-tuned text generators
- Authors: Andreas Liesenfeld, Alianda Lopez, Mark Dingemanse
- Abstract summary: We evaluate projects in terms of openness of code, training data, model weights, RLHF data, licensing, scientific documentation, and access methods.
We find that while there is a fast-growing list of projects billing themselves as 'open source', many inherit undocumented data of dubious legality.
Degrees of openness are relevant to fairness and accountability at all points.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models that exhibit instruction-following behaviour represent
one of the biggest recent upheavals in conversational interfaces, a trend in
large part fuelled by the release of OpenAI's ChatGPT, a proprietary large
language model for text generation fine-tuned through reinforcement learning
from human feedback (LLM+RLHF). We review the risks of relying on proprietary
software and survey the first crop of open-source projects of comparable
architecture and functionality. The main contribution of this paper is to show
that openness is differentiated, and to offer scientific documentation of
degrees of openness in this fast-moving field. We evaluate projects in terms of
openness of code, training data, model weights, RLHF data, licensing,
scientific documentation, and access methods. We find that while there is a
fast-growing list of projects billing themselves as 'open source', many inherit
undocumented data of dubious legality, few share the all-important
instruction-tuning (a key site where human annotation labour is involved), and
careful scientific documentation is exceedingly rare. Degrees of openness are
relevant to fairness and accountability at all points, from data collection and
curation to model architecture, and from training and fine-tuning to release
and deployment.
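To make the evaluated dimensions concrete, the following is a minimal sketch of how the openness criteria named in the abstract could be recorded per project. The dimension names follow the abstract; the three-level scale, the aggregation into a single score, and the example entry are illustrative assumptions, not the paper's actual instrument or its ratings.

```python
# Sketch of a per-project openness rubric using the dimensions named in the
# abstract. The "open"/"partial"/"closed" scale and the scoring aggregation
# are assumptions for illustration, not the paper's methodology.
from dataclasses import dataclass, field

DIMENSIONS = (
    "code",
    "training_data",
    "model_weights",
    "rlhf_data",
    "licensing",
    "scientific_documentation",
    "access_methods",
)

LEVELS = {"open": 2, "partial": 1, "closed": 0}  # assumed ordinal scale


@dataclass
class ProjectOpenness:
    name: str
    ratings: dict = field(default_factory=dict)  # dimension -> level name

    def score(self) -> float:
        """Fraction of maximum attainable openness (assumed aggregation)."""
        if not self.ratings:
            return 0.0
        total = sum(LEVELS[v] for v in self.ratings.values())
        return total / (len(DIMENSIONS) * max(LEVELS.values()))


# Hypothetical entry for illustration only; ratings are not taken from the paper.
example = ProjectOpenness(
    name="ExampleLLM+RLHF",
    ratings={d: "partial" for d in DIMENSIONS},
)
print(f"{example.name}: openness {example.score():.2f}")
```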
Related papers
- OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models [70.72097493954067]
Large language models (LLMs) for code have become indispensable in various domains, including code generation, reasoning tasks, and agent systems.
We introduce OpenCoder, a top-tier code LLM that not only achieves performance comparable to leading models but also serves as an 'open cookbook' for the research community.
arXiv Detail & Related papers (2024-11-07T17:47:25Z)
- OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models [61.14336781917986]
We introduce OpenR, an open-source framework for enhancing the reasoning capabilities of large language models (LLMs)
OpenR unifies data acquisition, reinforcement learning training, and non-autoregressive decoding into a cohesive software platform.
Our work is the first to provide an open-source framework that explores the core techniques of OpenAI's o1 model with reinforcement learning.
arXiv Detail & Related papers (2024-10-12T23:42:16Z)
- In-Context Code-Text Learning for Bimodal Software Engineering [26.0027882745058]
Bimodal software analysis initially appeared to be within reach with the advent of large language models.
We postulate that in-context learning for the code-text bimodality is a promising avenue.
We consider a diverse dataset encompassing 23 software engineering tasks, which we transform into an in-context learning format.
arXiv Detail & Related papers (2024-10-08T19:42:00Z)
- Building and better understanding vision-language models: insights and future directions [8.230565679484128]
This paper provides a comprehensive overview of the current state-of-the-art approaches to vision-language models.
We then walk through the practical steps to build Idefics3-8B, a powerful VLM that significantly outperforms its predecessor Idefics2-8B.
We release the model along with the datasets created for its training.
arXiv Detail & Related papers (2024-08-22T17:47:24Z)
- Query of CC: Unearthing Large Scale Domain-Specific Knowledge from Public Corpora [104.16648246740543]
We propose an efficient data collection method based on large language models.
The method bootstraps seed information through a large language model and retrieves related data from public corpora.
It not only collects knowledge-related data for specific domains but also unearths data with potential reasoning procedures.
arXiv Detail & Related papers (2024-01-26T03:38:23Z)
- SoTaNa: The Open-Source Software Development Assistant [81.86136560157266]
SoTaNa is an open-source software development assistant.
It generates high-quality instruction-based data for the domain of software engineering.
It employs a parameter-efficient fine-tuning approach to enhance the open-source foundation model, LLaMA.
arXiv Detail & Related papers (2023-08-25T14:56:21Z)
- Generate rather than Retrieve: Large Language Models are Strong Context Generators [74.87021992611672]
We present a novel perspective for solving knowledge-intensive tasks by replacing document retrievers with large language model generators.
We call our method generate-then-read (GenRead), which first prompts a large language model to generate contextual documents based on a given question, and then reads the generated documents to produce the final answer (a sketch of this two-step flow appears after this list).
arXiv Detail & Related papers (2022-09-21T01:30:59Z)
- A Survey on Open Information Extraction from Rule-based Model to Large Language Model [29.017823043117144]
Open Information Extraction (OpenIE) represents a crucial NLP task aimed at deriving structured information from unstructured text.
This survey paper provides an overview of OpenIE technologies spanning from 2007 to 2024, emphasizing a chronological perspective.
The paper categorizes OpenIE approaches into rule-based, neural, and pre-trained large language models, discussing each within a chronological framework.
arXiv Detail & Related papers (2022-08-18T08:03:45Z)
- ENT-DESC: Entity Description Generation by Exploring Knowledge Graph [53.03778194567752]
In practice, the input knowledge can exceed what is needed, since the output description may cover only the most significant knowledge.
We introduce a large-scale and challenging dataset to facilitate the study of such a practical scenario in KG-to-text.
We propose a multi-graph structure that is able to represent the original graph information more comprehensively.
arXiv Detail & Related papers (2020-04-30T14:16:19Z)
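As referenced in the GenRead entry above, the following is a minimal sketch of the generate-then-read pattern: prompt a language model to generate a contextual document for a question, then read that document to produce the answer. The `complete` function is a placeholder for whatever text-generation call you use; it is an assumption for illustration, not an API from the cited paper.

```python
# Minimal sketch of generate-then-read (GenRead), assuming a generic
# text-generation function. Replace `complete` with a real model call.

def complete(prompt: str) -> str:
    """Stand-in for a call to a large language model (placeholder output)."""
    return "(model output would appear here)"


def generate_then_read(question: str) -> str:
    # Step 1: generate a contextual document instead of retrieving one.
    context = complete(
        "Generate a background document that helps answer the question.\n"
        f"Question: {question}\nDocument:"
    )
    # Step 2: read the generated document and produce the final answer.
    answer = complete(
        f"Document: {context}\nQuestion: {question}\nAnswer:"
    )
    return answer


print(generate_then_read("Who introduced reinforcement learning from human feedback?"))
```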