AI Developers Release Open-Source Implementations of ChatGPT Training Algorithm

Published on Jan 25, 2023

Researchers from LAION and CarperAI have released OpenAssistant and trlX, open-source implementations of reinforcement learning from human feedback (RLHF), the algorithm used to train ChatGPT. Independent AI developer Phil Wang has also released an open-source implementation of the algorithm.

LAION, the Large-scale Artificial Intelligence Open Network, is a non-profit machine learning research organization dedicated to making AI models, datasets, and code available to the general public. In 2022, LSET covered LAION’s release of LAION-5B, a dataset containing over five billion image-text pairs. LAION’s latest project is OpenAssistant, which aims to give everyone access to a great chat-based large language model. The MVP implementation of OpenAssistant will be based on OpenAI’s InstructGPT paper and will consist of three parts: a dataset of human-generated instructions, a dataset of machine-generated responses with human rankings, and an implementation of RLHF.
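To make the role of the human rankings concrete, here is a minimal sketch of how ranked responses are commonly turned into a training signal for a reward model, using the pairwise ranking loss described in the InstructGPT paper. It is an illustration only, not code from OpenAssistant, trlX, or PaLM + RLHF; the function names and the example scores are hypothetical.

```python
import math
from itertools import combinations


def ranking_to_pairs(ranked_responses):
    """Expand a best-to-worst ranking into (preferred, rejected) pairs.

    combinations() preserves input order, so the first element of each
    pair is always the response that the human ranked higher.
    """
    return list(combinations(ranked_responses, 2))


def pairwise_loss(score_preferred, score_rejected):
    """Compute -log(sigmoid(score_preferred - score_rejected)).

    The loss is small when the reward model scores the preferred response
    well above the rejected one, and large when it gets the order wrong.
    Written in the numerically stable softplus form.
    """
    diff = score_preferred - score_rejected
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))


def mean_ranking_loss(ranked_responses, scores):
    """Average the pairwise loss over every pair implied by one ranking."""
    pairs = ranking_to_pairs(ranked_responses)
    losses = [pairwise_loss(scores[a], scores[b]) for a, b in pairs]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Hypothetical data: three responses ranked best to worst by a labeler,
    # and made-up scalar scores standing in for a reward model's output.
    ranking = ["response_a", "response_b", "response_c"]
    scores = {"response_a": 1.5, "response_b": 0.2, "response_c": -0.7}
    print(mean_ranking_loss(ranking, scores))
```

In a full RLHF pipeline, a reward model trained with a loss of this kind supplies the reward that the reinforcement learning step then optimizes the language model against.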

CarperAI is a lab within the EleutherAI research group; LSET previously covered EleutherAI’s development of the open-source language model GPT-NeoX. In October 2022, CarperAI announced that it would train and publicly release “instruction-tuned” models using RLHF. Several organizations are collaborating on the project, including HuggingFace, Scale, and Humanloop. As part of this project, CarperAI open-sourced trlX, a framework for fine-tuning HuggingFace language models using RLHF.

In his presentation, Phil Wang, an AI developer well known for open-source implementations of deep learning research models such as Imagen and Make-A-Video, discussed PaLM + RLHF, his work-in-progress implementation of RLHF for the PaLM language model. Wang emphasized that the project does not include a pre-trained model, only a framework that users can use to train their own models. He also recommended that users interested in replicating ChatGPT join the LAION Discord channel.

Although these open-source projects implement ChatGPT’s training methods, none of them currently has a trained model available. According to Wang’s project FAQ, training might require “millions of dollars of computing power and data”. LAION’s roadmap for OpenAssistant lists efforts to collect data and train models, but it is unclear when trained models will be available.

The AI community has discussed these efforts on social media. HuggingFace’s CTO Julien Chaumond predicted that there will be ten open replications of ChatGPT within six months.

