The first open source equivalent of OpenAI’s ChatGPT has arrived, but good luck running it on your laptop — or at all.
This week, the developer who reverse-engineered Meta's closed-source AI systems such as Make-A-Video released PaLM + RLHF, a text-generating model that behaves similarly to ChatGPT. The system combines PaLM, a large language model from Google, with a technique called Reinforcement Learning with Human Feedback (RLHF, for short) to create a system that can accomplish pretty much any task ChatGPT can, including drafting emails and suggesting computer code.
But PaLM + RLHF is not pre-trained. That is, the system has not been trained on the example data from the web that it needs to actually work. Downloading PaLM + RLHF won't magically install a ChatGPT-like experience: that would require compiling gigabytes of text from which the model could learn and finding hardware powerful enough to handle the training load.
Like ChatGPT, PaLM + RLHF is essentially a statistical word predictor. Fed an enormous number of examples from training data (for instance, posts from Reddit, news articles, and e-books), it learns how likely words are to occur given patterns such as the semantic context of the surrounding text.
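To make the word-prediction idea concrete, here is a deliberately tiny sketch in Python. It is not how PaLM works internally (PaLM is a large neural network); it simply counts which word follows which in a made-up three-sentence corpus and turns those counts into probabilities. The function names and the corpus are invented for illustration.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def next_word_probs(counts, prev):
    """Turn raw follow-up counts into a probability distribution."""
    followers = counts.get(prev.lower(), Counter())
    total = sum(followers.values())
    if total == 0:
        return {}
    return {word: n / total for word, n in followers.items()}

# Tiny made-up corpus, purely for illustration.
corpus = [
    "the model predicts the next word",
    "the model learns from text",
    "the model depends on context",
]

counts = train_bigram(corpus)
probs = next_word_probs(counts, "the")
print(probs)  # {'model': 0.75, 'next': 0.25}
```

A real language model conditions on far more than the single preceding word, but the principle is the same: more training text yields better estimates of which words are likely to come next.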
ChatGPT and PaLM + RLHF share a special sauce in Reinforcement Learning with Human Feedback, a technique that aims to align language models more closely with what users want them to do. RLHF involves training a language model (in PaLM + RLHF's case, PaLM) and fine-tuning it on a dataset containing prompts such as "Explain machine learning to a six-year-old" paired with what human volunteers expect the model to say, for instance "Machine learning is a form of AI." Those prompts are then fed into the fine-tuned model, which generates several responses, and the volunteers rank all of them from best to worst. Finally, the rankings are used to train a "reward model," which takes the original model's responses and sorts them in order of preference, filtering for the top answers to any particular prompt.
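The ranking step can be sketched in a few lines of Python. This is a toy under stated assumptions, not the PaLM + RLHF code: each response is reduced to two made-up numeric features (a "clarity" score and a "rambling" score), the reward model is a simple linear scorer, and it is trained with a pairwise logistic loss, which is one common way to learn from rankings. Real reward models are neural networks that read the text itself.

```python
import math
from itertools import combinations

def ranking_to_pairs(ranked):
    """ranked is ordered best to worst; return (preferred, rejected) pairs."""
    return list(combinations(ranked, 2))

def reward(weights, features):
    """Linear reward: a weighted sum of the response's features."""
    return sum(w * x for w, x in zip(weights, features))

def train_reward_model(pairs, dim, lr=0.5, epochs=200):
    """Fit weights so preferred responses score higher than rejected ones."""
    weights = [0.0] * dim
    for _ in range(epochs):
        for better, worse in pairs:
            margin = reward(weights, better) - reward(weights, worse)
            # Gradient step on -log(sigmoid(margin)).
            scale = 1.0 - 1.0 / (1.0 + math.exp(-margin))
            for i in range(dim):
                weights[i] += lr * scale * (better[i] - worse[i])
    return weights

# Hypothetical features per response: (clarity, rambling).
# The list order is the volunteers' ranking, best first.
ranked_by_volunteers = [(0.9, 0.2), (0.6, 0.5), (0.2, 0.9)]

weights = train_reward_model(ranking_to_pairs(ranked_by_volunteers), dim=2)

# The trained reward model can now sort responses it has not seen before.
new_responses = [(0.3, 0.8), (0.8, 0.1), (0.5, 0.5)]
sorted_responses = sorted(
    new_responses, key=lambda f: reward(weights, f), reverse=True
)
print(sorted_responses[0])  # (0.8, 0.1)
```

The point of the sketch is the data flow: human rankings become preference pairs, the pairs train a scorer, and the scorer then orders new responses without a human in the loop.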
Collecting the training data is expensive, and training itself isn't cheap. PaLM has 540 billion parameters, "parameters" referring to the parts of the language model learned from the training data. A 2020 study pegged the cost of developing a text-generating model with only 1.5 billion parameters at as much as $1.6 million. And training the open-source model Bloom, which has 176 billion parameters, took three months using 384 Nvidia A100 GPUs; a single A100 costs thousands of dollars.
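For a sense of scale, the Bloom figures can be turned into GPU-hours with some back-of-the-envelope arithmetic. The sketch below assumes 30-day months and round-the-clock use of all 384 GPUs; the hourly rate is a placeholder, not a real price, so substitute an actual quote to get a dollar estimate.

```python
def gpu_hours(num_gpus, days, hours_per_day=24):
    """Total GPU-hours if every GPU runs for the whole period."""
    return num_gpus * days * hours_per_day

def rental_cost(total_gpu_hours, usd_per_gpu_hour):
    """Cost of renting that much GPU time at a flat hourly rate."""
    return total_gpu_hours * usd_per_gpu_hour

# Bloom: 384 A100s for three months (assumed here to be 90 days).
bloom_gpu_hours = gpu_hours(num_gpus=384, days=90)
print(bloom_gpu_hours)  # 829440

# Placeholder rate for illustration only; not a quoted price.
PLACEHOLDER_USD_PER_GPU_HOUR = 1.00
print(rental_cost(bloom_gpu_hours, PLACEHOLDER_USD_PER_GPU_HOUR))  # 829440.0
```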
Running a trained model of PaLM + RLHF’s size isn’t trivial, either. Bloom requires a dedicated PC with around eight A100 GPUs. Cloud alternatives are pricey, with back-of-the-envelope math finding the cost of running OpenAI’s text-generating GPT-3 — which has around 175 billion parameters — on a single Amazon Web Services instance to be around $87,000 per year.
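Two quick calculations put those numbers in perspective. The sketch below converts the $87,000 annual figure into an hourly rate, and estimates how much memory Bloom's weights alone would occupy. The memory estimate rests on assumptions not stated in the reporting: weights stored at 16 bits (2 bytes) each, the 80 GB variant of the A100, and no allowance for anything other than the weights.

```python
HOURS_PER_YEAR = 365 * 24

def hourly_cost(annual_usd):
    """Spread an annual bill evenly over every hour of the year."""
    return annual_usd / HOURS_PER_YEAR

def weights_memory_gb(params_billions, bytes_per_param=2):
    """Memory for the weights alone; 1 billion params x 1 byte = 1 GB."""
    return params_billions * bytes_per_param

gpt3_hourly = hourly_cost(87_000)
print(round(gpt3_hourly, 2))  # 9.93

# Assumption: 16-bit weights, so 2 bytes per parameter.
bloom_weights_gb = weights_memory_gb(176)
print(bloom_weights_gb)  # 352

# Assumption: eight A100s of the 80 GB variant.
ASSUMED_GPU_MEMORY_GB = 80
total_gpu_memory_gb = 8 * ASSUMED_GPU_MEMORY_GB
print(total_gpu_memory_gb)  # 640
```

Under these assumptions the weights alone are far too large for any single GPU, which is consistent with the need for a multi-GPU machine.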
Sebastian Raschka, an AI researcher, pointed out in a LinkedIn post about PaLM + RLHF that scaling up the necessary dev workflows could prove tricky as well. "Even if someone hands you 500 GPUs to train this model, you still have to deal with infrastructure and have a software framework that can handle that," he said. "It's obviously possible, but it's a huge effort at the moment (of course, we are developing frameworks to make that simpler, but it's still not trivial, yet)."
That is to say, PaLM + RLHF won't replace ChatGPT today, unless a well-funded venture (or person) goes to the trouble of training it and making it publicly available.
Meanwhile, several other efforts to replicate ChatGPT are making good progress. Among them is one led by a research group called CarperAI. Working with the open AI research organization EleutherAI and the startups Scale AI and Hugging Face, CarperAI plans to release the first ready-to-run, ChatGPT-like AI model trained with human feedback.
LAION, the nonprofit that provided the initial dataset used to train Stable Diffusion, is also leading an effort to replicate ChatGPT using the newest machine learning techniques. Ambitiously, LAION aims to build "an assistant of the future," one that does not merely write emails and cover letters but "does meaningful work, uses APIs, dynamically researches information, and much more." The project is in its early stages, but a GitHub page with resources for it went live a few weeks ago.