Silicon Valley AI company Cerebras released seven open source GPT models to provide an alternative to the tightly controlled and proprietary systems available today.
The royalty-free open source GPT models, including the weights and training recipe, have been released under the highly permissive Apache 2.0 license by Cerebras, a Silicon Valley-based company that builds infrastructure for AI applications.
To a certain extent, the seven GPT models are a proof of concept for the Cerebras Andromeda AI supercomputer.
The Cerebras infrastructure allows their customers, like Jasper AI Copywriter, to quickly train their own custom language models.
A Cerebras blog post about the hardware technology noted:
“We trained all Cerebras-GPT models on a 16x CS-2 Cerebras Wafer-Scale Cluster called Andromeda.
The cluster enabled all experiments to be completed quickly, without the traditional distributed systems engineering and model parallel tuning needed on GPU clusters.
Most importantly, it enabled our researchers to focus on the design of the ML instead of the distributed system. We believe the capability to easily train large models is a key enabler for the broad community, so we have made the Cerebras Wafer-Scale Cluster available on the cloud through the Cerebras AI Model Studio.”
Cerebras GPT Models and Transparency
Cerebras cites the concentration of AI technology ownership in just a few companies as a reason for creating seven open source GPT models.
OpenAI, Meta and DeepMind keep a large amount of information about their systems private and tightly controlled, which limits innovation to whatever the three corporations decide others can do with their data.
Is a closed-source system best for innovation in AI? Or is open source the future?
Cerebras writes:
“For LLMs to be an open and accessible technology, we believe it’s important to have access to state-of-the-art models that are open, reproducible, and royalty free for both research and commercial applications.
To that end, we have trained a family of transformer models using the latest techniques and open datasets that we call Cerebras-GPT.
These models are the first family of GPT models trained using the Chinchilla formula and released via the Apache 2.0 license.”
To that end, the seven models have been released on Hugging Face and GitHub to encourage more research through open access to AI technology.
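For readers who want to try one of the released models, the following is a minimal sketch of loading a checkpoint from Hugging Face with the `transformers` library. The repository naming pattern `cerebras/Cerebras-GPT-<size>` and the helper names `repo_id` and `generate` are assumptions made for illustration, not something stated in the announcement, so check the Cerebras page on Hugging Face before relying on them.

```python
# Sketch: load a Cerebras-GPT checkpoint and generate a short completion.
# Assumption: the models are published as "cerebras/Cerebras-GPT-<size>".

SIZES = ("111M", "256M", "590M", "1.3B", "2.7B", "6.7B", "13B")


def repo_id(size: str) -> str:
    """Build the (assumed) Hugging Face repository name for a model size."""
    if size not in SIZES:
        raise ValueError(f"unknown size {size!r}; expected one of {SIZES}")
    return f"cerebras/Cerebras-GPT-{size}"


def generate(prompt: str, size: str = "111M", max_new_tokens: int = 30) -> str:
    """Download the model (on first use) and return the prompt plus a completion."""
    # Imported here so the helper above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = repo_id(size)
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example (downloads weights on first call, requires torch and transformers):
# print(generate("Generative AI is "))
```

The smallest model, 111M, is the default here because it is the quickest to download and can run on a CPU. The larger sizes need considerably more memory.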
These models were trained with Cerebras’ Andromeda AI supercomputer, a process that only took weeks to accomplish.
Cerebras-GPT is fully open and transparent, unlike the latest models from OpenAI (GPT-4), DeepMind (Chinchilla) and Meta (OPT).
OpenAI and DeepMind do not offer licenses to use GPT-4 and Chinchilla. Meta offers OPT only under a non-commercial license.
OpenAI offers no transparency at all about GPT-4’s training data. Did they use Common Crawl data? Did they scrape the Internet and create their own dataset?
OpenAI is keeping this information (and more) secret, which is in contrast to the Cerebras-GPT approach that is fully transparent.
The following is all open and transparent:
- Model architecture
- Training data
- Model weights
- Checkpoints
- Compute-optimal training status (yes)
- License to use: Apache 2.0 License
The seven models come in sizes of 111M, 256M, 590M, 1.3B, 2.7B, 6.7B, and 13B parameters.
Cerebras announced:
“In a first among AI hardware companies, Cerebras researchers trained, on the Andromeda AI supercomputer, a series of seven GPT models with 111M, 256M, 590M, 1.3B, 2.7B, 6.7B, and 13B parameters.
Typically a multi-month undertaking, this work was completed in a few weeks thanks to the incredible speed of the Cerebras CS-2 systems that make up Andromeda, and the ability of Cerebras’ weight streaming architecture to eliminate the pain of distributed compute.
These results demonstrate that Cerebras’ systems can train the largest and most complex AI workloads today.
This is the first time a suite of GPT models, trained using state-of-the-art training efficiency techniques, has been made public.
These models are trained to the highest accuracy for a given compute budget (i.e. training efficient using the Chinchilla recipe) so they have lower training time, lower training cost, and use less energy than any existing public models.”
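To make the “Chinchilla recipe” concrete, here is a rough sketch of the token budget it implies for each of the seven model sizes. It assumes the commonly cited rule of thumb of about 20 training tokens per parameter and the standard approximation of 6 × parameters × tokens for training compute. Both figures are assumptions used for illustration, not numbers taken from the announcement, and the names in the code are hypothetical.

```python
# Sketch: compute-optimal token budgets under a Chinchilla-style rule of thumb.
# Assumption: ~20 training tokens per model parameter.
# Assumption: training compute is roughly 6 * parameters * tokens (FLOPs).

PARAMS = {
    "111M": 111_000_000,
    "256M": 256_000_000,
    "590M": 590_000_000,
    "1.3B": 1_300_000_000,
    "2.7B": 2_700_000_000,
    "6.7B": 6_700_000_000,
    "13B": 13_000_000_000,
}

TOKENS_PER_PARAM = 20  # rule-of-thumb ratio, not an exact constant


def token_budget(n_params: int) -> int:
    """Approximate number of training tokens for a compute-optimal run."""
    return n_params * TOKENS_PER_PARAM


def training_flops(n_params: int) -> int:
    """Approximate training compute using the 6 * N * D estimate."""
    return 6 * n_params * token_budget(n_params)


for name, n in PARAMS.items():
    tokens_b = token_budget(n) / 1e9
    print(f"{name:>5}: {tokens_b:7.1f}B tokens, ~{training_flops(n):.2e} FLOPs")
```

Under these assumptions the 13B model would be trained on roughly 260 billion tokens. Because compute grows with both model size and token count, each step up in size costs far more than the parameter count alone suggests.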
Open Source AI
The Mozilla Foundation, maker of the open source browser Firefox, has started a company called Mozilla.ai to build open source GPT and recommender systems that are trustworthy and respect privacy.
Databricks also recently released an open source GPT clone called Dolly, which aims to democratize “the magic of ChatGPT.”
In addition to the seven Cerebras-GPT models, another company, Nomic AI, released GPT4All, an open source GPT that can run on a laptop.
Nomic AI (@nomic_ai) announced on Twitter on March 28, 2023:
“Today we’re releasing GPT4All, an assistant-style chatbot distilled from 430k GPT-3.5-Turbo outputs that you can run on your laptop.”
The open source AI movement is at a nascent stage but is gaining momentum.
GPT technology is driving massive changes across industries, and it’s possible, maybe inevitable, that open source contributions will change the face of the industries driving that change.
If the open source movement keeps advancing at this pace, we may be on the cusp of witnessing a shift in AI innovation that keeps it from concentrating in the hands of a few corporations.
Read the official announcement:
Cerebras Systems Releases Seven New GPT Models Trained on CS-2 Wafer-Scale Systems
Featured image by Shutterstock/Merkushev Vasiliy