Large Language Model (LLM) developers have no concern about open sourcing their algorithms
Large Language Model (LLM) developers such as OpenAI have no concern about open sourcing their algorithms. 

 

Nor is proprietary data their advantage. The data comes from very public collections such as Wikipedia, Reddit, GitHub, and open web crawls, and open source versions of this data are available to download (see The Pile).

 

For a little while, it was not terribly hard for third parties to participate. We at East Agile spent a year training and fine-tuning specialized versions of GPT models using the same structure and the same types of data as GPT-2, though we could only afford to build small versions of even GPT-2. Then GPT-3, and the larger models that followed from the likes of Google, Nvidia, and Microsoft, left companies like ours in the dust.

 

The key moat is access to the computing infrastructure and the piles of cash required to train the models. Developing a GPT-3 type model might easily cost $5 million in compute each week. Smaller versions with only 70 billion parameters instead of 175 billion cost $2.5 million to train once. By the end of the process, a final model build might cost only $5 million, but that comes after countless iterations, and it is hard to imagine a newcomer getting it right on the very first run. GPT-3.5 layers a great deal of expensive manual labelling on top of this process, and GPT-4 will possibly cost ten times as much. Table stakes in this game are $100 million. But beware: as of just a week ago, OpenAI is playing with $10 billion in new funding from Microsoft.
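
To make the arithmetic concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the dollar figures quoted above. The function names and the simple model of "some weeks of iteration plus one final build" are assumptions made for illustration, not an account of how any lab actually budgets.

```python
# Figures quoted in the article (USD). The budgeting model itself is a
# hypothetical simplification: N weeks of iteration, then one final build.
WEEKLY_COMPUTE_COST = 5_000_000   # "$5 million in compute each week"
FINAL_BUILD_COST = 5_000_000      # "a final model build might cost only $5 million"
TABLE_STAKES = 100_000_000        # "Table stakes in this game are $100 million"


def total_training_budget(experiment_weeks: int) -> int:
    """Compute spend for a period of iteration followed by one final build."""
    if experiment_weeks < 0:
        raise ValueError("experiment_weeks must be non-negative")
    return experiment_weeks * WEEKLY_COMPUTE_COST + FINAL_BUILD_COST


def weeks_until_table_stakes() -> int:
    """Whole weeks of iteration that fit under table stakes, leaving room for the final build."""
    return (TABLE_STAKES - FINAL_BUILD_COST) // WEEKLY_COMPUTE_COST


if __name__ == "__main__":
    weeks = weeks_until_table_stakes()
    print(f"{weeks} weeks of iteration, total ${total_training_budget(weeks):,}")
```

Under these assumptions, $100 million buys about 19 weeks of iteration plus one final build, which is why the final build cost alone understates the price of entry.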
 
