GPT-3 and the challenge for startups
OpenAI’s release of the GPT-3 API has now resulted in the first wave of GPT-3-powered services, many of which are quite useful and seem to have good initial traction. Examples include services that generate fantasy game scenarios, write marketing copy, and help you better process and understand customer requests. And GitHub, in partnership with OpenAI, just announced a code-completion and assistance service based on a specialized descendant of GPT-3.
We’re excited about the possibilities of building on these “large language models”, but we also think there are important questions that startup founders will need to think through given the unique business dynamics at play. In other words, state-of-the-art natural language processing (NLP) models will play a very important role in our future, but are we at the stage where they can serve as the foundation for startups that can become large businesses?
For context, it’s important to note that large language models, of which GPT-3 is an example, for now have two important characteristics:
They require very specialized AI expertise to build, train and maintain
Training and running them requires very large amounts of compute power (and capital)
Given these characteristics, for the foreseeable future only a very small number of organizations will be capable of building, hosting and offering such models. In fact, OpenAI (with its partner, Microsoft) is the only organization that has made a model like this publicly available so far.
So what are the key questions for startups building on GPT-3? Given that, at least in this initial wave, GPT-3 is arguably the core IP behind most of these services, three stand out:
How do you differentiate?
How do you build a sustainable moat?
What about platform risk?
Of course, these are not new questions; in fact, they are business 101. But we think the answers matter even more than usual given the dynamics at play.
Differentiation and Building a Moat
Currently, GPT-3 is a black box: you give it a “prompt” and it sends back a “completion”. The prompt is both a way to tune the model towards your application and an explicit way to tell it what you want. Actual fine-tuning functionality (giving it labeled data to further train the model for your specific application) may be coming soon. Given this design, there are three possible avenues for differentiation:
Prompt and fine-tuning optimization - This means gathering the necessary data and structuring it in a way that gives GPT-3 the information it needs to work its magic. This is, in effect, the only way today to “program” GPT-3 and influence its results. Is there enough opportunity to build out IP “special sauce” in prompt generation and/or fine-tuning data, or is this something that can be trivially replicated? If the former, what forms will this take?
Completion optimization - Once GPT-3 returns its results, it’s possible to transform them further. Is there important IP to develop here? Perhaps. One can imagine scenarios where GPT-3 generates new content or does a transformation (e.g., summarization) that sets the stage for other important, perhaps more specialized, NLP-based tasks. Then again, could GPT-3 (or a more capable successor) also have done the second task had it been included in the prompt?
User experience - We think there is an opportunity for significant innovation in the UX around these experiences, especially when GPT-3 is used to generate new content. That’s because the most interesting services in the short and medium term are likely to be those that closely combine and coordinate the work of humans and machines. For a simple example, imagine a highly interactive and collaborative content generation tool allowing users to guide creation, seamlessly weaving together (synchronously or asynchronously, short or long) elements created by human users and the language model. This could give rise to a whole new wave of creative tools - not just for text, but also eventually audio, images and video.
Do these three areas give startups enough degrees of freedom to innovate and build something that is highly differentiated when the core work is being done by a model like GPT-3? We are somewhat wary of prompt and completion optimization as avenues to differentiate, but there is potential. We are more bullish on UX leading to longer-term differentiation, but finding the right interaction models will take some time.
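To make the first two avenues concrete, here is a minimal sketch of where prompt optimization and completion optimization sit around a black-box model. Everything in it is an assumption for illustration: the names build_prompt, clean_completion and generate are hypothetical, the “Input:/Output:” layout is just one possible prompt format, and echo_model is a toy stand-in so the code runs offline. It is not the OpenAI API.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in for a hosted language model: prompt in, completion out.
# A real service would call its provider's API at this point.
Model = Callable[[str], str]


def build_prompt(instruction: str, examples: List[Tuple[str, str]], query: str) -> str:
    """Prompt optimization: wrap the query in an instruction and worked examples."""
    parts = [instruction.strip(), ""]
    for source, target in examples:
        parts += [f"Input: {source}", f"Output: {target}", ""]
    parts += [f"Input: {query}", "Output:"]
    return "\n".join(parts)


def clean_completion(raw: str, max_chars: int = 200) -> str:
    """Completion optimization: keep the first line, trim whitespace, cap length."""
    stripped = raw.strip()
    first_line = stripped.splitlines()[0] if stripped else ""
    return first_line[:max_chars].rstrip()


def generate(model: Model, instruction: str,
             examples: List[Tuple[str, str]], query: str) -> str:
    """Run the black box between the two steps a startup actually controls."""
    return clean_completion(model(build_prompt(instruction, examples, query)))


def echo_model(prompt: str) -> str:
    """Toy model for offline runs only: shouts back the last input, then rambles."""
    inputs = [line for line in prompt.splitlines() if line.startswith("Input: ")]
    answer = inputs[-1][len("Input: "):].upper() + "!"
    return "  " + answer + "\nInput: extra text the model kept writing"


if __name__ == "__main__":
    examples = [("coffee", "Wake up happy.")]
    print(build_prompt("Write a tagline.", examples, "tea"))
    print(generate(echo_model, "Write a tagline.", examples, "tea"))
```

Even in this toy form, both steps are a few lines of string handling, which is why the question above is whether the data and know-how behind them can become defensible IP.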
Platform Risk
Platform risk is the aspect of building on GPT-3 that gives us the most pause today, for both tactical and strategic reasons. On the tactical side, GPT-3 is still very much a beta product, and founders will have to deal with changing specs and APIs and evolving capabilities for some time, not to mention the open questions around built-in biases and memorized content. It may be possible to build a solid service on a rapidly shifting base, but it won’t be easy.
On the strategic side, there are several concerns. First, for the foreseeable future there will be very few suppliers of GPT-3-like capabilities, and so startups will be at the mercy of a handful of large providers and their whims. Second, building a long-term, sustainable platform is not easy, and few companies have achieved that goal. On the positive side, Microsoft, a key OpenAI partner, is one of the few companies that have been good long-term platform stewards. But all platforms, no matter how well-managed or well-intentioned, eventually face the temptation of competing with the successful applications built on them. This may be a longer-term concern, but it is still important to keep in mind.
In conclusion, we’re excited about the possibilities. NLP and large language models are poised to change how we interact with machines and greatly expand their capabilities. There will undoubtedly be important businesses that emerge, perhaps even from this early, experimental era.
If you are building something with large language models and have interesting ideas around differentiation and building a business for the long term, we’d love to talk to you!