Generative AI models are the steam engine of today's AI revolution.
This is the analogy I use to keep myself from getting swept away by today's raging flood of information. Thinking of GenAI models this way lets me find connections between the industries that were greatly impacted by the arrival of the steam engine and the industries of today. It also helps me ask questions about the key moments still to come. What would be the equivalent of the internal combustion engine? What about the widespread use of electricity?
Mental shifts
Steam engines brought about the mental shift towards the idea that we can control energy transformation as long as we build the right machine. H.G. Wells took this idea even further in his book The Time Machine. Notice that in the book the time machine is operated using a bunch of levers and knobs! It is very mechanical, much like the machines of its time.
Today, AI researchers have realized and proven that as long as you have data, you can train an AI model to take inputs such as text, images, audio and video, and generate new content in any of these modalities.
While researchers, developers and people closely following AI developments already believe in its potential, this is not true for the general public. The other day, I asked groups of college students if they use ChatGPT, and only half did. The reason is that they don’t need it in their day-to-day lives.
I guess this shouldn’t be surprising at all. The effectiveness of ChatGPT largely hinges on the quality of the inputs provided. Getting consistently valuable results requires practice, leading some users to view it as an investment not worth their time.
But this is an opportunity for startups: a chance to infuse additional knowledge and design elements while still leveraging the same engine as ChatGPT to deliver new value. For those unfamiliar, this process is called AI-wrapping. Although products that are “simply AI-wrappers” have a negative reputation, a wrapper done right can be something big.
I’m sure some of you already benefit from what tools built on large language models (LLMs), like ChatGPT and GitHub Copilot, can do. I use both, and they took me from zero to mediocre in TypeScript, allowing me to build web apps. If we generalize this idea of allowing anyone to jump from zero to a mediocre level at anything using AI, how would society change?
Energy then, Information now
For most of human history, muscle power was the only energy converter available. With humans and animals providing muscle power, most activities in life were bound by the cycle of growing seasons and sunlight, which fuel the muscles, our food-to-movement converter.
This bottleneck was broken during the Industrial Revolution with the arrival of a heat-to-movement converter: the steam engine. It transformed our society in ways we never went back on, such as the adoption of standard time.
GenAI excels at information conversion. That is why the conventional way of describing what an AI model does follows the pattern InputModality-to-OutputModality. Among the most popular ones today are speech-to-text (Whisper), text-to-image (Stable Diffusion), text-to-speech (ElevenLabs) and image-to-video (RunwayML Gen-2).
But before GenAI, the human brain was the only general information converter available. A translator's job can be described as a text-to-text conversion. A customer support agent’s job is a questions-to-suggestions conversion. Both jobs are increasingly being augmented or even replaced by AI, signifying a shift in how we approach and execute these tasks.
However, it's important to recognize the limitations of GenAI. It is still far from replacing human creativity; experts in their respective fields are still valuable in making new things and solving novel problems.
GenAI is well suited for repetitive, boring and well-defined tasks: data entry, document preparation, scheduling, onboarding, etc. Look for tasks around you that fit this description and maybe you’ll build the next Salesforce.
It is still early
A steam engine is not a train; it is what made trains move. Likewise, GenAI is not the solution itself but the engine that powers innovative solutions across industries. We are still at the very early stage of this era, but the pace of development is increasing exponentially.
One such development is called Retrieval-Augmented Generation (RAG). Simply put, it rewrites the user’s prompt to an LLM so that it includes relevant pieces of information from the user’s own data. Imagine asking a stranger to cook a dish for you and handing them your personal recipe: it will guide them towards your taste instead of something generic.
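To make the idea concrete, here is a minimal sketch of the prompt-rewriting step, not a real RAG system. It assumes the user's data is a short list of notes, and it swaps in plain word overlap for the embedding-based search that production systems typically use. The names tokenize, retrieve, build_prompt and the sample notes are all hypothetical, made up for illustration, and no actual LLM is called.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Return up to top_k documents that share the most words with the question.

    Word overlap is a stand-in (an assumption of this sketch) for the
    similarity search a real system would run.
    """
    query_words = tokenize(question)
    scored = [(len(query_words & tokenize(doc)), doc) for doc in documents]
    # Keep only documents with at least one shared word.
    relevant = [(score, doc) for score, doc in scored if score > 0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in relevant[:top_k]]


def build_prompt(question: str, documents: list[str], top_k: int = 2) -> str:
    """Rewrite the user's question into a prompt that carries their own data."""
    context = retrieve(question, documents, top_k)
    context_block = "\n".join(f"- {doc}" for doc in context)
    return (
        "Answer the question using the context below.\n"
        f"Context:\n{context_block}\n"
        f"Question: {question}"
    )


# Hypothetical personal notes standing in for "the user's own data".
notes = [
    "My pasta recipe uses extra garlic and no cream.",
    "The office wifi password changes every month.",
    "I like my pasta sauce slightly spicy.",
]

prompt = build_prompt("How should you cook pasta for me?", notes)
print(prompt)
```

The rewritten prompt carries the two pasta notes and leaves out the unrelated wifi note, which is the recipe-for-a-stranger idea in miniature. In a real product the resulting prompt would then be sent to an LLM.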
Is RAG going to lead us to the internal combustion engine moment? Time will tell. What we know from experience is that it works. Several products out there now would not be possible without it.
If you ask ChatGPT which three professions will be most impacted by AI, the answer will be writing and content creation, the legal profession, and customer service. Indeed, multiple startups that appeared recently are based on these three fields. These straightforward ways of using AI are akin to building steam-powered carts and pumps after learning about steam engines. There are still so many things left to be built, most of which have not even crossed our imagination.
Unlike steam engines, where experimenting with potential applications required manpower and raw materials, exploring potential AI applications only requires compute resources, and your laptop is enough to begin. With free usage credits available on most platforms, getting started has never been easier.
New ways of working will come because of GenAI, and we as a society will iterate towards those we feel comfortable with. Have you recently created a new workflow that involves GenAI? Other people could benefit from it, so productionizing your workflow through code could be a startup opportunity. I notice this most in the image generation community: people create their own workflows to modify the output images of GenAI, making even better images. Much of the time-consuming work in their workflows seems to be the non-AI-related steps.
Now to wrap things up with one final question: How can we find startup ideas in this era of rapid AI evolution? Paul Graham says the verb for it should be “notice”: with a prepared mind, we should notice gaps and build what fills them.