Language is complex, but there have been attempts to make computational sense of it for almost a century. Workable speech-to-text, text-to-speech and language-translation systems existed in the 1970s and have been steadily improving ever since. Their success has been constrained by both hardware (the device or PC you want to do the translation on) and software (the sophistication of the program doing the translation).
When we had Cloud-level hardware and community-level content generation, we fairly predictably got Google Translate - a pretty incredible advance in translation which still didn’t make the news. So what’s different this time? Now everyone is talking about ChatGPT. Has something substantial actually happened, or are we just experiencing a meme that got out of hand?
The short answer is ‘yes’, something substantial has happened. In 2017, a group of Google and University of Toronto researchers published a paper (‘Attention Is All You Need’) which proposed a new kind of language-learning model, which they called a Transformer. It built incrementally on previous models, but had three interesting differences:
- It captured word order with ‘positional encodings’ baked into the data itself, rather than into the structure of the network.
- It used ‘attention’ to learn how words in a sentence relate to each other, even across long distances.
- It dropped the sequential, word-by-word processing of earlier models, so training could be done in parallel.
Really, it’s this third one that’s the game changer. Because it enabled parallel training, it could leverage GPUs. And the world’s GPUs are pretty amazing right now because of 1) the boom in home computer gaming a couple of decades ago, and 2) the boom in compute-intensive applications for GPUs such as crypto, medicine and weather modelling over the last decade. Put a good software improvement together with those GPUs, and something fairly surprising happened.
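To see why attention parallelises so well, here’s a minimal sketch of scaled dot-product attention - the core operation inside a Transformer - in plain NumPy. The shapes and sizes are toy values of my own choosing, not anything from the paper; the point is that every token attends to every other token in a single matrix multiplication, with no sequential loop for a GPU to wait on.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the row max before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # One (seq_len x seq_len) score matrix covers all token pairs at once -
    # this is the step that replaces the word-by-word loop of older models.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    return softmax(scores) @ V

rng = np.random.default_rng(0)
seq_len, d = 4, 8                # toy sizes: 4 tokens, 8-dim embeddings
X = rng.normal(size=(seq_len, d))
out = attention(X, X, X)         # self-attention: Q, K, V all derived from X
print(out.shape)                 # (4, 8): one output vector per token
```

Because those matrix multiplications are exactly what GPUs are built for, scaling up means adding hardware rather than waiting longer.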
Okay, so that paper came out five years ago. Why now?
In 2018, OpenAI developed the first GPT, based on the 2017 paper. They trained the model on 5GB of text (a few thousand books) using an 8-GPU machine (basically a home-PC setup). The results looked promising on a variety of language benchmarks, so they decided to go further with it.
So in 2019, they gave it 10x the data and 10x the “parameters” to produce GPT-2, and that’s when they really knew they had something. It was so impressive that it actually scared them - they worried so much about the potential misuses of the technology that they were very careful about who they shared it with. What had been developed to be good at translating between languages became scarily good at much more than that. A lot of OpenAI’s discussion was channelled into ‘ethics’ and ‘safeguards’ and hand-wringing over what could go wrong if the unwashed masses got hold of it.
There’s nothing we fear more than our own Reflection. We scream at the monsters within us, hidden deep within our hearts. We run and hide from the terrors all around us- the different mirrors that we see.
In 2020, they took it to a whole new level: 100x the parameters (from 1.5B to 175B), producing GPT-3. They then produced GPT-3.5 - essentially GPT-3 fine-tuned to follow instructions, wrapped in a whole bunch of guardrails and safety features to address concerns about the harm the technology might cause. It’s GPT-3.5 which OpenAI put a chatbot interface on and called ChatGPT, launched in November 2022. And it’s the general public interacting with this impressive technology that has driven the news cycle you’re experiencing now.
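To get a feel for what that 100x jump means in practice, here’s a quick back-of-envelope on weight storage alone. The parameter counts are the ones above; the 2-bytes-per-parameter (16-bit) figure is my assumption, not something from OpenAI:

```python
# Rough weight-storage size, assuming 2 bytes (16-bit) per parameter.
def weights_gb(params, bytes_per_param=2):
    return params * bytes_per_param / 1e9

for name, params in [("GPT-2", 1.5e9), ("GPT-3", 175e9)]:
    print(f"{name}: {weights_gb(params):.0f} GB")
# GPT-2: 3 GB  (fits on one consumer GPU)
# GPT-3: 350 GB  (needs a cluster just to hold the weights)
```

Under those assumptions, GPT-2 fits comfortably on a single consumer GPU, while GPT-3 doesn’t come close - which is part of why running it as a public service is so expensive.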
Do we have Skynet yet? No.
Does ChatGPT signal that we are in any way closer to having Skynet? No.
Is Skynet something I will have to contend with in my lifetime? No.
Artificial Intelligence as a concept has been around since forever, but modern computer-based implementations have been kicking around since the 1950s. Correspondingly, predictions of Artificial General Intelligence (AGI), like [insert your favourite fictional android here] have been about as regular and accurate as those for the End of the World.
What’s been shocking - given the investment and decades involved - is just how much progress hasn’t been made in AGI. What we are witnessing is the successful application of AI to another category of specific problems - this time language processing - similar to when computers started being better than humans at chess.
This is going to date me, but when I was in high school I did work experience at an advertising agency. They showed me a room where a guy would composite magazine ads for photography. What I mean is, he would physically assemble pieces of paper that he had cut out - the background, individual photos, type etc - and lay them out on a board with a camera fixed overhead that would take a photo of the finished composition, and that photo would go into a magazine as an ad. That seems incredibly archaic now that we have digital compositing on computers… but in my lifetime, people were paid money to do it manually.
Computers, and then the Internet, were revolutions in labour-saving. If you sent someone an email, a postman didn’t have to physically put an envelope in someone’s letterbox… not to mention all the other people and industries involved in the letter manufacturing and delivery process.
So Transformers - and actually a bunch of other machine-learning models - are going to spawn several generations of new labour-saving applications that will push us further into a new future where we’ll reminisce: “Remember when we used to pay people to do xyz?”. I don’t know what all those applications will be, but we’re starting to see some now, and many, many more will be discovered over the coming years.
When Google came out, it was a substantially better search engine than anything else that existed. Sergey and Larry had developed an algorithm that scaled on clusters of cheap desktop PCs, rather than the big expensive servers everyone else was using. The more cheap desktops they could buy, the better it was.
ChatGPT is similar, in that it leverages commodity hardware. But dissimilarly, the tech underlying ChatGPT is well known and understood ‘out there’, whereas Google’s search algorithm was a tightly guarded secret. So, we should expect to see a plethora of competing Transformer-based startups, and a whole bunch of application categories coming into existence and many dying away, possibly including ChatGPT itself.
When Google’s Search started to dominate, it wasn’t clear what the sustainably profitable business model would be. The obvious idea was display advertising; others speculated about clipping the ticket on ecommerce transactions… but few people foresaw that it would actually be the yellow pages - businesses paying to appear beside relevant searches - that would provide the rivers of gold. Similarly, no one really knows how OpenAI gets its (or more correctly, Microsoft’s) money back from what is an expensive service to run - possibly >$100k per day.
That business model might not yet exist, and even if it does it could take several years to discover and prove. My guess is that ChatGPT’s ‘chatbot as an interface for internet search’ is not the ultimate money spinner, but that there’ll be a raft of specialist labour-saving products that end up themselves being the new rivers of gold.