In the middle of the pandemic last year, an artificial intelligence (AI) lab in San Francisco called OpenAI revealed a technology that had been in the making for some time. The new system, called Generative Pre-trained Transformer 3 (GPT-3), spent several months learning the nuances of natural language, language as spoken and written by humans, by analysing thousands of digital books and nearly a trillion words posted on the rest of the internet. GPT-3 is the output of several years of work by the world’s leading AI labs, including OpenAI, an independent organization backed by $1 billion in funding from Microsoft.
At Google, a system called BERT (short for Bidirectional Encoder Representations from Transformers) was also trained on a large selection of online text. It can guess missing words in millions of sentences, such as “I am going to see a man about a….” or “I am going to… a man about a dog.” Such systems are called natural language models, and they power many interfaces, from chatbots to voice assistants such as Amazon’s Alexa and Google’s Assistant. But GPT-3, which learnt language from a far larger set of online text than previous models, opens up many more possibilities.
GPT-3 is what AI scientists call a ‘neural network’, a mathematical system loosely modelled on the web of neurons in the brain. As can be expected of such complexity, there is more than a single mathematical model. Two widely used families are recurrent neural networks, which retain a memory of earlier inputs, and convolutional neural networks, which develop a kind of spatial reasoning. The first is suited to tasks such as language translation, the second to tasks that involve image processing. These neural networks use enormous amounts of computing power, as do the other AI neural network models that drive ‘deep learning’. BERT and GPT-3 belong to a newer family, transformers, designed to operate on language.
But, unlike BERT, GPT-3 was trained on vastly more data. The first hope was that it could predict the next word in a sentence, rather than just a missing word anywhere in it, and that it could keep going if you typed a few words, completing many sentences, and even several paragraphs, based on your original thoughts.
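The idea of next-word prediction can be sketched in miniature. The toy below is emphatically not GPT-3: it merely counts, in a tiny made-up corpus, which word most often follows each word, then greedily extends a prompt. GPT-3 does something far richer with 175 billion learned parameters, but the task, "given the words so far, guess the next one", is the same.

```python
from collections import Counter, defaultdict

# Tiny illustrative corpus (hypothetical, for demonstration only).
corpus = (
    "i am going to see a man about a dog "
    "i am going to see a film about a dog"
).split()

# Count word -> next-word frequencies.
follows = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    follows[current][nxt] += 1

def complete(prompt, n_words=4):
    """Greedily append the most frequent next word, n_words times."""
    words = prompt.split()
    for _ in range(n_words):
        candidates = follows.get(words[-1])
        if not candidates:
            break  # no known continuation
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)

print(complete("i am going"))  # -> i am going to see a dog
```

Even this crude counter "completes" a sentence plausibly on its own corpus; a large language model replaces the frequency table with a learned statistical map of an internet's worth of text.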
During its long training, GPT-3 learned more than 175 billion parameters—mathematical representations of language patterns—from the vast set of books, Wikipedia articles and other online texts in its syllabus. These parameters amount to a map of language: i.e., a mathematical description of the way we piece words together.
It can summarize emails, generate tweets, write poetry, answer trivia, and translate languages. It does all of this with minimal manual help or direction. For many of us who watch developments in AI, GPT-3 represents an unexpected leap toward machines that can understand the ins and outs of language.
Most of today’s ‘machine learning’ or ‘deep learning’ programs, including the image recognition tools used by self-driving vehicles, depend on thousands of people, often in India or Sri Lanka, labelling every picture so that an AI program can refer to those labels each time it attempts a task, such as accurately recognizing a traffic sign, or telling a pedestrian from a bicyclist.
Unlike these, GPT-3 can be primed for specific tasks using just a few examples, as opposed to the thousands of examples and several hours of additional training required by its ‘deep learning’ predecessors. Computer scientists call this “few-shot learning” and believe that GPT-3 is the first real example of what could be a powerful change in the way humankind trains its machines. It is not just the beginning of a new era for speech recognition programs like Amazon’s Alexa or Apple’s Siri. The real surprise from GPT-3 is that systems architects have been able to give it just a few simple instructions and have it write its own computer programs.
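In practice, few-shot priming is just plain text: a short instruction, a handful of worked examples, then the new case left open for the model to complete. The sketch below assembles such a prompt as a string; the translation pairs and the `=>` format are illustrative assumptions, and a real system would submit the resulting text to the model for completion.

```python
# Hypothetical demonstration pairs (illustrative, not from any real dataset).
examples = [
    ("cheese", "fromage"),
    ("house", "maison"),
]

def few_shot_prompt(examples, query):
    """Format a few demonstrations, then the new query, as one text prompt."""
    lines = ["Translate English to French:"]
    for english, french in examples:
        lines.append(f"{english} => {french}")
    lines.append(f"{query} =>")  # left open for the model to complete
    return "\n".join(lines)

prompt = few_shot_prompt(examples, "dog")
print(prompt)
```

No retraining happens here: the two examples alone tell the model what task is wanted, which is what makes few-shot learning such a departure from the label-thousands-of-pictures approach.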
At a basic level, computer programs are English-like commands given to a computer in a logical sequence, such that the commands produce a certain outcome once the computer acts on them. As such, GPT-3’s mathematical descriptions of the way we piece English together work whether we are writing columns or coding software programs. Using these maps, GPT-3 can perform tasks it was not originally built to do.
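For readers who have never seen one, a computer program really can be this small. The three commands below (a made-up discount calculation, chosen only for illustration) run in sequence and produce one definite outcome:

```python
price = 120          # set a starting value
price = price * 0.9  # apply a 10% discount
print(price)         # display the outcome: 108.0
```

It is precisely because programs share this English-like, pattern-rich surface with ordinary prose that a statistical map of language can be turned toward writing them.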
So far, OpenAI has shared GPT-3 with only a small number of testers, since many kinks, including biases and profanities, need to be sorted out. That said, OpenAI scientists have claimed that “any layperson can take this model and provide these examples in about five minutes and get useful behaviour out of it”. Unsurprisingly, Microsoft has licensed exclusive use of the source code, but others can use a public interface to generate output.
I wrote some weeks ago that we are bombarded in India with ads for courses that promise to teach children coding skills. Everyone seems to think that the pandemic has shifted the world firmly towards digitization, and that the only businesses of the future will be technology-enabled ones, or at least those that can quickly pivot to a digital, remote delivery model. Parents seem intent on making computer programmers of their kids.
It’s evident that preying on the gullibility of worried parents is not difficult. But the abilities of GPT-3 demonstrate that we are probably training our children in skills that will be long obsolete before their work careers are done.
Siddharth Pai is founder of Siana Capital, a venture fund management company focused on deep science and tech in India