Google Built A Trillion Parameter AI Model. 7 Things You Should Know
One of the exciting things about Artificial Intelligence is the steady stream of new accomplishments that we see in the news. Every week, some research institution or company accomplishes something amazing with AI, whether it is translating a long-lost language or building a massive model at a scale never attempted before.
But what does it all mean? If I am a business CEO, what impact, if any, does this have on my business? Is there any way I can leverage it? If I am a teacher, what should I tell my students? Being aware of recent events is always a good thing, but without context it is hard to make sense of them.
In this article, we dissect this specific announcement and answer seven high-level questions you may have about it.
To fully appreciate this announcement, a bit of history is helpful
- The past decade has seen massive advancements in Natural Language Processing (NLP) – the ability for AIs to understand and interact with languages. NLP can be used for everything from chatbots to text summarization. You can find a simple overview of NLP applications in the figure below.
- In 2017, AI researchers introduced the Transformer, an architecture built around a mechanism called attention, which revolutionized how AIs process sequences of words. This, coupled with the masses of data available on the internet, has spawned a new series of “large language models” that can complete sentences, answer questions, summarize text, translate languages, and so on.
- Every year, these models get larger. Studies have shown that larger models can be more effective than smaller ones, but they are computationally very expensive.
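The attention mechanism mentioned above can be sketched in a few lines. The snippet below is a deliberately minimal illustration with made-up toy dimensions, not a production implementation: each position in a sequence builds its output as a weighted blend of every other position, with the weights reflecting how relevant the positions are to each other.

```python
import numpy as np

rng = np.random.default_rng(1)
seq_len, d = 4, 6  # hypothetical sizes: 4 "words", 6-dimensional vectors

# Toy word representations: one random vector per position in a sentence.
x = rng.normal(size=(seq_len, d))

def scaled_dot_product_attention(q, k, v):
    # Score every position against every other position.
    scores = q @ k.T / np.sqrt(k.shape[1])          # (seq_len, seq_len)
    # Softmax turns scores into mixing weights that sum to 1 per row.
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    # Each output position is a weighted blend of all the value vectors.
    return weights @ v, weights

out, weights = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (4, 6)
```

Real Transformers learn separate query, key, and value projections and stack many such layers; this sketch uses the raw inputs for all three just to show the core blending idea.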
What makes this announcement special?
It is, to date, one of the largest AI models ever trained (though possibly not the largest, given a recent NLP announcement from China). These announcements indicate that AI technologies are coming closer to making the training of massive models computationally manageable.
What can this model do?
According to the Google research paper on this model, which uses a new technique called the Switch Transformer, the model (or model family) has been tested on sentence completion, summarization, language translation, question answering, and a few other tasks.
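The central trick of the Switch Transformer is "top-1" expert routing: a lightweight router sends each token to exactly one of many expert networks, so only a small fraction of the model's parameters is active for any given token, which is what makes such enormous parameter counts computationally manageable. Below is a deliberately simplified sketch of that routing step, with made-up toy dimensions; it is an illustration of the idea, not Google's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, n_tokens = 8, 4, 5  # hypothetical toy sizes

# Router: one linear layer that scores each token against each expert.
router = rng.normal(size=(d_model, n_experts))
# Each expert is its own small feed-forward weight matrix.
experts = rng.normal(size=(n_experts, d_model, d_model)) * 0.1

def switch_layer(tokens):
    logits = tokens @ router                           # (n_tokens, n_experts)
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)          # softmax per token
    chosen = probs.argmax(axis=1)                      # top-1: one expert each
    out = np.empty_like(tokens)
    for i, e in enumerate(chosen):
        # Scale by the router probability; in a trained model this keeps
        # the routing decision differentiable.
        out[i] = probs[i, e] * (tokens[i] @ experts[e])
    return out, chosen

tokens = rng.normal(size=(n_tokens, d_model))
out, chosen = switch_layer(tokens)
print(out.shape)  # (5, 8)
```

Note that while the layer holds four experts' worth of weights, each token only ever touches one of them; scaling the number of experts grows total parameters without growing per-token compute.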
Where did it learn from?
This model was first trained on a large corpus of crawled internet data, which was cleaned to keep English sentences and remove objectionable words. It was then given supplemental training for each of the specific tasks, using datasets targeted at those tasks.
What does it know?
This is a very important and difficult question to answer. One of the problems with models like this, trained on large swaths of the internet, is that it is very difficult to assess what the model has learned in a qualitative, human sense. For example: has it learned biases? Does it know more about some subjects than others? Does it reflect Western or Eastern perspectives?
Does this mean AI is closer to a human brain?
Not really. If you need a point of comparison, a human brain has around 100 trillion synapses. On one hand, you could argue that at 1.6 trillion parameters, AIs are inching closer. However, what these AIs learn and what and how humans learn are still so different that a simple ratio of parameters is not meaningful. Most humans do not hold the collected knowledge of the internet, and this AI does not know a great many things that come easily to the average human brain. They are apples and oranges.
What issues should I be aware of?
Ethicists have many questions and concerns, including:
- There is a model creation arms race underway, with corporations and countries trying to create better models. Will some countries get left behind?
- Are these models biased? When AIs are trained on such large bodies of open data, it is difficult to assess the biases within the data and evaluate the models themselves.
- What about carbon footprint? In 2019, a group of researchers found that just one training run of a large language model can generate 626,155 lbs of CO2 emissions, roughly what 17 Americans generate in a year, or what five cars generate over their entire lifetimes. Techniques are being developed to balance AI effectiveness and energy consumption, but the cost of large models continues to be a concern.
Can this help my business?
The main takeaway from this announcement should be the sheer potential of techniques like transformers in language applications. It is unlikely that your business needs this particular 1.6 trillion parameter model, but smaller models have shown great value in everything from chatbots to language translation to automated support ticket classification. If your business has any of these use cases, looking into NLP is a very good idea.
Seen on Forbes (Innovation): Article Link