LLMs 101: Everything You Need to Know About Large Language Models
Jose Nicholas Francisco
At this point, many of us have heard about ChatGPT, Bard, GPT-4, and other “Large Language Models,” often abbreviated as LLMs. But what exactly are these magical, chatty machines? What makes them “Large”? And how is it possible for computers, which in reality only speak in 0s and 1s, to learn to articulate themselves like you and me?
Part 1: NLP and Word Vectorization
Well, let’s start from the very beginning. First, we have to understand how computers can make sense of words. If, for example, I say the word “fish” to a human like you, then the idea of a fish pops into your mind. But the concept of a fish is complex. You could be imagining a goldfish in a tank, a great white shark, or a delicious cut of salmon on a bed of rice.
Understanding the nuances of the idea of a “fish” isn’t tough for dynamic humans. But for a machine that only understands numbers, the task becomes a bit more difficult.
Luckily, a bunch of scientists, linguists, and programmers banded together to come up with a solution. That solution is two words long: Word Vectorization.
Without getting into the nitty-gritty of the calculations, here’s the punchline: because computers only understand numbers, we transform the word “fish” into a very, very, very long list of meticulously calculated numbers. This list, called a word vector (or embedding), can contain hundreds or even thousands of entries.
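To make this concrete, here’s a minimal sketch using the open-source gensim library (a tool choice of this example, not something the technique requires). It trains tiny word vectors on a toy corpus and pulls out the list of numbers that represents “fish”:

```python
# A toy demonstration of word vectorization using gensim's Word2Vec.
# NOTE: this corpus is far too small to learn meaningful vectors;
# real models train on billions of words.
from gensim.models import Word2Vec

corpus = [
    ["fish", "swim", "in", "water"],
    ["a", "shark", "is", "a", "fish"],
    ["salmon", "is", "a", "tasty", "fish"],
    ["cats", "chase", "mice"],
]

# vector_size controls how long each word's list of numbers is;
# production embeddings often use hundreds or thousands of dimensions.
model = Word2Vec(corpus, vector_size=50, window=2, min_count=1, seed=42)

vector = model.wv["fish"]          # the word "fish" as 50 numbers
print(vector[:5])                  # peek at the first few entries
print(model.wv.similarity("fish", "salmon"))  # cosine similarity score
```

On a real training corpus, words that appear in similar contexts (like “fish” and “salmon”) end up with similar vectors, and that closeness is exactly what lets the machine make sense of them.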