The Art of AI: Crafting Masterpieces with Prompt Engineering and LLMs
Jose Nicholas Francisco
I heard the term “Prompt Engineering” for the first time about a month after ChatGPT came out. Initially, I thought it was a joke. After all, how could you “engineer” a prompt? In my mind, engineering involves striking a nail with a hammer to ensure that a bridge remains stationary in the face of a hurricane. Even in its most sedentary form, engineering should entail Python frameworks, and Redis, and firewalls, and Kubernetes.
How could you possibly be an engineer when your only tool is natural language?
Well, it turns out that natural language is a hell of a tool. And I was naive to think any less of it. Shame on me, honestly. After all, I’m a writer. From Thomas Paine’s Common Sense to the Treaty of Paris (any one of them), words have the ability to incite revolutions and end wars.
And today, we’ll be using them to squeeze as much juice out of an AI as we can.
The pen is mightier than the sword… but what about hammers and nails?
To understand prompt engineering, we first need to view words as a tool. That’s simple enough to digest at a high level, but when it comes to applying this mentality in the field, the details become rather nuanced.
Let’s be honest. We only talk to AI for one reason: To make it generate some desired output. Whether it’s art or code, the goal of communicating with any model (at this point in time, at least) is to make it produce. And prompts are the only way we can control this output without cracking the AI open and modifying its weights.
However, the art and text that AI produces can only be as good as the prompt the user supplies. Or, more specifically, the AI’s output is only as good as its prompter. That’s you.
Much like a carpenter’s, a prompt engineer’s skill manifests itself in the quality of their work. A mediocre prompt engineer creates mediocre art, writing, or code. Meanwhile, a great one creates great work. Take a look at the image below, for example. The image on the left was created with Stable Diffusion, using nothing more than the prompt “A red bird flying through the sky.”
Sure, the resulting image is indeed a red bird flying through the sky. But the image itself is rather lackluster. The bird looks realistic while the sky looks cartoony. The bird’s tail seems to be positioned as if it were a pair of wings. And the bird’s actual wings are pressed against its body as if it were grounded and stationary.
With better word choice, we can achieve the image on the right: a bird with a single tail and two clearly defined wings flapping through the air. The sky and the bird actually match each other this time, with the wispy brush strokes of the clouds complementing the similar brush strokes of the bird’s feathers. And, of course, our well-engineered bird rocks a beautiful set of head feathers.