Find the Simplest Outline an AI Image Tool Can Make
By: Eryk Salvaggio
Description of work:
Generative AI tools for making images are called diffusion models. Diffusion models start with an image of random pixels: basically, noise, or digital static! Then they remove that noise a little at a time, nudging the pixels until they match the description in your prompt. They follow specific rules to make specific pictures. Weirdly, they can’t create pictures of visual noise: they are programmed to take noise out!
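If you’re curious what that denoising process looks like from code, here is a minimal sketch using Hugging Face’s diffusers library with a Stable Diffusion checkpoint. The specific model name, step count, and prompt are illustrative assumptions, not requirements of the exercise; any diffusion model you can run will do.

```python
# A minimal sketch of text-to-image generation with a diffusion model,
# using the Hugging Face "diffusers" library. The checkpoint name and
# settings below are assumptions chosen for illustration.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",   # assumed checkpoint; any SD model works
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")  # use "cpu" (and drop float16) if you have no GPU; it will be slow

# The pipeline begins with pure random noise and removes it over
# `num_inference_steps` steps until the pixels match the prompt.
image = pipe("a photograph of digital static", num_inference_steps=30).images[0]
image.save("static_attempt.png")
```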
To train a diffusion model, an image’s details are taken away step by step by burying it in more and more noise. At each step, the model “learns” how to undo that damage, again and again. When all the details are stripped away, only the basic outline remains. When it comes time to generate an image, the model runs this process in reverse: it settles on the basic outlines first, and then adds details over time.
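Here’s a toy sketch of that training idea, just to make it concrete. The tiny 8x8 “image” and the noise schedule are invented for illustration; the point is that the amount of the original picture that survives shrinks at every step until only static is left, and a diffusion model is trained to undo exactly that damage.

```python
# A toy sketch of the "forward" process used during training: an image is
# buried in a little more noise at every step. The image and schedule here
# are invented for illustration.
import numpy as np

rng = np.random.default_rng(0)

# A tiny 8x8 "image": a bright square (the broad outline) plus a
# faint diagonal stripe (the fine detail).
image = np.zeros((8, 8))
image[2:6, 2:6] = 1.0        # broad shape
image += 0.1 * np.eye(8)     # subtle detail

steps = 10
for t in range(1, steps + 1):
    noise_level = t / steps                    # grows from 0.1 up to 1.0
    noise = rng.normal(size=image.shape)
    # Mix the original image with random static; a diffusion model is
    # trained to look at `noisy` and predict `noise`, so it can later
    # run these steps in reverse.
    noisy = np.sqrt(1 - noise_level) * image + np.sqrt(noise_level) * noise
    square_vs_background = noisy[2:6, 2:6].mean() - noisy[0:2, 0:2].mean()
    print(f"step {t:2d}: signal kept = {np.sqrt(1 - noise_level):.2f}, "
          f"square vs. background = {square_vs_background:+.2f}")
```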
By playing with prompts that ask for “noisy” images (grainy, pixelated, digital static, and so on) you can sometimes create something totally unintended. It can also be interesting to see what exactly the AI model can combine with noise. Some words won’t come through at all, and others come through in very simple forms. That’s because the model makes images in a series of steps: it settles on really basic shapes first, and then adds more details later. Asking the model to put noise into a picture confuses it, because it’s supposed to take noise out of pictures, not add noise in!
It may not work for you right away, but by playing with noise prompts, you can see how these systems make images. They are still just machines that carry out a sequence of steps. They don’t think creatively about what to do or how to solve a problem. As an artist and a human being, that’s what you do.
Quote: “Generative AI isn’t creative unless a person uses it creatively.”
Assignment:
- Choose a diffusion-based image model, like Midjourney or Stable Diffusion. You can also do this with video generation models like Runway. (If you’re running Stable Diffusion from code, there’s a small sketch after this list.)
- In the prompt window, brainstorm words that describe visual noise. Noise is really anything that gets in the way of what we’re supposed to see in an image. For example, “digital static,” “visual noise,” “motion blur.” Play around with different words until the model stops making images that make sense. (Sometimes the model makes realistic pictures that are completely unrelated to your prompt: why do you think that is?)
- Once you start seeing weird abstract shapes, patterns, or colors, see what else you can add to the prompt. For example, adding “flowers” might introduce faint references to flowers in the noise.
- Which prompts work with the noise, and which ones don’t? In general, words that refer to very well-known, iconic shapes will do best here. That’s because broad, iconic shapes are what the models learn over all their steps of training, whereas the details of an image are learned last.
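If you’re running Stable Diffusion from code, the assignment can also be tried as a small batch experiment. The sketch below sets up the same kind of diffusers pipeline assumed earlier; the checkpoint name, noise words, and filenames are examples to swap out for your own.

```python
# A sketch of the assignment as a batch experiment. The checkpoint,
# word list, and filenames are illustrative assumptions.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

noise_words = ["digital static", "visual noise", "motion blur", "pixelated grain"]
iconic_subject = "flowers"   # a well-known, iconic shape to mix into the noise

for word in noise_words:
    for prompt in (word, f"{word}, {iconic_subject}"):
        # Same denoising process as before; only the words change.
        image = pipe(prompt, num_inference_steps=30).images[0]
        filename = prompt.replace(", ", "_").replace(" ", "-") + ".png"
        image.save(filename)
        print(f"saved {filename}")
```

Compare the outputs: which noise words break the image apart, and which ones quietly turn into ordinary photographs?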