What are generative adversarial networks, and how can you use them?


Suppose you want to create a picture that looks like a Picasso painting.

How would you do it?

Well, you could spend 10,000 hours training yourself. Or, you could spend much less time training a computer to do it.

But how would I train a computer to do it?

One method would be generative adversarial networks (or GANs for short).

GANs

The basic architecture of a GAN is quite simple.

GANs consist of two models:

  1. a generator
  2. a discriminator

The generator has the job of creating a fake Picasso painting.

The discriminator determines whether that painting is a genuine Picasso or a fake Picasso.

We train the two models at the same time. Both start out simple and grow more sophisticated as training progresses.

Let's look at each component separately.

Generator

Here's a picture of how a generator works:

The architecture of a generator

You feed the generator a random number (or, more likely, a vector of random numbers), and it uses those to construct its paintings.

Why do we have to feed it with a random number?

This is a good question; the answer concerns what you intend to use the model for later.

Imagine that you removed the random-number input, so the generator's only job is to produce a fake Picasso. The generator now has no source of randomness within it: it is deterministic. It will learn to draw one picture that looks like an existing Picasso painting and produce that same picture over and over.

But we don't want that. We want our model to generate original paintings like Picasso painted them. To do that, we need a source of randomness. That's what the random number generator provides.

The generator will have to become sophisticated enough to handle any random number input and turn it into a credible Picasso fake.
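As a rough sketch of the idea, here is a minimal generator in PyTorch. The 100-dimensional noise vector and the flattened 28×28 output are arbitrary choices for illustration, not anything specific to Picasso paintings:

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Maps a random noise vector to a fake image (flattened 28x28 here)."""
    def __init__(self, noise_dim=100, img_dim=28 * 28):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(noise_dim, 256),
            nn.ReLU(),
            nn.Linear(256, img_dim),
            nn.Tanh(),  # squash pixel values into [-1, 1]
        )

    def forward(self, z):
        return self.net(z)

# Different random input vectors give different paintings.
g = Generator()
z = torch.randn(2, 100)  # a batch of two random input vectors
fakes = g(z)             # shape: (2, 784)
```

A real image generator would typically use transposed convolutions rather than fully connected layers, but the interface is the same: random vector in, image out.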

Discriminator

It's the discriminator's job to look at paintings and say whether they are real or fake.

The architecture of a discriminator

Its input is a painting (real or fake), and its output is its answer: is this real or not?

You feed the discriminator with actual Picasso paintings and fake ones from the generator.

To begin with, the fake pictures from the generator will look nothing like a Picasso painting, so the discriminator will have an easy job. But as the generator improves and its images become more realistic, the discriminator will need to learn new strategies to tell them apart.
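The discriminator is just a binary classifier. A minimal sketch, mirroring the toy generator above (the 28×28 flattened input size is again an arbitrary choice):

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """Scores an image: output near 1 means 'real', near 0 means 'fake'."""
    def __init__(self, img_dim=28 * 28):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(img_dim, 256),
            nn.LeakyReLU(0.2),
            nn.Linear(256, 1),
            nn.Sigmoid(),  # probability that the input is a real Picasso
        )

    def forward(self, x):
        return self.net(x)

d = Discriminator()
paintings = torch.randn(4, 784)  # stand-in batch of real or fake images
scores = d(paintings)            # shape: (4, 1), each value in (0, 1)
```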

The loss function

In this section, I want to discuss the loss function and demonstrate the tug-of-war between the two models.

For those who don't know, this loss function is how the model judges its performance. A high loss means that its predictions are bad, and a low loss means its predictions are good.
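To make this concrete: GAN discriminators are commonly trained with binary cross-entropy, which behaves exactly this way. A tiny demonstration (using PyTorch's built-in loss):

```python
import torch
import torch.nn as nn

bce = nn.BCELoss()
real_label = torch.tensor([1.0])

# A confident, correct prediction ("90% sure it's real") gives a low loss...
good = bce(torch.tensor([0.9]), real_label)

# ...while a confident, wrong prediction ("10% sure") gives a high loss.
bad = bce(torch.tensor([0.1]), real_label)
```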

Let \(z\) represent the random input vector the generator will use to produce the fake Picasso.

Let \(G(\cdot)\) represent the generator's output.

Then \(G(z)\) is the fake Picasso.

Let \(D(\cdot)\) represent the discriminator's prediction.

Then \(D(G(z))\) is the discriminator's prediction of the generator's fake Picasso.

Finally, \(\text{loss}(D(G(z)))\) represents the loss of the discriminator's prediction of the generator's fake Picasso.

Now let's think about what the models want to do.

  1. The discriminator wants to make the correct guess. Therefore it wants to minimise \(\text{loss}(D(G(z)))\).
  2. The generator wants the discriminator to make the incorrect guess. Therefore it wants to maximise \(\text{loss}(D(G(z)))\).

Or, to put in a single expression:

$$ \min_D \max_G \big[\text{loss}(D(G(z)))\big] $$

The tug-of-war between these two models is what generates the fake Picasso paintings.
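Here is a sketch of how that tug-of-war plays out in a training loop. The models and the "real" batch are tiny stand-ins; note that the generator's "maximise the discriminator's loss" is implemented, as is common in practice, by labelling its fakes as real and minimising the resulting loss:

```python
import torch
import torch.nn as nn

# Tiny stand-in models; a real setup would use larger (often convolutional) nets.
G = nn.Sequential(nn.Linear(100, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 1), nn.Sigmoid())
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

real = torch.randn(8, 784)  # stand-in for a batch of real Picasso images
ones, zeros = torch.ones(8, 1), torch.zeros(8, 1)

for step in range(3):
    z = torch.randn(8, 100)
    fake = G(z)

    # Discriminator step: minimise its loss, i.e. call real images 'real'
    # and generated images 'fake'.
    d_loss = bce(D(real), ones) + bce(D(fake.detach()), zeros)
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator step: push the discriminator toward the wrong answer by
    # labelling its own fakes as 'real'.
    g_loss = bce(D(fake), ones)
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
```

Each iteration, the discriminator pulls its loss down while the generator pushes the discriminator toward mistakes, which is the minimax expression above in code form.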

Conclusion

Once training is complete and you want to produce your Picasso, you unhook the discriminator and feed the generator a random input vector. Then, as if by magic, a realistic Picasso appears.
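Generation then boils down to a few lines. Here `G` stands in for a trained generator (an untrained one is used below only so the snippet runs on its own):

```python
import torch
import torch.nn as nn

# After training, only the generator is needed; 'G' stands in for the
# trained model here.
G = nn.Sequential(nn.Linear(100, 784), nn.Tanh())

z = torch.randn(1, 100)    # a fresh random input vector
with torch.no_grad():      # no gradients needed at generation time
    painting = G(z)        # one new "Picasso", as a flattened 28x28 image
```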

Hopefully, you've enjoyed learning about GANs.

Check out this video for the state-of-the-art.

If you want to learn more about GANs, I recommend the book GANs in Action by Jakub Langr and Vladimir Bok.